Ansible Talk @ Infra Coders

Here are the notes that I used in my talk at Infrastructure Coders. Each section was also put on the screen as a ‘slide’. The configuration that I used in the demo is available at GitHub. A full video of the meetup is available on the Infrastructure Coders YouTube channel; my talk starts at 25:05.

Ansible
--------

0. There is nothing in the hat
  - Start a RHEL install
    - Cmd line: console=ttyS0 ks=http://admin01/ns3.cfg
    - If you want to follow the demos grab the ansible config from my github
    - You will need to substitute hostnames in the ansible hosts file
    - You should copy the firewall config from ns2 (remove port 647 if paranoid)

1. The problem
  - Ansible is the combination of several functions
  - There was a plan to build config management on func
  - However func is a pain to set up
  - Puppet and Chef have a steep learning curve
  - Ansible was also built to simplify complex release procedures
  - You need to know ruby to extend Puppet/Chef

2. Ansible
  - Designed so you can figure out how to use it in 15 minutes
  - Designed to be easy to set up
  - Doesn't require much to be installed on the managed host
  - Designed to do config management/deployment/ad hoc
  - Other people do security better, just use SSH
  - You can extend ansible in any language that can output JSON

3. Simple Ansible Demo
  - Ansible hosts file
  - Ansible can be run directly on the command line
    - Run cat /etc/redhat-release
    - Get info using the setup module
  - It can prompt for auth, or use key based auth
    - On the new machine show it prompting
    - Run the rhelsetup script on the new machine
    - Install vim-enhanced
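
The ad hoc runs in this demo look roughly like the commands below. This is a sketch rather than a transcript of the talk: the inventory path ./hosts, the host name ns3 and the adhoc wrapper are illustrative names, not part of the demo repository. The wrapper only prints each command unless DRY_RUN=0 is set, so nothing is contacted until the hosts file has been edited.

```shell
#!/bin/sh
# Illustrative inventory path (assumption); point it at your own hosts file.
INVENTORY=./hosts

# Hypothetical helper: prints the ansible command by default,
# and runs it for real only when DRY_RUN=0 is set.
adhoc() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ansible -i $INVENTORY $*"
    else
        ansible -i "$INVENTORY" "$@"
    fi
}

# Run a plain command on every host in the inventory.
adhoc all -m command -a "cat /etc/redhat-release"

# Gather facts about every host with the setup module.
adhoc all -m setup

# Prompt for the SSH password on a host that has no keys yet
# (ns3 is an assumed host name).
adhoc ns3 --ask-pass -m command -a "cat /etc/redhat-release"

# Install a package with the yum module.
adhoc ns3 -m yum -a "name=vim-enhanced state=present"
```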

4. Playbooks
  - This is the method of scripting Ansible
  - Done in YAML
  - Executed in order *gee thanks puppet*
  - Designed to be as easy as possible

5. Example playbook
  - Playbook for the name servers
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/zones.play
    - Can have multiple plays in a book
    - Can serialise if you don't want all to be down at once
  - Template config for the name servers
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/zones/named.conf.j2
  - Firewall install script
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/firewall.play

6. My thoughts
  - Config management has been around a while, it's going from art to science
  - Ansible covers more ground than puppet and chef do
  - Ansible doesn't compromise on simplicity to do that
  - I don't have to focus on the nodes, I can focus on services
  - There is something missing
    - Disk config is done in kickstarts
    - Network config can't be done by Ansible
    - Need to find a way to cover both with one

The playlist of all the videos is available on YouTube.

Jabberd2 lockup after authorisation

I recently spent forever debugging an issue where the server would lock up right after the client sent authentication details. What made it worse was that XMPP clients seem to have very poor error messages, so it’s nearly impossible to figure out what went wrong. On top of that, Wireshark becomes much harder to use when the sessions are encrypted. Finally though I managed to track it down and this page is mainly a reminder to myself.

The problem is due to the fact I configured my server to not allow account registrations. I set up the following line in c2s.xml:

<id require-starttls='true'
        pemfile='/etc/jabberd/server.pem'
        password-change='true'
        instructions='Registrations not allowed...'
    >danielhall.me</id>

Followed by this in sm.xml:

<!--  
<auto-create/>
-->

Then I inserted all my users into the authreg table manually using queries like:

INSERT INTO authreg VALUES ('username', 'danielhall.me', 'sup@Hs3cr3tPassw0rd');

The problem here is that you’ve manually created the authorisation for the account, and not created the session. The best way to deal with this is to remove the comments around the auto-create tag in sm.xml. This isn’t a problem, as users not in the authreg table will have their authorisation rejected before they can even reach the session manager.

Arduino Traffic Lights

The completed Traffic Light
Please Note: The instructions here are provided ‘as is’ with no guarantee or warranty whatsoever. In no circumstances should they be used to build a traffic safety device. The traffic light device I built is a novelty and is used as such.

Materials

1 USB traffic light hub (search eBay for ‘traffic light hub’)
1 Atmega8u2 breakout board
1 each of 9mm Red, Green and Orange LEDs
3 470 Ohm resistors
4 thin patch wires (preferably different colours)
1 USB mini cable
1 Pack of Black Sugru
Solder
Hot Glue
Corrugated Cardboard

Equipment

A Dremel, or similar cutting tool
Hot Glue Gun
Soldering Iron
Wire Strippers
Wire Cutters
Spudger tool, or guitar pick, stanley knife or fingernails
Linux PC (with avr-gcc and dfu-programmer commands installed)

Details

1. Use the spudger to open the traffic light

The weakest part of the traffic light is the stem but it is nearly impossible to open it from there. The approach that I took was to pry the top open a little, then pry the bottom open a little, then carefully pull the two halves apart at the same time. This ensures that you don’t snap the plastic holding the stem together. If you don’t have four wires from the materials list, you can strip the USB cable and use those since it contains four differently coloured wires.

2. Gut the traffic light

Remove the USB hub circuit board. You can use this board later in another project or two; if you do, you will probably want to cut off the LEDs. On my board the little plastic windows were in the wrong order for Australian traffic lights. Luckily this is easy to fix: simply pop them out and press them back in the correct place. Feel free to put the windows in any order that you wish, just remember to alter the program later before compiling and flashing it.

3. Make room

It’s a little hard to fit the circuitry into the case with the two supports that held the hub in place; additionally you need a little extra space for the thickness of the cardboard. You can get all this by using a Dremel to remove the supports and some of the plastic around the USB sockets. Be careful not to hit the side of the case as I did on my first one, as the plastic is thin and the outside blemishes easily. The photo shows one before modification on the left and after a fight with the Dremel on the right.

4. Make the cardboard circuit board

The process of getting a circuit board built takes too long for a quick hack like this and I didn’t have any protoboard lying around. This all means that we get to make a crazy cardboard circuit board. The best thing about cardboard is that you can draw on it as you build it, and you can easily cut it to fit the interior of the device. Basically put the LEDs and resistors through the cardboard next to each other. For ease of wiring align all the LEDs to have their long legs in the same direction. Wire each short leg to a resistor and the free legs of all the resistors to a single black wire. Finally wire each of the long legs of the LEDs to a different coloured wire. Once this is done you should test each lead and LED to make sure it is correctly wired. You can do this by connecting ground to the black lead, and 3-5V to the coloured leads.

5. Connect to the Breakout board

Now we take out the SparkFun Atmega8U2 breakout and solder it to the LEDs on our cardboard board. This is pretty simple, basically you solder the wire from the resistors to the hole labelled GND, the green LED to PB7, the orange LED to PB6 and finally the red LED to PB5. You will likely want to cut the wires so that they reach the board where it will sit, and only have a few extra millimeters. If you leave too much room then you will have issues trying to put the wires inside the case, and of course if you don’t leave enough you won’t be able to get the breakout board to sit in the right place.

6. Load the software on the device and test

You can find the code to build the firmware on my GitHub account. Provided that you have avr-gcc and dfu-programmer installed you should be able to simply clone that repository and type ‘make all’ inside it. If that refuses to work for some reason, I have attached the output of compilation as a hex file (uploaded as a txt file to stop WordPress whining). You can download the hex file here: USBTrafficLight.hex. This hex file, or the one output by your own compilation, can be programmed onto the device easily. First put the device into bootloader mode by plugging it into the computer and then hitting the reset button. Then run the following commands:

dfu-programmer at90usb82 erase # Erase MCU
dfu-programmer at90usb82 flash USBTrafficLight.hex # Flash MCU
dfu-programmer at90usb82 reset # Reset MCU

You will have to unplug and plug the device back in to get it out of bootloader mode. Once you have programmed the device you should run a test to make sure it works. You can do this by writing characters to the virtual serial port the device creates. The following commands will do this for you:

echo 'g' > /dev/ttyACM0 # Should be green
echo 'o' > /dev/ttyACM0 # Should be orange
echo 'r' > /dev/ttyACM0 # Should be red

7. Cut a hole in the base

This is the part I’m least happy with. Here you have to cut a hole in the base so that you can plug in a mini USB cable. I generally cut a rectangular hole in the back piece of the traffic light that is about the same size as the USB cable I’m using. This rectangle usually goes about halfway up the base, and goes right to the bottom while being a little bit wider than a USB mini cable. I also drill a small hole to allow access to the reset button for loading new firmware. Make sure you test that you can plug in the USB mini cable. Instead of cutting a hole in the base I’ve been thinking about building a USB cable into the device. This is difficult because the USB mini port on the breakout is SMD and the pins are not broken out. If you have any ideas on the best way to do this, please let me know in the comments.

8. Install the cardboard board and the breakout board

The cardboard section should be easy to slide into the back case of the device. Should there be an issue making things fit you can always trim the cardboard, or pad it with paper. You should glue the cardboard into the back through the USB port hole where the hub was. This accomplishes two things: it attaches the cardboard to the back, and it partially fills the holes that we will later fill with Sugru, saving you a little Sugru. If you bought clear LEDs instead of diffused ones you may wish to glue something to diffuse the light to the front (I use baking paper, sometimes two layers). Install the breakout board by putting a little hot glue on the bottom, pushing it into place, plugging in the mini USB cable then finally wiggling it into the perfect position. You’ll also want to tie up your wires using a little electrical tape to make them easier to manage in the last step.

9. Install the front and patch holes

Mould some Sugru into the holes left behind from where the USB hub was, and into the extra space around the USB mini plug. Make sure that you continuously test the USB mini plug to ensure you don’t add too much Sugru, and do not get any Sugru into the plug. You can use soapy water and rubbing to make the Sugru surface smooth if you wish. Then install the front of the case and leave the Sugru to set, which takes about 6-12 hours. Once the Sugru has set the device is ready to use.

Instructions for use

The interface to the device is implemented as a USB to Serial adapter, however since there is no serial interface and the entire device is self-contained, we don’t have to implement the entire spec. A USB to Serial device is implemented by sending characters and control messages over the USB bus. However because there is no serial interface we can ignore all the control messages. The firmware above simply grabs the characters from the stream and acts on them. This means that the device will work regardless of the baud rate, stop bits or parity settings. On Linux and OS X this means that all you have to do to control the device is to echo characters to the character device. You can send ‘g’ for green, ‘o’ for orange, ‘r’ for red and any digit from 0-7 for all the possible light configurations. A simple test script for a Linux PC looks like this:

#!/bin/bash

while /bin/true; do
  echo 'g' > /dev/ttyACM0
  sleep 2
  echo 'o' > /dev/ttyACM0
  sleep 2
  echo 'r' > /dev/ttyACM0
  sleep 2
  echo 'a' > /dev/ttyACM0
  sleep 2
done
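
The digit commands can be exercised the same way. The sketch below sends 0 through 7 in turn; which lights each digit switches on is defined by the firmware and is not assumed here. The function name cycle_digits and the two second delay are illustrative.

```shell
#!/bin/sh
# Send each digit 0-7 to the device, pausing between them.
# Usage: cycle_digits DEVICE DELAY_SECONDS
cycle_digits() {
    dev=$1
    delay=$2
    for digit in 0 1 2 3 4 5 6 7; do
        echo "$digit" > "$dev"
        sleep "$delay"
    done
}

# Only touch the serial port if it is actually present.
if [ -c /dev/ttyACM0 ]; then
    cycle_digits /dev/ttyACM0 2
fi
```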


Error messages aren’t perfect

When diagnosing a problem with a complex system such as Linux you sometimes need to step back, stop what you’re doing and take a different approach. Usually when a program fails on Linux you will get some kind of error message, traceback or coredump. Most people prefer to see some kind of error message rather than the latter two.

Tracebacks and coredumps are computer generated, which makes them more accurate than error messages, but harder for humans to understand. Error messages however are put in place by the programmer, which means they can occasionally be misleading, inaccurate, ambiguous or just plain wrong. This is not always the programmer’s fault; sometimes it’s hard to describe exactly what went wrong. Other times the error describes the situation perfectly, but the sysadmin jumps to a different conclusion based on their circumstances.

Example

Some time ago we had some users complaining about a problem when trying to use X Forwarding via SSH. On this server /home was mounted off a Novell NetWare NFS share. They were getting the following output and were unable to run X11 applications.

xauth: error in locking authority file /home/daniel/.Xauthority

Seeing this error I assumed that something was going wrong with the locking mechanism of NFS. I tried mounting the NFS share with the explicit lock option, but the same error remained. I tried explicitly giving the sync option too, but to no avail. I ended up trying many different NFS options until eventually I gave up and asked the Novell administrators to check their servers. I was convinced that something on their end was causing this locking error.

The Novell administrator responded that they could see nothing wrong on their end. This must mean that something was wrong on our side. I tried restarting the nfsstad and lockd initscripts and the whole machine but once again the same issue persisted. I checked the server using the rpcinfo command, which showed that everything was working fine. I even connected to the daemon using telnet (though I couldn’t talk its language) and confirmed a firewall was not in the way.

I thought that maybe there was something going wrong in the interaction between the client and the server, so I ran a tcpdump to capture all the packets transferred between them. This is where I made a small breakthrough. I found an NFS reply that had returned with SERVFAIL and error code 526. Googling for this error and NetWare generally pointed towards a problem with character sets not getting preserved to the Novell server. There was nothing but ordinary characters on the filesystem; so much for that idea.

I wanted to know exactly what was happening when xauth was trying to lock the file, so I did an strace on it. Here are the last few lines (after xauth mmaped its libraries):

stat("/home/e71377/.Xauthority-c", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
unlink("/home/e71377/.Xauthority-c")    = 0
unlink("/home/e71377/.Xauthority-l")    = -1 ENOENT (No such file or directory)
open("/home/e71377/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = 3
close(3)                                = 0
link("/home/e71377/.Xauthority-c", "/home/e71377/.Xauthority-l") = -1 ESERVERFAULT (Unknown error 526)
write(2, "xauth:  error in locking authori"..., 65xauth:  error in locking authority file /home/e71377/.Xauthority
) = 65
exit_group(1)                           = ?

So it appears that this was not a file locking problem at all. xauth was successfully creating the files but it failed when it tried to create a hardlink. Reviewing the code for libXau (AuLock.c) revealed exactly why:

    while (retries > 0) {
        if (creat_fd == -1) {
            creat_fd = open (creat_name, O_WRONLY | O_CREAT | O_EXCL, 0600);
            if (creat_fd == -1) {
                if (errno != EACCES)
                    return LOCK_ERROR;
            } else
                (void) close (creat_fd);
        }
        if (creat_fd != -1) {
#ifndef X_NOT_POSIX
            /* The file system may not support hard links, and pathconf should tell us that. */
            if (1 == pathconf(creat_name, _PC_LINK_MAX)) {
                if (-1 == rename(creat_name, link_name)) {
                    /* Is this good enough?  Perhaps we should retry.  TEST */
                    return LOCK_ERROR;
                } else {
                    return LOCK_SUCCESS;
                }
            } else {
#endif
                if (link (creat_name, link_name) != -1)
                    return LOCK_SUCCESS;
                if (errno == ENOENT) {
                    creat_fd= -1;       /* force re-creat next time around */
                    continue;
                }
                if (errno != EEXIST)
                    return LOCK_ERROR;
#ifndef X_NOT_POSIX
           }
#endif
        }
        (void) sleep ((unsigned) timeout);
        --retries;
    }

xauth isn’t trying to lock the file through flock() or another file locking method, which means that NFS locking is not the cause. Instead xauth creates a file, and then to make sure it is the only program altering .Xauthority it creates a hard link to it. If the link succeeds then it’s the only program; if not, then another program has the lock. The problem happens when xauth tries to make the hardlink. Interestingly there is a fallback that uses rename() when pathconf() reports that the filesystem does not support hard links, but the strace shows xauth went ahead and called link(), so the fallback was not taken here.
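
The same idea can be sketched in shell. This is a simplified illustration of link-based locking, not a port of libXau: there are no retries or timeouts, and the function names acquire_lock and release_lock are made up for the example. It borrows only the -c and -l suffixes seen in the strace output.

```shell
#!/bin/sh
# Try to take a lock on FILE by hard-linking FILE-c to FILE-l.
# ln refuses to create FILE-l if it already exists, so only one caller wins.
acquire_lock() {
    creat_name="$1-c"
    link_name="$1-l"
    # noclobber makes the redirect fail if the file exists (like O_EXCL).
    ( set -C; : > "$creat_name" ) 2>/dev/null || return 1
    if ln "$creat_name" "$link_name" 2>/dev/null; then
        # Simplification: drop the scratch file as soon as the link exists.
        rm -f "$creat_name"
        return 0
    fi
    # Someone else holds the lock, or the filesystem rejected the link.
    rm -f "$creat_name"
    return 1
}

# Release the lock by removing the link.
release_lock() {
    rm -f "$1-l"
}
```

On a filesystem that rejects hard links, acquire_lock fails every time even though nobody holds the lock, which is the situation xauth was in.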

It appeared that the NFS server did not support hard links. To test this theory I created several files and attempted to create hard links using ‘cp -l file1 file2’, and they failed in the exact same way. All I had to do now was explain to the Novell Administrator that the problem was not locking, and was in fact that we were mounting a filesystem which did not support hard links on a POSIX compatible system. The Novell share was changed to support hard links (don’t ask me how, I’m not a Novell guy) and everything was working again.
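
A check like this is easy to script. The sketch below uses ln rather than ‘cp -l’ so that it does not depend on GNU cp; supports_hardlinks is a made-up name, and it assumes the mktemp command is available.

```shell
#!/bin/sh
# Return 0 if a hard link can be created inside DIR, 1 if it cannot,
# and 2 if the scratch file could not be created at all.
# Usage: supports_hardlinks DIR
supports_hardlinks() {
    src=$(mktemp "$1/linktest.XXXXXX") || return 2
    if ln "$src" "$src.link" 2>/dev/null; then
        rm -f "$src" "$src.link"
        return 0
    fi
    rm -f "$src"
    return 1
}
```

Call it with the mount point in question, for example supports_hardlinks /home/daniel, and inspect the exit status.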

Conclusion

The lesson to take away from here is not that hardlinks are required on POSIX, or that xauth doesn’t use file locking but locks itself via a dance of hardlinks. The lesson here is that you should never trust error messages. Take them as a hint, use them as a starting point but do not take them as law. You need to remember that the error message was written by a human and you may not be interpreting it how it was written.

mod_pagespeed is not (always) the answer

What is mod_pagespeed

Google recently released a chunk of code in the form of an Apache module. The idea is that you install it in your Apache server, it sits in between your application and the web browser and modifies the served requests to make the page load faster.
It does this by using combinations of filters; some are well known best practices, others are newer ideas. For example one filter simply minifies your JavaScript while another embeds small images in a page using data URIs. The changes these filters make range from low risk to high risk. It should be noted that not all the filters will improve page load time; some even make pages slower in some cases.

So what’s the issue?

The issue here really isn’t mod_pagespeed, but the way people are viewing it. In my job as a Web Performance Engineer I have had several people recently say to me “let’s put mod_pagespeed on our web server to make it faster”. This is a break from normal attitudes. If someone were to say “we should put our images into data URIs” then people would question the speed benefit, or the extra load on the server. For some reason when Google implement a page speed module people just assume that it will make their page faster, and that it will work in their environment. The truth is that Google really have no idea what the module will do to your page.

The second issue is that all these tweaks can usually be better implemented at the application level. If you minify all your JavaScript as part of your build process then the web server will not have to do it for you. The same applies to data URIs. If they are simply part of the page then the web server doesn’t need to read in the extra image, base64-encode it, then compress it. All that is quite a lot of work, which only really needs to be done once.
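
Doing the data URI step once at build time can be as small as the sketch below. It assumes the base64 command-line tool is installed; data_uri and any file names passed to it are illustrative, not part of mod_pagespeed.

```shell
#!/bin/sh
# Print a data URI for FILE with the given MIME type.
# Usage: data_uri MIME_TYPE FILE
data_uri() {
    # Strip newlines because some base64 tools wrap their output.
    printf 'data:%s;base64,%s' "$1" "$(base64 < "$2" | tr -d '\n')"
}
```

A build script could call data_uri image/png on a small image and paste the result into an img src attribute or a CSS url(), so the encoding happens once rather than inside the web server.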

So what should I use mod_pagespeed for then?

You don’t always have access to the application code. If you are using third party software then before mod_pagespeed you may have had no control over the minification of CSS. This is where the module really shines. It gives you a layer between the application code and the web browser where you can apply all sorts of performance tuning.

The other advantage I can see is in quickly finding the best tunings to apply to your application. You can set up mod_pagespeed and run experimental tests with the filters on and off, and with a control, to quickly figure out what rules you should apply in your application.