Tag Archives: System Administration

Error messages aren’t perfect

When diagnosing a problem with a complex system such as Linux, you sometimes need to step back, stop what you’re doing and take a different approach. Usually when a program fails on Linux you will get some kind of error message, traceback or coredump. Most people prefer to see an error message rather than the latter two.

Tracebacks and coredumps are computer generated, which makes them more accurate than error messages, but harder for humans to understand. Error messages, however, are put in place by the programmer, which means they can occasionally be misleading, inaccurate, ambiguous or just plain wrong. This is not always the programmer’s fault; sometimes it’s hard to describe exactly what went wrong. Other times the error describes the situation perfectly, but the sysadmin jumps to a different conclusion based on their circumstances.

Example

Some time ago we had some users complaining about a problem when trying to use X Forwarding via SSH. On this server /home was mounted off a Novell NetWare NFS share. They were getting the following output and were unable to run X11 applications.

[code]xauth: error in locking authority file /home/daniel/.Xauthority[/code]

Seeing this error I assumed that something was going wrong with the locking mechanism of NFS. I tried mounting the NFS share with the explicit lock option, but the same error remained. I tried explicitly giving the sync option too, but to no avail. I ended up trying many different NFS options until eventually I gave up and asked the Novell administrators to check their servers. I was convinced that something on their end was causing this locking error.

The Novell administrator responded that they could see nothing wrong on their end, which had to mean that something was wrong on our side. I tried restarting the nfsstad and lockd initscripts, and then the whole machine, but the same issue persisted. I checked the server using the rpcinfo command, which showed that everything was working fine. I even connected to the daemon using telnet (though I couldn’t talk its language) and confirmed a firewall was not in the way.

I thought that maybe there was something going wrong in the interaction between the client and the server, so I ran a tcpdump to capture all the packets transferred between them. This is where I made a small breakthrough. I found an NFS reply that had returned with SERVFAIL and error code 526. Googling for this error and NetWare generally pointed towards a problem with character sets not getting preserved to the Novell server. There was nothing but ordinary characters on the filesystem, so much for that idea.

I wanted to know exactly what was happening when xauth was trying to lock the file, so I did an strace on it. Here are the last few lines (after xauth mmapped its libraries):

[code]stat("/home/e71377/.Xauthority-c", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
unlink("/home/e71377/.Xauthority-c") = 0
unlink("/home/e71377/.Xauthority-l") = -1 ENOENT (No such file or directory)
open("/home/e71377/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = 3
close(3) = 0
link("/home/e71377/.Xauthority-c", "/home/e71377/.Xauthority-l") = -1 ESERVERFAULT (Unknown error 526)
write(2, "xauth: error in locking authori"..., 65xauth: error in locking authority file /home/e71377/.Xauthority
) = 65
exit_group(1) = ?[/code]

So it appears that this was not a file locking problem at all. xauth was successfully creating the files but it failed when it tried to create a hardlink. Reviewing the code for libXau (AuLock.c) revealed exactly why:

[code lang="c"]while (retries > 0) {
    if (creat_fd == -1) {
        creat_fd = open (creat_name, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (creat_fd == -1) {
            if (errno != EACCES)
                return LOCK_ERROR;
        } else
            (void) close (creat_fd);
    }
    if (creat_fd != -1) {
#ifndef X_NOT_POSIX
        /* The file system may not support hard links, and pathconf should tell us that. */
        if (1 == pathconf(creat_name, _PC_LINK_MAX)) {
            if (-1 == rename(creat_name, link_name)) {
                /* Is this good enough? Perhaps we should retry. TEST */
                return LOCK_ERROR;
            } else {
                return LOCK_SUCCESS;
            }
        } else {
#endif
            if (link (creat_name, link_name) != -1)
                return LOCK_SUCCESS;
            if (errno == ENOENT) {
                creat_fd = -1; /* force re-creat next time around */
                continue;
            }
            if (errno != EEXIST)
                return LOCK_ERROR;
#ifndef X_NOT_POSIX
        }
#endif
    }
    (void) sleep ((unsigned) timeout);
    --retries;
}[/code]

xauth isn’t trying to lock the file through flock() or another file locking mechanism, so NFS locking could not have been the cause. Instead xauth creates a scratch file and then, to make sure it is the only program altering .Xauthority, creates a hard link to it. If the link succeeds then it’s the only program; if the link already exists then another program holds the lock. The failure happened when xauth tried to make the hard link. Interestingly, the code has a fallback that uses rename() when pathconf() reports that the filesystem allows only one link per file, but the strace shows xauth going straight to link(), so the fallback did not help here.
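The link-based locking described above can be sketched in a few lines of shell. This is only an illustration of the idea, not libXau's implementation: the function names `acquire_link_lock` and `release_link_lock` are made up, and the retry and timeout handling from AuLock.c is left out. It relies on the same property, namely that creating a hard link fails if the link name already exists.

```shell
# Hypothetical helpers sketching the hard-link lock dance; not libXau's code.
# acquire_link_lock FILE returns 0 if we now hold the lock, non-zero otherwise.
acquire_link_lock() {
    target=$1
    creat_name="${target}-c"   # scratch file, like .Xauthority-c
    link_name="${target}-l"    # lock name, like .Xauthority-l

    # Create the scratch file (or reuse it if it is already there).
    touch "$creat_name" 2>/dev/null || return 1

    # ln without -f refuses to replace an existing name, so only one
    # caller at a time can create the lock name.
    ln "$creat_name" "$link_name" 2>/dev/null
}

# release_link_lock FILE removes the scratch file and the lock name.
release_link_lock() {
    target=$1
    rm -f "${target}-c" "${target}-l"
}
```

On a filesystem that rejects hard links, `acquire_link_lock` fails every time, which is the situation xauth found itself in.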

It appeared that the NFS server did not support hard links. To test this theory I created several files and attempted to create hard links using ‘cp -l file1 file2’, and they failed in exactly the same way. All I had to do now was explain to the Novell administrator that the problem was not locking, and was in fact that we were mounting a filesystem which did not support hard links on a POSIX compatible system. The Novell share was changed to support hard links (don’t ask me how, I’m not a Novell guy) and everything was working again.
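A quick probe along these lines can be wrapped in a small shell function. This is a sketch under assumptions: the name `supports_hard_links` is made up, and it uses `ln` rather than `cp -l`, which exercises the same link() call.

```shell
# Hypothetical helper: report whether the filesystem holding DIR accepts hard links.
# Returns 0 if a hard link could be created, 1 if not, 2 if DIR is not writable.
supports_hard_links() {
    dir=${1:-.}
    src=$(mktemp "$dir/linktest.XXXXXX") || return 2
    if ln "$src" "$src.link" 2>/dev/null; then
        rm -f "$src" "$src.link"
        return 0
    fi
    rm -f "$src"
    return 1
}

# Example (path is illustrative):
#   supports_hard_links /home/daniel || echo "no hard links here"
```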

Conclusion

The lesson to take away from here is not that hardlinks are required on POSIX, or that xauth doesn’t use file locking but locks itself via a dance of hardlinks. The lesson here is that you should never trust error messages. Take them as a hint, use them as a starting point but do not take them as law. You need to remember that the error message was written by a human and you may not be interpreting it how it was written.

GPG Symmetric Encryption

I often find myself in a situation where I have to exchange some important confidential file with somebody who doesn’t have GPG keys set up. Explaining how to set up keys can be a pain, especially if you believe that the user will lose them or simply forget how to use them. There are all manner of proprietary software packages to deal with this, but this post is about an easy, free way using software that almost anyone has access to. I will be showing you how to do this using GPG on Unix operating systems. For Windows you could follow this guide.

Encrypting

To encrypt a file symmetrically using GPG just run:
[code lang="shell"]gpg --symmetric <filename>[/code]
It will prompt you for a password twice and create a <filename>.gpg file in the current directory. If you want to put the encrypted text in an email then add the --armor flag, which causes gpg to output a <filename>.asc file consisting of ASCII text instead.

Decrypting

You decrypt it like any other GPG encrypted file:
[code lang="shell"]gpg -d <filename>.gpg[/code]
This will prompt you for the password and decrypt the file, printing it to standard output. Add --output <filename> to write the result to a file instead.

Tips

  • Don’t send the password and the attachment over the same medium, especially not in the same message. I suggest you send the email with the file and call and tell them the password.
  • GPG uses really strong encryption, much more secure than that used in zipfile encryption. That said, if you set the password to ‘123’ or ‘password’ no amount of encryption will help you. Your encryption is only as secure as the weakest point.
  • With enough time files like this can be cracked using brute force. You should still do all that you can to prevent the encrypted file falling into the wrong hands.
  • You really should set up GPG keys and publish them to a keyserver. That way you won’t have to worry about secure passphrase distribution.

Random Thought: How did people find the first search engine?

SSH Agent Forwarding

So you use keys to SSH between your hosts, and you either have separate keys for each machine you use, or worse you have the same key on each machine. Let’s go over why each of those is bad, and let’s see how SSH agent forwarding will help with those issues and make things easier for you in general.

The main reason an SSH agent and SSH agent forwarding are so useful is the way keys can be attacked. If I wanted to get your SSH private key I could find some flaw in the system that would give me that /home/you/.ssh/id_rsa file you have. Of course a malicious user with root access to the system could just go in and grab it. You can prevent this kind of attack by setting a passphrase on the key. Of course the root user could replace SSH with a special version designed to get your passphrase, steal the key out of memory or set up a keylogger. This means effectively that your private key is not safe on any system where a person you don’t trust has root access, or which has other users and exploitable vulnerabilities.

Single Private Key on Multiple Machines

In this example you’re trusting the security of every single machine you have your private key on. Should it get compromised then you have to revoke your public key from every host, and regenerate private keys to place on every host. Every time you put your private key on a machine you increase the chances that it could be compromised.

Multiple Private Keys On Multiple Machines

So we’re getting a little closer to a good solution. In this instance we don’t have to generate a new key and roll it out to all hosts in the event of a compromise. You can also segregate your keys into groups: one set of keys for work, another for home and so on. Your keys can still be compromised easily though, and once compromised they can be used until you revoke them manually.

SSH Agent Forwarding

There is a way to keep your key safe from compromise, but first I’ll have to explain how SSH authenticates you using your key. When you’re authenticating with SSH keys your private key is never sent. Instead the server challenges your client to prove it holds the private key by signing some data tied to the session. The server then verifies that signature with the public key. Now the way most people would SSH from the second host to a third host is to use a private key stored on the second host. Unfortunately this method means that you have to store a key (that is open to compromise) on the second host. SSH agent forwarding tells the SSH client on the second host to pass the challenge back to the SSH agent on the first host. The agent signs the data and the result is sent back through the SSH session, via the second host, to the third host.
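The idea that a key holder can prove itself without ever handing over the private key can be illustrated with openssl. This is an analogy only, not the SSH wire protocol: the function name `demo_challenge_response` and the file names are made up for the example, and it assumes the openssl command line tool is installed.

```shell
# Illustration only: sign-and-verify with openssl, not the SSH wire protocol.
# demo_challenge_response prints "Verified OK" when the signature checks out.
demo_challenge_response() {
    work=$(mktemp -d) || return 1

    # "Client" side: a key pair; only pub.pem would ever be given to a server.
    openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
        -out "$work/key.pem" 2>/dev/null
    openssl pkey -in "$work/key.pem" -pubout -out "$work/pub.pem"

    # "Server" side: random data the client must sign.
    openssl rand -hex 32 > "$work/challenge"

    # "Client" (or its agent) signs the challenge with the private key.
    openssl dgst -sha256 -sign "$work/key.pem" \
        -out "$work/sig" "$work/challenge"

    # "Server" checks the signature using only the public key.
    openssl dgst -sha256 -verify "$work/pub.pem" \
        -signature "$work/sig" "$work/challenge"
    status=$?

    rm -rf "$work"
    return $status
}
```

With agent forwarding, the signing step is the only part that happens on the first host; the intermediate host just carries the request and the answer.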

The beauty of this method is that the second host never sees a private key, and the response to the challenge is useless for connecting to a different host. Even if the second host is compromised there isn’t a private key there to steal. It should be noted that if the second host is compromised it can still ask the agent to authenticate to a different host while you are connected, or the session to the third host can be taken over. Both of these are temporary though, and unless the malicious user installs their own key (something easy to notice) they cannot get back in.

Diagram detailing how an SSH connection is authenticated using agent forwarding.


If you want to know more about how this works, there is a wonderful tech tip at http://unixwiz.net/techtips/ssh-agent-forwarding.html.

But how?

SSH agent forwarding is even easier than copying keys all over the place. The first step is to generate keys on all the machines you log on to directly. You need to be sure these machines are secure and that your keys will stay safe, though this is sometimes not possible. You then add the generated public key to the ~/.ssh/authorized_keys file on all the machines you will connect to from this one, including ones that take two or more steps to get to. Finally you edit your ~/.ssh/config file to tell SSH to forward your agent through those hosts. Include the intermediate hosts in this list, but not the endpoints. You could also use SSHmenu to add the arguments automatically to those SSH commands. The following disables forwarding to all hosts, and explicitly enables it for fred and aaron.missgner.com. SSH uses the first value it finds for each option, which is why the specific hosts come before the catch-all.

Host fred
  ForwardAgent yes

Host aaron.missgner.com
  ForwardAgent yes

Host *
  ForwardAgent no
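
Once the config is in place, one way to confirm that forwarding took effect is to log in to an intermediate host and check whether the shell there can see an agent socket. The helper below is a sketch; the name `agent_socket_status` is made up for illustration. It only inspects the SSH_AUTH_SOCK variable, which is set in your session when an agent is available.

```shell
# Hypothetical helper: report whether this shell can see an SSH agent
# (either a local one or one forwarded from the host you logged in from).
agent_socket_status() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "no agent: SSH_AUTH_SOCK is not set"
        return 1
    fi
    if [ ! -S "$SSH_AUTH_SOCK" ]; then
        echo "stale agent: $SSH_AUTH_SOCK is not a socket"
        return 1
    fi
    echo "agent socket: $SSH_AUTH_SOCK"
}
```

Run it on fred after logging in. If it reports a socket, `ssh-add -l` on fred should list the keys held by the agent on your first machine.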

Random thought: Linux has Plug ‘n Pray too, you plug the device in and pray the drivers aren’t proprietary.

Linux ‘top’ Commands

As a sysadmin working with Linux PCs I often need real time data on the status of the systems I manage. For example I might need to know what is using up all the bandwidth on an interface, what’s taking up all the memory or why my X displays are running sluggishly. The informal convention for naming these commands is to add the ‘top’ suffix. Here is a list of my favorite 8 ‘top’ commands.

top

Top, the granddaddy of all the Linux top commands, is most useful for monitoring tasks running on your system. On my Fedora system it’s contained in the procps package, which on Fedora 11 was version 3.2.7. Top has many keybindings to change its behaviour: for example ‘f’ is used to add and remove fields, ‘o’ will help you reorder those fields, and the less-than and greater-than keys move the sort field. You can type ‘h’ for a full list.

tload

You caught me! This one doesn’t end with top, but I put it here because on Fedora it comes as part of the procps package along with top, slabtop and others. tload is a good application to have in a small terminal in the background. It displays a graph of the system load average. I like to have it running in a transparent terminal that I leave open on my laptop.

htop

An improved, menu driven and colourised version of normal top. Htop allows you to get information on each thread of a program or combine all threads like normal top does. Some would argue that it’s more powerful, but others simply say it’s bloated. Whatever you believe, it has some nice features that any sysadmin will appreciate and you’ll soon be wishing htop was available everywhere.

iftop

iftop is to your network interfaces what top is to CPU time. It displays a list of the hosts exchanging the most data over the selected interface. Because of the way it captures packets from the interface it needs root privileges to run.

iotop

iotop displays live system IO statistics. Like top it lists the top applications that are using IO. It can be toggled with the ‘o’ key to only display programs currently performing IO, which is useful on large servers. You can read more about its keybindings on its manpage.

slabtop

slabtop is especially useful for kernel developers and pedantic system tuners. It displays a summary of all the slab objects allocated in the kernel. It can take options to tell it how to display its information, but apart from the keys that change the sort order it has only two commands: the spacebar refreshes the screen and ‘q’ quits. You can see its options on its manpage.

xrestop

For X developers there is a utility called xrestop. xrestop displays a list of X server resources allocated. It can be useful to see if your application, or your X server is leaking resources. While it only accepts the ‘q’ key to exit it does accept a few options.

powertop

Built by Intel to help tune laptops to get the best performance out of your battery. It shows the percentage of time spent in each CPU state and lists the programs and devices that caused the most wake-ups from idle. Its most useful feature though is that it will analyse your system and suggest actions to take to save just that little bit more power.