Tag Archives: System Administration

System Calls

What are System Calls?

Systems calls are the interface between userspace and kernelspace. They are essentially procedures belonging to the operating system that you can call when you want something done that can’t be done in userspace. This includes anything system related, accessing devices, files, managing processes and networking. On my 64bit machine with a 3.5.4 kernel there are 344 system calls (cat /usr/src/linux/arch/x86/syscalls/syscall_64.tbl | egrep -v ‘^#’ | egrep -v ‘^$’ | wc -l) .

Calling procedures is generally a simple affair, just push the current context onto the stack and jump to the new address. Calling a system call on the other hand is quite a bit more difficult. The kernel executes in a higher privileged mode on the CPU and userspace executes on a lower privilege mode. To transfer through to the higher privilege mode a CPU exception is triggered, which then launches in the higher mode, however as you can’t pass arguments through this mechanism they are placed in the CPU registers before the exception. Newer CPUs have implemented SYSENTER/SYSEXIT calls (or similar) that work differently and are designed to speed up the process.

You can see the system calls that a process makes while it is running by using strace. Strace runs a program in such a way that it can intercept all the system calls it makes and gather details about them. Watching a program’s system calls is a good way to see what it is doing, if its hanging on IO, if it is stuck in an infinite loop or if it has deadlocked.

Debugging with strace

Interpreting the output of strace can be daunting, the average binary does many things on startup and shutdown that you’ve probably never coded. Lets dissect an example of stracing the date command:

execve("/bin/date", ["date"], [/* 63 vars */]) = 0
brk(0)                                  = 0xabd000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febdf000

You can see here the initial execution of the date binary. This is done with the execv call, which asks the operating system to replace the current process with the binary you provided and begin execution.

You can see the next thing that happens is a call to brk(0), this is a way to grab the processes current program break. The program break is the location of the end of the data segment (actually its the first location after), moving the program break up has the effect of allocating more memory to the process, and moving it down has the effect of deallocating it. This is the way malloc allocates memory.

Then mmap is called to allocate 4096 bytes at a location of the kernels choosing (addr is NULL). Reads and writes are allowed, the memory is anonymous (not backed by a file) and private to this process. This initial memory is associated with the loading of dynamic libraries by the dynamic linker.

access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=143007, ...}) = 0
mmap(NULL, 143007, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f13febbc000
close(3)                                = 0

So this is the dynamic linker running trying to find all the libraries that out program requires. The first thing it does is try to load any libraries that may be listed in /etc/ld.so.preload, this is the same way the LD_PRELOAD environment variable works, except backed by a file.

Next it opens /etc/ld.so.cache, checks its file attributes, maps the file to memory, then closes the file. This file is created by the command ldconfig based on the file /etc/ld.so.conf and other files it includes (its common to include /etc/ld.so.conf.d/*.conf). The /etc/ld.so.cache file contains, in machine readable format, a list of all the libraries in the previously mentioned files and their location.

open("/lib64/librt.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`\"\240\2356\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=48128, ...}) = 0
mmap(0x369da00000, 2128984, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x369da00000
mprotect(0x369da07000, 2093056, PROT_NONE) = 0
mmap(0x369dc06000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x369dc06000
close(3)                                = 0

Here we are loading the librt library. The librt library is a part of glibc which provides realtime extensions. You can see here that the open call returns with file descriptor 3. Then in the mmap call you can see it mapping the library with PROT_READ and PROT_EXEC permissions, keeping it private and denying writes. Loading things this way means that if another program loads the same library it can share the same page in physical ram. It then protects a portion of that libraries memory from any read or write. It maps another section of the file as read write, but not written to the file, and finally closes the file descriptor.

open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260\27\242\2346\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2076800, ...}) = 0
mmap(0x369ca00000, 3896632, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x369ca00000
mprotect(0x369cbad000, 2097152, PROT_NONE) = 0
mmap(0x369cdad000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ad000) = 0x369cdad000
mmap(0x369cdb3000, 17720, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x369cdb3000
close(3)                                = 0

This is basically the same thing as the above library, except it is the libc library. The libc library provides the methods specified in the ANSI C and POSIX C libraries.

open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320k\340\2346\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=145176, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febbb000
mmap(0x369ce00000, 2208760, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x369ce00000
mprotect(0x369ce17000, 2093056, PROT_NONE) = 0
mmap(0x369d016000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x369d016000
mmap(0x369d018000, 13304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x369d018000
close(3)                                = 0

Again, another library being loaded into memory. This time its the libpthread library, which provides POSIX thread support.

mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febba000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febb9000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febb8000

Here we are mapping three pages of memory using mmap. This memory is not backed by a file and is private to the process.

arch_prctl(ARCH_SET_FS, 0x7f13febb9700) = 0
mprotect(0x60d000, 4096, PROT_READ)     = 0
mprotect(0x369dc06000, 4096, PROT_READ) = 0
mprotect(0x369cdad000, 16384, PROT_READ) = 0
mprotect(0x369d016000, 4096, PROT_READ) = 0
mprotect(0x369c821000, 4096, PROT_READ) = 0
munmap(0x7f13febbc000, 143007)          = 0
set_tid_address(0x7f13febb99d0)         = 4110
set_robust_list(0x7f13febb99e0, 24)     = 0
rt_sigaction(SIGRTMIN, {0x369ce06720, [], SA_RESTORER|SA_SIGINFO, 0x369ce0f500}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x369ce067b0, [], SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x369ce0f500}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0

These lines are likely the setup of the threading for the application. arch_prctl is being used to set the FS register, which the pthread library uses to store a pointer to a struct containing the thread local information for that thread. mprotect is being called on all the memory that above was made protected from both reads and writes to make it read only. Some memory is unmapped to return it to the system with munmap. set_tid_address is called to set the tread id, and get the pid of the process (4110).

We then see a bunch of calls to rt_sigaction and a rt_sigprocmask which is basically plumbing all the signals needed for threading. Finally getrlimit is used to see that the limits on stack sizes are.

brk(0)                                  = 0xabd000
brk(0xade000)                           = 0xade000
brk(0)                                  = 0xade000

Malloc() isn’t a system call, instead it uses other calls such as brk() and mmap() to create memory. Because a calls to these methods are expensive libc batches them up. So malloc() when initially called allocates a bigger chunk of memory than you asked for, and as you ask for further bits it give you chunks of the same memory you allocated. What you can see here is more than likely a malloc() call checking the current program break (the end of the heap), then moving it ahead (to increase the heap size) then checking where it is again.

open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=104997456, ...}) = 0
mmap(NULL, 104997456, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f13f8795000
close(3)                                = 0

So all the library loading and setup is now done. What we are doing here is loading the locales file. This will be done by most things that use libc, but its particularly important for the date command as date formatting changes drastically in different locales. Basically we open a file descriptor, stat it (usually to find size, permissions and such), map the file into memory, then closes the file descriptor. Because mmap has been used to get the file into memory we don’t need to use the read() and write() calls to access it. This will prevent an expensive system call, but the first time you access each page of the data a page fault will be generated, which could be just as expensive.

open("/etc/localtime", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=2183, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=2183, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febde000
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"..., 4096) = 2183
lseek(3, -1398, SEEK_CUR)               = 785
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\0"..., 4096) = 1398
lseek(3, 2182, SEEK_SET)                = 2182
close(3)                                = 0
munmap(0x7f13febde000, 4096)            = 0

Here we can see date is opening the /etc/localtime file. This file contains information about the current timezone (it is often a symlink to the correct file). You can see here, the file is opened, read, a couple of seeks happen (to find particular entries) and the file is read again from the new point. Finally having retrieved the required information from the file, it is closed and its memory is unmapped.

fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f13febde000</pre>
write(1, "Thu Sep 27 16:48:57 EST 2012\n", 29Thu Sep 27 16:48:57 EST 2012
 ) = 29

At this point the date command has all that is required to calculate the date, its text representation and display it. You’ll notice that there has been no call to the operating system to fetch the time, this is because modern x86 machines include a special method called vsyscalls that allows userspace programs to make certain system calls without ever context switching. One of the system calls implemented via this method is the one to get the current unix timestamp, so date doesn’t need to call into kernelspace for this.

close(1)                                = 0
munmap(0x7f13febde000, 4096)            = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++

STDOUT and STDERR are closed and the process group is ended. The date command is now finished.

Ansible Talk @ Infra Coders

Here are the notes that I used in my talk at Infrastructure Coders. Each section was also put on the screen as a ‘slide’. The configuration that I used in the demo is available at GitHub. A full video of the meetup is available on the Infrastructure Coders Youtube Channel, my talk starts at 25:05.

Ansible
--------

0. There is nothing in the hat
  - Start a RHEL install
    - Cmd line: console=ttyS0 ks=http://admin01/ns3.cfg
    - If you want to follow the demos grab the ansible config from my github
    - You will need to substitute hostnames in the ansible hosts file
    - You should copy the firewall config from ns2 (remove port 647 if paranoid)

1. The problem
  - Ansible is the combination of several functions
  - There was a plan to build config management on func
  - However func is a pain to setup
  - Puppet and Chef have a steep learning curve
  - Ansible was also built to simplfy complexrelease procedures
  - You need to know ruby to extend Puppet/Chef

2. Ansible
  - Designed so you can figure out how to use it in 15 minutes
  - Designed to be easy to setup
  - Doesn't require much to be installed on the managed host
  - Designed to do config management/deployment/ad hoc
  - Other people do security better, just use SSH
  - You can extend ansible in any language that can output JSON

3. Simple Ansible Demo
  - Ansible hosts file
  - Ansible can be run directly on the command line
    - Run cat /etc/redhat-release
    - Get info using the setup module
  - It can prompt for auth, or use key based auth
    - On the new machine show it prompting
    - Run the rhelsetup script on the new machine
    - Install vim-enhanced

4. Playbooks
  - This is the method of scripting Ansible
  - Done in YAML
  - Executed in order *gee thanks puppet*
  - Designed to be as easy as possible

5. Example playbook
  - Playbook for the name servers
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/zones.play
    - Can have multiple plays in a book
    - Can serialise if you dont want all to be down at once
  - Template config for the name servers
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/zones/named.conf.j2
  - Firewall install script
    - https://github.com/smarthall/tildaslash.com/blob/master/playbooks/firewall.play

6. My thoughts
  - Config management has been around a while, its going from art to science
  - Ansible covers more ground than puppet and chef do
  - Ansible doesn't compromise on simplicity to do that
  - I don't have to focus on the nodes, I can focus on services
  - There is something missing
    - Disk config is done in kickstarts
    - Network config can't be done by Ansible
    - Need to find a way to cover both with one

The playlist of all the videos is available at Youtube.

Error messages aren’t perfect

When diagnosing a problem with a complex system such as Linux you sometimes need to step back, stop what you’re doing and take a different approach. Usually when a program fails on Linux you will get some kind of error message, traceback or coredump. Most people prefer to see some kind of error message rather than the latter two..

Tracebacks and coredumps are computer generated, which makes them more accurate then error messages, but harder for humans to understand. Error messages however are put in place by the programmer which means they can occasionally be misleading, inaccurate, ambiguous or just plain wrong. This is not always the programmers fault, sometimes its hard to describe exactly what went wrong. Other times the error describes the situation perfectly, but the sysadmin jumps to a different conclusion based on his circumstances.

Example

Some time ago we had some users complaining about a problem when trying to use X Forwarding via SSH. On this server /home was mounted off a Novell NetWare NFS share. They were getting the following output and were unable to run X11 applications.

xauth: error in locking authority file /home/daniel/.Xauthority

Seeing this error I assumed that something was going wrong with the locking mechanism of NFS. I tried mounting the NFS share with the explicit lock option, but the same error remained. I tried explicitly giving the sync option too, but to no avail. I ended up trying many different NFS options until eventually I gave up and asked the Novell administrators to check their servers. I was convinced that something on their end was causing this locking error.

The Novell administrator responded that they could see nothing wrong on their end. This must mean that something was wrong on our side. I tried restarting the nfsstad and lockd initscripts and the whole machine but once again the same issue persisted. I checked the server using the rpcinfo command, which showed that everything was working fine. I even connected to the daemon using telnet (though I couldn’t talks its language) and confirmed a firewall was not in the way.

I thought that maybe there was something going wrong in the interaction between the client and the server, so I ran a tcpdump to capture all the packets transferred between them. this is where I made a small breakthrough. I found a NFS reply that had returned with SERVFAIL and error code 526. Googling for this error and Netware generally pointed towards a problem with character sets not getting preserved to the Novell server. There was nothing but ordinary characters on the filesystem, so much for that idea.

I wanted to know exactly what was happening when xauth was trying to lock the file, so I did an strace on it. Here are the last few lines (after xauth mmaped its libraries).:

stat("/home/e71377/.Xauthority-c", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
unlink("/home/e71377/.Xauthority-c")    = 0
unlink("/home/e71377/.Xauthority-l")    = -1 ENOENT (No such file or directory)
open("/home/e71377/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = 3
close(3)                                = 0
link("/home/e71377/.Xauthority-c", "/home/e71377/.Xauthority-l") = -1 ESERVERFAULT (Unknown error 526)
write(2, "xauth:  error in locking authori"..., 65xauth:  error in locking authority file /home/e71377/.Xauthority
) = 65
exit_group(1)                           = ?

So it appears that this was not a file locking problem at all. xauth was successfully creating the files but it failed when it tried to create a hardlink. Reviewing the code for libXau (AuLock.c) revealed exactly why:

    while (retries > 0) {
        if (creat_fd == -1) {
            creat_fd = open (creat_name, O_WRONLY | O_CREAT | O_EXCL, 0600);
            if (creat_fd == -1) {
                if (errno != EACCES)
                    return LOCK_ERROR;
            } else
                (void) close (creat_fd);
        }
        if (creat_fd != -1) {
#ifndef X_NOT_POSIX
            /* The file system may not support hard links, and pathconf should tell us that. */
            if (1 == pathconf(creat_name, _PC_LINK_MAX)) {
                if (-1 == rename(creat_name, link_name)) {
                    /* Is this good enough?  Perhaps we should retry.  TEST */
                    return LOCK_ERROR;
                } else {
                    return LOCK_SUCCESS;
                }
            } else {
#endif
                if (link (creat_name, link_name) != -1)
                    return LOCK_SUCCESS;
                if (errno == ENOENT) {
                    creat_fd= -1;       /* force re-creat next time around */
                    continue;
                }
                if (errno != EEXIST)
                    return LOCK_ERROR;
#ifndef X_NOT_POSIX
           }
#endif
        }
        (void) sleep ((unsigned) timeout);
        --retries;
    }

xauth isn’t trying to lock the file through flock() or another file locking method, which means that it is not the cause. Instead xauth is creating a file, and then to make sure it is the only program altering .Xauthority it creates a link. If the link succeeds then its the only program, if not then another program has the lock. The problem happens when xauth tries to make the hardlink. Interestingly there is a fallback for non-POSIX systems, but as RHEL is POSIX compatible it is not used.

It appeared that the NFS server did not support hard links. To test this theory I created several files, and attempted to create hard links using ‘cp -l file1 file2′. and they failed in the exact same way. All I had to do now was explain to the Novell Administrator that the problem was not locking, and was in fact that we were mounting a filesystem which did not support hard links on a POSIX compatible system. The Novell share was changed to support hard links (don’t ask me how, I’m not a Novell guy) and everything was working again.

Conclusion

The lesson to take away from here is not that hardlinks are required on POSIX, or that xauth doesn’t use file locking but locks itself via a dance of hardlinks. The lesson here is that you should never trust error messages. Take them as a hint, use them as a starting point but do not take them as law. You need to remember that the error message was written by a human and you may not be interpreting it how it was written.

GPG Symmetric Encryption

I often come into a situation where I have to exchange some important confidential file with somebody who doesn’t have GPG keys setup. Explaining how to setup keys can be a pain, especially if you believe that the user will lose them or simply forget how to use them. There are all manner of propriety software packages to deal with this but this post is about an easy free way using software that almost anyone has access to. I will be showing you how to do this using GPG on Unix operating systems. For windows you could follow this guide.

Encrypting

To encrypt a file symmetrically using GPG just run:

gpg --symmetric <filename>

It will prompt you for a password twice and create a <filename>.gpg file in the current directory. If you want to put the encrypted text in an email then add the –armour flag. The –armour flag will cause gpg to instead output a <filename>.asc file which consists of ASCII text.

Decrypting

You decrypt it like any other GPG encrypted file:

gpg -d <filename>.gpg

This will prompt you for the password and decrypt the file, printing it to standard out.

Tips

  • Don’t send the password and the attachment over the same medium, especially not in the same message. I suggest you send the email with the file and call and tell them the password.
  • GPG uses really strong encryption, much more secure than that used in zipfile encryption. That said if you set the password to ’123′ or ‘password’ no amount of encryption will help you. Your encryption is only as secure as the weakest point.
  • With enough time files like this can be cracked using brute force. You should still do all that you can to prevent the encrypted file falling into the wrong hands.
  • You really should setup GPG keys and publish them to a keyserver. That way you won’t have to worry about secure passphrase distribution.

Random Thought: How did people find the first search engine?

SSH Agent Forwarding

So you use keys to SSH between your hosts, and you either have separate keys for each machine you use, or worse you have the same key on each machine. Lets go over why each of those are bad, and lets see how SSH Agent forwarding will help with those issues and make things easier for you in general.

So the key part of why a SSH agent and SSH agent forwarding forwarding is so useful is due to the way keys can be attacked. If I wanted to get your SSH private key I could find some flaw in the system that would give me that /home/you/.ssh/id_rsa file you have. Of course a malicious user with root access to the system could just go in and grab it. You can prevent this kind of attack by setting a passphrase on the key. Of course the root user could replace SSH with a special version designed to get your passphrase, steal the key out of memory or setup a keylogger. This means effectively that your private key is not safe on any system where a person you don’t trust has root access, or has other users and exploitable vulnerabilities.

Single Private Key on Multiple Machines

In this example you’re trusting the security of every single machine you have your private key on. Should it get compromised then you have to revoke you public key from every host, and regenerate private keys to place on every host. Every time you put your private key on a machine you increase the chances that it could be compromised.

Multiple Private Keys On Multiple Machines

So we’re getting a little closer to a good solution. In this instance we don’t have to generate our key and roll it out to all hosts in event of a compromise. You can also have segregate groups, on set of keys for work, another for home and so on. Your keys can still be compromised easily though, and once compromised they can be used until you revoke them manually.

SSH Agent Forwarding

There is a way to keep your key safe from compromise. Now I’ll have to explain how SSH authenticates you using your key. When your authenticating with SSH keys your key isn’t sent, the server sends you some random data and challenges your client to encrypt it with your private key. It then verifies the encrypted data by decrypting it with the public key and checking if it matches the data originally sent. Now the way most people would SSH from the second host to another third host is to utilise a private key on the second host to connect to the third host. Unfortunately this method means that you have to store a key (that is open for compromise) on the second host. SSH agent forwarding tells the SSH client on the second server to send the challenge data through to the SSH client (or ssh agent) on the first host. The agent encrypts the data and sends it via the SSH session to the third client.

The beauty of this method is that the second host never sees a private key, and the challenge data is useless to try and connect to a different host. Even if the second host is compromised there isn’t a private key there to compromise. It should be noted that if the second host is compromised it can still request the agent identify for a different host, or the session to the third host can be taken over. Both these are temporary though and unless the malicious user installs their key (something easy to notice) they cannot get back in.

Diagram detailing how an SSH connection is authenticated using agent forwarding.

Diagram detailing how an SSH connection is authenticated using agent forwarding.

If you want to know more about how this works, there is a wonderful tech tip at http://unixwiz.net/techtips/ssh-agent-forwarding.html.

But how?

SSH agent forwarding is even easier than copying keys all over the place. The first step is to generate keys for all the machines you log on to directly. You need to be sure these machines are secure and that your keys will stay safe, though this is sometimes not possible. You then add the generated public key to the authorized hosts file of all the machines you will connect to from this one, including ones that take two or more steps to get to. Finally you edit your ~/.ssh/ssh_config file to tell SSH to forward your agent through those hosts. Include the intermediate hosts in this list, but not the endpoints. You could also use SSHmenu to add the arguments automatically to those SSH commands. The following disables forwarding to all hosts, and explicitly enables it to fred, and aaron.missgner.com.

Host fred
  ForwardAgent yes

Host aaron.missgner.com
  ForwardAgent yes

Host *
  ForwardAgent no

Random thought: Linux has Plug ‘n Pray too, you plug the device in and pray the drivers aren’t proprietary.