I have been knee-deep this week debugging a rather complex DNS issue. I’ll do a full write-up on that next week. While I was debugging the issue I needed to fire up tcpdump to watch the DNS queries from one of my authoritative servers to various servers on the Internet. What I noticed when I fed the data into Wireshark were periods of time with no data, and I wasn’t quite sure why at first.
Based on what I could find on the tcpdump and BPF sites, when tcpdump is busy processing existing data and cannot take packets captured by BPF out of the queue fast enough, the kernel drops them. If this occurs, the “packets dropped by kernel” count in tcpdump’s exit summary becomes non-zero:
$ tcpdump -w dns-capture.cap -s 1520 -ttt -vvv -i bond0 port 53
......
9559 packets captured
12533 packets received by filter
2974 packets dropped by kernel
I started to do some digging to see why tcpdump couldn’t keep up, and after a bit of profiling I noticed that the program was spending an excessive amount of time resolving IPs to names. This processing was stalling the program from reading more data from the queue, and resulted in packets being dropped. Once I ran tcpdump with the “-n” (do not resolve IPs to names) option I no longer experienced this issue:
$ tcpdump -w dns-capture.cap -s 1520 -ttt -vvv -n -i bond0 port 53
.....
9339 packets captured
9339 packets received by filter
0 packets dropped by kernel
This stopped gaps from occurring in my Wireshark display, and since Wireshark can resolve IPs to names all was well. It’s really crazy how you can start debugging one issue and wind up debugging 3 - 4 more prior to solving the original problem. Debugging issues is definitely fun, and I guess it gives me plenty to write about. :) This past week I’ve learned more about the DNS protocol and the various implementations than I have in the past 10 years. It’s amazing how many cool nuggets of information are buried in the various DNS RFCs!
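If you run captures from a script, it can help to check the drop counter automatically rather than eyeballing the summary. Here is a minimal sketch of that idea; the function name dropped_by_kernel and the summary.txt file are my own placeholders, and it assumes the summary lines look like the ones shown above:

```shell
#!/bin/sh
# dropped_by_kernel (hypothetical helper): read tcpdump's exit summary on
# stdin and print the number from the "packets dropped by kernel" line.
# Prints 0 if that line is not present.
dropped_by_kernel() {
    awk '/packets dropped by kernel/ { n = $1 } END { print n + 0 }'
}

# Possible usage (tcpdump writes its summary to stderr, so save it first):
#   tcpdump -n -w dns-capture.cap -i bond0 port 53 2> summary.txt
#   drops=$(dropped_by_kernel < summary.txt)
#   [ "$drops" -eq 0 ] || echo "warning: kernel dropped $drops packets" >&2
```

A non-zero result is a hint that the capture has gaps and that the data in the .cap file should be treated with some suspicion.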
This past week I needed to compare the contents of two directories to see if there were any differences. There are a TON of ways to do this, though my preferred way is to use diff with the “-r” (when comparing directories do so recursively) option to compare two folders:
$ find foo1
foo1
foo1/services2
foo1/services
$ find foo2
foo2
foo2/services
$ diff -r foo1 foo2
Only in foo1: services2
Simple, easy, and it gives you the output you’re most likely after. Has anyone found a simpler solution than this? :)
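When the comparison runs inside a script, diff’s exit status is often more convenient than its output: 0 means the trees are identical, 1 means differences were found, and 2 means diff ran into trouble. The sketch below rebuilds the foo1 / foo2 layout from the example in a scratch directory; compare_dirs is a name I made up, and the “-q” option tells diff to only report which files differ:

```shell
#!/bin/sh
# compare_dirs (hypothetical helper): recursively compare two directories.
# Prints diff's brief (-q) report and returns diff's exit status:
#   0 = identical, 1 = differences found, 2 = trouble (e.g. missing directory).
compare_dirs() {
    diff -rq "$1" "$2"
}

# Recreate the layout from the example in a scratch directory.
workdir=$(mktemp -d)
mkdir "$workdir/foo1" "$workdir/foo2"
touch "$workdir/foo1/services" "$workdir/foo1/services2" "$workdir/foo2/services"

# The else branch covers both "differences found" and "diff had trouble".
if (cd "$workdir" && compare_dirs foo1 foo2); then
    echo "directories match"
else
    echo "directories differ"
fi
```

Since foo1 contains services2 and foo2 does not, the else branch is the one that fires here.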
Periodically I need to start a service, but only if it’s not currently running. Other times I need to restart services on a machine, but only if they are currently running. Services may have been started on the system at boot, manually by an admin, or through a system-wide management infrastructure. They may also have been disabled on a server for one reason or another. Most Red Hat Linux rc initialization scripts have a “condrestart” target to help facilitate conditionally restarting a service, which can be useful when you need to conditionally restart services across dozens of machines.
The conditional restart logic is usually implemented as a test similar to this (I took this from the dnsmasq init script):
condrestart)
    # Restart only if pidfileofproc reports a PID, i.e. dnsmasq is running
    if test "x`pidfileofproc dnsmasq`" != x; then
        stop
        start
        RETVAL=$?
    fi
    ;;
This allows the script to restart the service if it’s currently running, but you could alter this behavior to start or restart a service if it’s not running (you can get a bit more fancy and check the result of pidof to see if the process is indeed running):
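condrestart)
    # No subsys lock file means the service is not running; grouping stop
    # and start keeps a failed stop from preventing the start
    [ ! -e /var/lock/subsys/myservice ] && { stop; start; }
    ;;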
This isn’t something you will use every day, but it’s rather handy when you need it. It also helps to know what will happen when you use the “condrestart” logic baked into various system init scripts. If you develop init scripts for services other than the defaults that ship with CentOS, Fedora or RHEL Linux, you may be interested in this!
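To make the two behaviors easier to compare side by side, here is a small self-contained sketch. Everything in it is a placeholder of mine rather than something from a shipping init script: the dispatch function, the “condstart” target name, the LOCKFILE variable and the stub start and stop functions, which only touch and remove the lock file:

```shell
#!/bin/sh
# Sketch of the two conditional targets. LOCKFILE, start and stop are
# placeholders; a real init script defines its own versions.
LOCKFILE=${LOCKFILE:-/var/lock/subsys/myservice}

start() { echo "starting myservice"; touch "$LOCKFILE"; }
stop()  { echo "stopping myservice"; rm -f "$LOCKFILE"; }

dispatch() {
    case "$1" in
        condrestart)
            # Restart only if the service is already running.
            if [ -e "$LOCKFILE" ]; then
                stop
                start
            fi
            ;;
        condstart)
            # Hypothetical target: start only if the service is not running.
            if [ ! -e "$LOCKFILE" ]; then
                start
            fi
            ;;
        *)
            echo "Usage: $0 {condrestart|condstart}" >&2
            return 2
            ;;
    esac
}
```

In a real script you would finish with something like dispatch "$1", and the lock file test could be swapped for a pidof check if you want to confirm the process is actually alive.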
When I was studying for my RHCE exam, I came across a number of references to Red Hat’s satellite server and its open source spacewalk counterpart. To dig into these products a bit more, I recently attended Red Hat’s deployment and systems management class. I’ve been using satellite server for the past two years, and it’s actually a really useful tool for managing configuration data and systems updates in data centers that solely run Red Hat Enterprise Linux.
The commercial version of satellite server will set you back some serious cheddar, but luckily for us satellite server is based on the open source spacewalk implementation. Spacewalk seems to have a pretty decent following, and it provides an easy way to manage CentOS and Fedora machines.
There are roughly a dozen systems management products that I want to test out this year, several of which are wrappers around puppet. Once I give spacewalk a good beating, I am going to shift gears and start looking at The Foreman. The Foreman appears to have everything I’ve been looking for in a configuration management and provisioning suite, and the fact that it can provision Solaris hosts intrigues me. Only time will tell if it works though!
Which packages or tools are you using to manage (provisioning, configuration management, monitoring) your Linux hosts?
I periodically need to retrieve new CentOS and Fedora releases. Sometimes I need to snag CDs (I still support machines without DVD drives), and in other cases I need DVDs. Typically when I’m playing around with new releases I grab both, and use wget to retrieve them all at once. If you pass wget an FTP URL that ends in a *, it will retrieve all of the files in the directory the URL points to. Here is an example (the URL is quoted so the shell leaves the * alone):
$ wget 'ftp://foo/pub/centos/5.6/isos/x86_64/*'
This will grab all of the files in the x86_64 directory and place them in the current working directory. You can also disable prompting and use the mget command with your favorite FTP client, but I find wget to be a bit more versatile since you can use the “-c” option to continue failed transfers. What is your favorite method to retrieve files?
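Since I usually want the glob and the “-c” option together, a tiny wrapper saves some typing. This is only a sketch: fetch_isos and the WGET override variable are names I invented, and “foo” is the same placeholder host used above. Setting WGET to "echo wget" lets you preview the command without touching the network:

```shell
#!/bin/sh
# fetch_isos (hypothetical helper): resumable download of every file that
# matches an FTP glob. Usage: fetch_isos URL [DESTINATION_DIR]
# WGET may be overridden (e.g. WGET="echo wget") to preview the command.
fetch_isos() {
    url=$1
    dest=${2:-.}
    # -c resumes partially downloaded files; -P sets the download directory.
    # The URL stays quoted so wget, not the shell, expands the "*".
    ${WGET:-wget} -c -P "$dest" "$url"
}

# Example call:
#   fetch_isos 'ftp://foo/pub/centos/5.6/isos/x86_64/*' /tmp/isos
```

If a transfer dies halfway through, rerunning the same call picks up where the partial files left off.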