This week I encountered a weird issue while developing a new Ganglia plug-in. After moving my Ganglia processes to a docker container I noticed that the grid overview images weren't displaying. This was a new Ganglia installation, so I figured I had typo'ed something in the gmetad.conf configuration file. I reviewed the file in my git repo and everything looked perfectly fine. Any Apache errors? Gak!:
$ tail -6 /var/log/httpd/error_log
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
Well, those messages are super helpful lol. Something is being run through /bin/sh and it's missing a matching single quote. But what command or script? Luckily I follow the world-famous performance expert Brendan Gregg's blog, so I was aware of the Linux eBPF and bcc project. If you aren't familiar with this amazing project, here is a short blurb from its website:
“BCC is a toolkit for creating efficient kernel tracing and manipulation programs, and includes several useful tools and examples. It makes use of extended BPF (Berkeley Packet Filters), formally known as eBPF, a new feature that was first added to Linux 3.15. Much of what BCC uses requires Linux 4.1 and above.”
I knew Brendan had ported a large number of the DTraceToolkit scripts to bcc, so I went searching for execsnoop to see what was being executed:
$ ./execsnoop
PCOMM PID PPID RET ARGS
sh 18748 18350 0 /usr/bin/rrdtool
rrdtool 18748 18350 0 /usr/bin/rrdtool
sh 18749 18350 0 /usr/bin/rrdtool graph /dev/null --start -3600s --end now DEF:avg='/ganglia/rrds// __SummaryInfo__/cpu_num.rrd':'sum':AVERAGE PR
Bingo! Now I have a lead, but it appears execsnoop limits the arguments displayed (NTB: need to figure out why). So let's go to the Ganglia source code to see what's up:
$ cd /usr/share/ganglia
$ grep rrdtool *.php
*** lots of output ***
In the output above I noticed several exec() statements in graph.php so I decided to begin there. Starting from the main code block and reading down I came across the following:
if ($debug) {
    error_log("Final rrdtool command: $command");
}
Digging further I noticed that debug output (including the full string that is being passed to the shell) could be set through a query string variable:
$debug = isset($_GET['debug']) ? clean_number(sanitize($_GET["debug"])) : 0;
Sweet! Adding “&debug=1” to the end of the ganglia URL got me a useful log entry in my error_log:
[Fri Oct 07 13:18:01.429310 2016] [:error] [pid 19884] [client 192.168.1.178:51124] Final rrdtool command: /usr/bin/rrdtool graph - --start '-3600s' --end now --width 650 --height 300 --title 'Matty\'s Bodacious Grid Load last hour' --vertical-label 'Loads/Procs' --lower-limit 0 --slope-mode DEF:'a0'='/ganglia/rrds//__SummaryInfo__/load_one.rrd':'sum':AVERAGE DEF:'a1'='/ganglia/rrds//__SummaryInfo__/cpu_num.rrd':'num':AVERAGE DEF:'a2'='/ganglia/rrds//__SummaryInfo__/cpu_num.rrd':'sum':AVERAGE DEF:'a3'='/ganglia/rrds//__SummaryInfo__/proc_run.rrd':'sum':AVERAGE AREA:'a0'#BBBBBB:'1-min' VDEF:a0_last=a0,LAST VDEF:a0_min=a0,MINIMUM VDEF:a0_avg=a0,AVERAGE VDEF:a0_max=a0,MAXIMUM GPRINT:'a0_last':'Now\:%5.1lf%s' GPRINT:'a0_min':'Min\:%5.1lf%s' GPRINT:'a0_avg':'Avg\:%5.1lf%s' GPRINT:'a0_max':'Max\:%5.1lf%s\l' LINE2:'a1'#00FF00:'Nodes' VDEF:a1_last=a1,LAST VDEF:a1_min=a1,MINIMUM VDEF:a1_avg=a1,AVERAGE VDEF:a1_max=a1,MAXIMUM GPRINT:'a1_last':'Now\:%5.1lf%s' GPRINT:'a1_min':'Min\:%5.1lf%s' GPRINT:'a1_avg':'Avg\:%5.1lf%s' GPRINT:'a1_max':'Max\:%5.1lf%s\l' LINE2:'a2'#FF0000:'CPUs ' VDEF:a2_last=a2,LAST VDEF:a2_min=a2,MINIMUM VDEF:a2_avg=a2,AVERAGE VDEF:a2_max=a2,MAXIMUM GPRINT:'a2_last':'Now\:%5.1lf%s' GPRINT:'a2_min':'Min\:%5.1lf%s' GPRINT:'a2_avg':'Avg\:%5.1lf%s' GPRINT:'a2_max':'Max\:%5.1lf%s\l' LINE2:'a3'#2030F4:'Procs' VDEF:a3_last=a3,LAST VDEF:a3_min=a3,MINIMUM VDEF:a3_avg=a3,AVERAGE VDEF:a3_max=a3,MAXIMUM GPRINT:'a3_last':'Now\:%5.1lf%s' GPRINT:'a3_min':'Min\:%5.1lf%s' GPRINT:'a3_avg':'Avg\:%5.1lf%s' GPRINT:'a3_max':'Max\:%5.1lf%s\l'
Running that manually from a shell prompt generated the same error as the one I originally found! Excellent, now I know exactly what is causing the error. I opened the output above in vim and searched for a single quote. That highlighted all single quotes and my eyes immediately picked up the following:
--title 'Matty\'s Bodacious Grid Load last hour'
Oh snap, I found a gem here. A LOOOONG time ago, in a shell scripting book (possibly UNIX Shell Programming?) far, far away, I remembered reading that you can never have a single quote inside a pair of single quotes. Googling this turned up the POSIX standard, which states:
“A single-quote cannot occur within single-quotes.”
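You can reproduce the Apache error from a shell prompt, and the standard workaround is to close the single-quoted string, insert an escaped quote and then re-open the quotes:
$ sh -c "echo 'Matty's grid'"
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
$ sh -c "echo 'Matty'\''s grid'"
Matty's grid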
When I removed the single quote after Matty and re-ran rrdtool, it ran without error and spit out a nice PNG. So could that be the issue? I removed the single quote from the gridname directive in gmetad.conf, bounced gmetad and, lo and behold, all of my graphs started showing up. Now to finish up my plug-in.
I’m a long time console guy and haven’t found a graphical Python development tool that suits my needs as well as vim. It takes a couple of amazing vim plug-ins but the experience is great IMHO. I’m currently using the vim-jedi, pylint and vim-syntastic plug-ins which can be installed with yum on CentOS/Fedora machines:
$ sudo yum install pylint
$ sudo yum install vim-jedi
$ sudo yum install vim-syntastic-python
To enable syntax highlighting and auto indentation you can add the following to your $HOME/.vimrc:
syntax on
filetype indent plugin on
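If you want syntastic to run pylint explicitly, a couple of extra lines like the ones below may help. The option names come from the syntastic documentation, so double check them against the plug-in version your distro ships:
" use pylint as the syntax checker for Python buffers
let g:syntastic_python_checkers = ['pylint']
" check buffers when they are opened, not just when they are written
let g:syntastic_check_on_open = 1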
Once these are in place your code will be highlighted with a nice set of colors, hitting return will generate the correct spacing and pylint will be run automatically to note problems with your code. If you are using other plug-ins let me know via a comment.
While automating a process this week I needed a way to get the serial numbers off a batch of tape drives. At first I thought I could retrieve this information through /sys, but after a bunch of poking around with cat, systool and udevadm I realized I couldn't get what I wanted through /sys. One such failure:
$ udevadm info -a -n /dev/nst0 | grep -i serial
If I can't get the serial # through /sys I can always poke the drive directly, right? Definitely! SCSI vital product data (VPD) is stored on the drive in a series of VPD pages. These pages can be viewed with the sg_vpd utility:
$ sg_vpd --page=0x80 /dev/nst0
VPD INQUIRY: Unit serial number page
Unit serial number: 123456789012
The command above retrieves VPD page 0x80 (Unit Serial Number) and displays it in an easily parsed format. Seagate's SCSI Commands Reference Manual is an excellent resource for understanding the SCSI protocol.
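Since I needed serials for a whole batch of drives, a small loop takes care of the rest. This is just a sketch; it assumes the drives show up as /dev/nst0, /dev/nst1, and so on, and that sg_vpd prints the "Unit serial number:" line shown above:
$ for dev in /dev/nst*; do
      # grab the value after "Unit serial number:" for each drive
      serial=$(sg_vpd --page=0x80 "$dev" | awk -F': *' '/Unit serial number:/ {print $2}')
      echo "$dev: $serial"
  done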
I'm a long-time admirer of Bob Ross and the amazing paintings he produced on his hit TV show, The Joy of Painting. I'm equally a fan of udev and the power it places in administrators' hands. While Bob painted amazing clouds, seascapes and mountains with a swipe of his brush, I'm able to make /dev just as beautiful with a few keystrokes (I guess that makes my keyboard my easel).
This past weekend I decided to clean up and automate the device creation process on several of my database servers. These servers had hundreds of mapper devices which were defined in multipath {} sections in /etc/multipath.conf. The aliases used in these blocks had upper and lower case names so the udev rules file had become a bit of a mess. Here is a snippet to illustrate:
multipath {
    wwid  09876543210
    alias happydev_lun001
}
multipath {
    wwid  01234567890
    alias Happydev_lun002
}
By default udev creates /dev/dm-[0-9]+ entries in /dev. I prefer to have the friendly names (happydev_lunXXX in this case) added as well so I don’t have to cross reference the two when I’m dealing with storage issues. To ensure that the friendly names in /dev were consistently named I created the following udev rule:
$ cd /etc/udev/rules.d && cat 90-happydevs.rules
ENV{DM_NAME}=="[Hh]appydev*", NAME="%c", OWNER:="bob", GROUP:="ross", MODE:="660", PROGRAM="/usr/local/bin/lc_devname.sh $env{DM_NAME}"
The rule above checks DM_NAME to see if it starts with the string [Hh]appydev. If this check succeeds, udev calls the /usr/local/bin/lc_devname.sh helper script, which in turn uses tr to convert the name passed as an argument to lower case. Here are the contents of the lc_devname.sh script:
$ cat /usr/local/bin/lc_devname.sh
#!/bin/sh
# convert the device-mapper name passed as the first argument to lower case
echo ${1} | tr '[A-Z]' '[a-z]'
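Running the script by hand shows the transformation:
$ /usr/local/bin/lc_devname.sh Happydev_lun002
happydev_lun002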
The STDOUT from lc_devname.sh gets assigned to the %c variable which I reference in the NAME key to create a friendly named device in /dev. We can run udevadm test to make sure this works as expected:
$ udevadm test /block/dm-2 2>&1 | grep DEVNAME
udevadm_test: DEVNAME=/dev/happydev_lun001
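Once the test output looks right, the rule can be made active without a reboot by reloading the rules and re-triggering the block devices (flag names may vary slightly between udev versions):
$ udevadm control --reload-rules
$ udevadm trigger --subsystem-match=block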
In the spirit of Bob Ross I udev’ed some pretty little block devices. :)
I've been on the docker train for quite some time. While the benefits of running production workloads in containers are well known, I find docker just as valuable for evaluating and testing new software on my laptop. I'll use this blog post to walk through how I build transient test environments for software evaluation.
Docker is based around images (Fedora, CentOS, Ubuntu, etc.), and these images can be created and customized through the use of a Dockerfile. The Dockerfile contains statements that control the OS that is used, the software that is installed and the post-install configuration. Here is a Dockerfile I like to use for building test environments:
$ cat Dockerfile
FROM centos:7
MAINTAINER Matty
RUN yum -y update
RUN yum -y install openssh-server openldap-servers openldap-clients openldap
RUN sed -i 's/PermitRootLogin without-password/PermitRootLogin yes/' /etc/ssh/sshd_config
RUN echo 'root:XXXXXXXX' | chpasswd
RUN /usr/bin/ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -C '' -N ''
RUN /usr/bin/ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key -C '' -N ''
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
To create an image from this Dockerfile you can use docker build:
$ docker build -t centos:7 .
The “-t” option assigns a tag to the image which can be referenced when a new container is instantiated. To view the new image you can run docker images:
$ docker images centos
REPOSITORY TAG IMAGE ID CREATED SIZE
centos 7 4f798f95cfe1 8 minutes ago 414.8 MB
docker.io/centos 6 f07f6ca555a5 3 weeks ago 194.6 MB
docker.io/centos 7 980e0e4c79ec 3 weeks ago 196.7 MB
docker.io/centos latest 980e0e4c79ec 3 weeks ago 196.7 MB
Now to have some fun! To create a new container we can use docker run:
$ docker run -d -P -h foo --name foo --publish 2222:22 centos:7
f84477722896b2701506ee65a3f5a909199675a9cd591f3591e906a8795eba5c
This instantiates a new CentOS container with the name (--name) foo and the hostname (-h) foo, using the centos:7 image I created earlier. It also maps (--publish) port 22 in the container to port 2222 on my local PC. To access the container you can fire up SSH and connect to port 2222 as root (this is a test container, so /dev/null the hate mail):
$ ssh root@localhost -p 2222
root@localhost's password:
[root@foo ~]#
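If you lose track of which host port a container port was published to (easy to do when -P picks random host ports), docker port should show something like this:
$ docker port foo 22
0.0.0.0:2222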
Now I can install software, configure it, break it and debug issues all in an isolated environment. Once I’m satisfied with my testing I can stop the container and delete it:
$ docker stop foo
$ docker rm foo
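If you're in a hurry, the two steps can be collapsed into one; docker rm -f stops and removes the container in a single command:
$ docker rm -f foo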
I find that running an SSH daemon in my test containers is super valuable. For production I would take Jérôme’s advice and look into other methods for getting into your containers.