Which file descriptor (STDOUT, STDERR, etc.) is my application writing to?

When developing ansible playbooks, a common pattern is to run a command and use its output in a later task. Here is a simple example:

---
- hosts: localhost
  connection: local
  tasks:
    - name: Check if mlocate is installed
      command: dnf info mlocate
      register: mlocate_output

    - name: Update the locate database
      command: updatedb
      when: '"No matching Packages to list" in mlocate_output.stderr'

In the first task dnf will run and the output from the command will be written to either STDOUT or STDERR. But how do you know which one? One way is to add a debug task to your playbook:

---
- hosts: localhost
  connection: local
  tasks:
    - name: Check if mlocate is installed
      command: dnf info mlocate
      register: mlocate_output

    - name: Print the contents of mlocate_output
      debug:
        var: mlocate_output

Once the task runs you can view the stderr and stdout fields to see which of the two is populated:

TASK [Print the contents of mlocate_output] ************************************************************************
ok: [localhost] => {
    "mlocate_output": {
        "changed": true, 
        "cmd": [
            "dnf", 
            "info", 
            "mlocate"
        ], 
        "delta": "0:00:31.239145", 
        "end": "2017-09-27 16:39:46.919038", 
        "rc": 0, 
        "start": "2017-09-27 16:39:15.679893", 
        "stderr": "", 
        "stderr_lines": [], 
        "stdout": "Last metadata expiration check: 0:43:16 ago on Wed 27 Sep 2017 03:56:05 PM EDT.\nInstalled Packages\nName         : mlocate\nVersion      : 0.26\nRelease      : 16.fc26\nArch         : armv7hl\nSize         : 366 k\nSource       : mlocate-0.26-16.fc26.src.rpm\nRepo         : @System\nFrom repo    : fedora\nSummary      : An utility for finding files by name\nURL          : https://fedorahosted.org/mlocate/\nLicense      : GPLv2\nDescription  : mlocate is a locate/updatedb implementation.  It keeps a database\n             : of all existing files and allows you to lookup files by name.\n             : \n             : The 'm' stands for \"merging\": updatedb reuses the existing\n             : database to avoid rereading most of the file system, which makes\n             : updatedb faster and does not trash the system caches as much as\n             : traditional locate implementations.", 
.....

In the output above we can see that stderr is empty and stdout contains the output from the command. While this works fine it requires you to write a playbook and wait for it to run to get feedback. Strace can provide the same information and in most cases is much quicker. To get the same information we can pass the command as an argument to strace and limit the output to just write(2) system calls:

$ strace -yy -s 8192 -e trace=write dnf info mlocate
.....
write(1, "Description  : mlocate is a locate/updatedb implementation.  It keeps a database of\n             : all existing files and allows you to lookup files by name.\n             : \n             : The 'm' stands for \"merging\": updatedb reuses the existing database to avoid\n             : rereading most of the file system, which makes updatedb faster and does not\n             : trash the system caches as much as traditional locate implementations.", 442Description  : mlocate is a locate/updatedb implementation.  It keeps a database of
.....

The first argument to write(2) is the file descriptor being written to. In this case that's 1, which is STDOUT. This took less than 2 seconds to run and by observing the first argument to write you know which file descriptor the application is writing to.
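If strace isn't available, a quick (if less precise) check is to split the two streams with shell redirection and see which file ends up non-empty. A minimal sketch, using ls on a missing path as a stand-in for the command under test:

```shell
# Split stdout and stderr into separate files, then inspect their sizes.
# `ls /no/such/path` is just a stand-in command that writes to STDERR.
out_file=$(mktemp)
err_file=$(mktemp)
ls /no/such/path > "${out_file}" 2> "${err_file}"
echo "stdout bytes: $(wc -c < "${out_file}")"
echo "stderr bytes: $(wc -c < "${err_file}")"
rm -f "${out_file}" "${err_file}"
```

Whichever file has a non-zero size tells you which stream the command used.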

Using xargs and lscpu to spawn one process per CPU core

One of my friends reached out to me earlier this week to ask if there was an easy way to run multiple Linux processes in parallel. There are several ways to approach this problem but most of them don't take into account hardware cores and threads. My preferred solution for CPU intensive operations is to use the xargs parallel option ("-P") along with the CPU count listed by lscpu. This allows me to run one process per core which is ideal for CPU intensive applications. But enough talk, let's see an example.

Let's say you need to compress a directory full of log files and want to run one compression job per CPU. To find the CPU count you can combine lscpu and grep (note that on hyper-threaded systems this counts online logical CPUs, i.e. threads, rather than physical cores):

$ CPU_CORES=`lscpu -p=CORE,ONLINE | grep -c 'Y'`

To generate a list of files we can run find and pass the output of that to xargs:

$ find . -type f -name \*.log | xargs -n1 -P${CPU_CORES} bzip2

The xargs command listed above will create one bzip2 process per core and pass it a log file to process. To monitor the pipeline to make sure it is working as intended we can run a simple while loop:

$ while :; do ps auxwww | grep [b]zip; sleep 1; done

matty    14322  0.0  0.0 113968  1228 pts/0    S+   07:24   0:00 xargs -n1 -P4 bzip2
matty    14323 95.0  0.0  13748  7624 pts/0    R+   07:24   0:11 bzip2 ./log10.txt.log
matty    14324 95.9  0.0  13748  7616 pts/0    R+   07:24   0:11 bzip2 ./log2.txt.log
matty    14325 96.0  0.0  13748  7664 pts/0    R+   07:24   0:11 bzip2 ./log3.txt.log
matty    14326 94.9  0.0  13748  7632 pts/0    R+   07:24   0:11 bzip2 ./log4.txt.log

There are a number of other useful things you can do with the items listed above but I will leave that to your imagination. Viva la xargs!
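Before kicking off real compression jobs, it can be worth sanity-checking the pipeline with a dry run. Prefixing the command with echo prints the bzip2 invocations instead of executing them. A self-contained sketch (the scratch directory and fake log files are assumptions to make it runnable anywhere):

```shell
# Create a scratch directory with a few fake log files.
workdir=$(mktemp -d)
touch "${workdir}/a.log" "${workdir}/b.log" "${workdir}/c.log"

# Dry run: echo prints each bzip2 command line instead of running it.
find "${workdir}" -type f -name \*.log | xargs -n1 -P4 echo bzip2

rm -rf "${workdir}"
```

Once the printed command lines look right, drop the echo and xargs will run the real thing.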

Using awk character classes to simplify parsing complex strings

This week I was reading a shell script in a github repository to see if it would be a good candidate to automate a task. As I was digging through the code I noticed a lengthy shell pipeline to parse a string similar to this:

Thu Jul 20 18:13:04 EDT 2017 snarble foo bar (gorp): blatch (fmep): gak+

Here is the code she/he was using to extract the string “gorp”:

$ cat /foo/bar.txt | grep "snarble" | awk '{print $10}' | awk -F'(' '{print $2}' | awk -F')' '{print $1}'

After my eyes recovered I thought this would be a good candidate to simplify with awk character classes. These are incredibly useful for applying numerous field separators to a given line of input. I took what the original author had and simplified it to this:

$ awk -F'[()]+' '/snarble/ {print $2}' /foo/bar.txt

The argument passed to the field separator option (-F) contains a list of characters to use as delimiters, and the string between the slashes matches all lines that contain the word snarble. I find the second version a bit easier to read, and character classes are super useful!
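You can see the character class in action without the original log file by feeding the sample line straight to awk (the /foo/bar.txt path above is obviously a placeholder):

```shell
# Runs of '(' and ')' act as field separators, so the text between the
# first pair of parentheses becomes field 2 on matching lines.
echo 'Thu Jul 20 18:13:04 EDT 2017 snarble foo bar (gorp): blatch (fmep): gak+' |
    awk -F'[()]+' '/snarble/ {print $2}'    # prints "gorp"
```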

What are bash return codes > 128?

I like to keep PS1 pretty simple. I want to see the host I'm working on, the directory I'm in, the user I'm currently logged in as and the return code from the last command I executed. This is easy to cobble together with bash escape sequences:

PS1='\n[\u@\h][RC:$?][\w]$ '

The other day I was testing some new ansible playbooks and saw a return code of 130. Several years ago I read Learning the bash Shell and recalled something about the magic number 128. A quick search of the bash man page provided this gem:

“The return value of a simple command is its exit status, or 128+n if the command is terminated by signal n”

So a return code of 130 means the process was terminated by signal 2, which `kill -l` lists as SIGINT:

$ kill -l
 1) SIGHUP	 2) SIGINT	 3) SIGQUIT	 4) SIGILL	 5) SIGTRAP
 6) SIGABRT	 7) SIGBUS	 8) SIGFPE	 9) SIGKILL	10) SIGUSR1
11) SIGSEGV	12) SIGUSR2	13) SIGPIPE	14) SIGALRM	15) SIGTERM
16) SIGSTKFLT	17) SIGCHLD	18) SIGCONT	19) SIGSTOP	20) SIGTSTP
21) SIGTTIN	22) SIGTTOU	23) SIGURG	24) SIGXCPU	25) SIGXFSZ
26) SIGVTALRM	27) SIGPROF	28) SIGWINCH	29) SIGIO	30) SIGPWR
31) SIGSYS	34) SIGRTMIN	35) SIGRTMIN+1	36) SIGRTMIN+2	37) SIGRTMIN+3
38) SIGRTMIN+4	39) SIGRTMIN+5	40) SIGRTMIN+6	41) SIGRTMIN+7	42) SIGRTMIN+8
43) SIGRTMIN+9	44) SIGRTMIN+10	45) SIGRTMIN+11	46) SIGRTMIN+12	47) SIGRTMIN+13
48) SIGRTMIN+14	49) SIGRTMIN+15	50) SIGRTMAX-14	51) SIGRTMAX-13	52) SIGRTMAX-12
53) SIGRTMAX-11	54) SIGRTMAX-10	55) SIGRTMAX-9	56) SIGRTMAX-8	57) SIGRTMAX-7
58) SIGRTMAX-6	59) SIGRTMAX-5	60) SIGRTMAX-4	61) SIGRTMAX-3	62) SIGRTMAX-2
63) SIGRTMAX-1	64) SIGRTMAX	
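The 128+n rule is easy to verify with any fatal signal. SIGTERM is signal 15, so a shell killed by it should exit with 128+15=143. A quick sketch:

```shell
# Start a subshell that immediately sends SIGTERM to itself, then
# inspect the exit status the parent shell observes.
sh -c 'kill -TERM $$'
echo "exit status: $?"    # prints "exit status: 143"
```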

Noting this here so I have it handy for the future.

Having fun in the shell with cowsay and fortune

Last weekend while I was waiting for several ansible playbooks to apply I thought it would be fun to play around with cowsay and fortune. If you aren't familiar with these tools, fortune prints a random epigram and cowsay draws an ASCII cow that says whatever is passed to it. Here is an example:

$ fortune | cowsay
 ________________________________
< Tomorrow, you can be anywhere. >
 --------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

To have a cowtastic time each time I open a shell I added the following to my bashrc:

# Have some fun
if { [ -x /bin/cowsay ] && [ -x /bin/fortune ]; } ||
   { [ -x /usr/games/cowsay ] && [ -x /usr/games/fortune ]; }; then
	   fortune | cowsay
fi

I guess the old saying is right. All work and no play makes an admin mooooooooo. :)

Automatically updating your .bashrc when you log into a server

I am a long time bash user and have found numerous aliases and shell functions that allow me to be more productive at the prompt. Depending on how you manage ${HOME} (configuration management, NFS mounted home directories, etc.), making sure your bashrc gets updated when you find a cool new feature can be a pain. I was tinkering around last weekend and thought about adding a block of code to my bashrc that runs curl to grab the latest version of my bashrc from github. The following short code block works for my needs:

# Location to pull bashrc from
bashrc_source="https://raw.githubusercontent.com/Matty9191/bashrc/master/bashrc"

# Take precaution when playing with temp files
temp_file=$(mktemp /tmp/tmp.XXXXXXXX)

# -f makes curl return a non-zero exit code on HTTP errors (e.g. a 404)
curl -sf -o ${temp_file} ${bashrc_source}
RC=$?

if [ ${RC} -eq 0 ]; then
    version=$(head -1 ${temp_file} | awk -F'=' '/VERSION/ {print $2}')

    if [ "${version}" -gt "${VERSION}" ]; then
        echo "Upgrading bashrc from version ${VERSION} to ${version}"
        cp ${HOME}/.bashrc ${HOME}/.bashrc.bak.$(/bin/date "+%m%d%Y.%S")
        mv ${temp_file} ${HOME}/.bashrc
    fi
else
    echo "Unable to retrieve bashrc from ${bashrc_source}"
    rm ${temp_file}
fi

If a new version is available (a VERSION variable tracks the release #) I get the following output when I log in:

Upgrading bashrc from version 44 to 46
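The upgrade check assumes the first line of the published bashrc carries a VERSION=n marker, and the extraction step can be tested in isolation (the version number here is made up):

```shell
# Simulate the first line of a downloaded bashrc and pull out the
# version number the same way the snippet above does.
echo 'VERSION=46' | awk -F'=' '/VERSION/ {print $2}'    # prints "46"
```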

If github is unavailable due to a service issue, or a firewall won't let me out, the script will let me know:

Unable to retrieve bashrc from https://raw.githubusercontent.com/Matty9191/bashrc/master/bashrc

How are you keeping your shell profiles up to date? Let me know in the comment section.