This week I was reading a shell script in a GitHub repository to see if it would be a good candidate to automate a task. As I was digging through the code, I noticed a lengthy shell pipeline to parse a string similar to this:
Thu Jul 20 18:13:04 EDT 2017 snarble foo bar (gorp): blatch (fmep): gak+
Here is the code they were using to extract the string “gorp”:
$ cat /foo/bar.txt | grep "snarble" | awk '{print $10}' | awk -F'(' '{print $2}' | awk -F')' '{print $1}'
After my eyes recovered, I thought this would be a good candidate to simplify with awk character classes. These are incredibly useful for splitting a line of input on several different delimiters at once. I took what the original author had and simplified it to this:
$ awk -F'[()]+' '/snarble/ {print $2}' /foo/bar.txt
The argument passed to the field separator option (-F) is a regular expression: the character class [()] lists the characters to use as delimiters, and the trailing + treats a run of them as a single separator. The pattern between the slashes restricts the action to lines that contain the word snarble. I find the second version a bit easier to read, and character classes are super useful!
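If you want to see how awk carves up the line, it can help to print every field it produces. Here is a quick sketch that feeds the sample line in from a shell variable instead of /foo/bar.txt; the `sample` and `result` variable names and the field-dumping loop are my own additions for illustration, not part of the original script:

```shell
# Sample line from above, stored in a variable (hypothetical name) for illustration.
sample='Thu Jul 20 18:13:04 EDT 2017 snarble foo bar (gorp): blatch (fmep): gak+'

# Print each field awk produces when runs of ( or ) act as the separator.
printf '%s\n' "$sample" |
    awk -F'[()]+' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'

# Capture only the second field from lines that match snarble.
result=$(printf '%s\n' "$sample" | awk -F'[()]+' '/snarble/ { print $2 }')
echo "$result"
```

The loop shows five fields, with gorp in field 2 and fmep in field 4, so swapping $2 for $4 would pull out the second parenthesized value instead.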