Using awk functions


I am a huge fan of awk, and I find myself constantly using it to parse simple data streams. awk includes numerous string functions, which are an invaluable resource for awk script developers. To illustrate a few of them, I created a single-line text file containing the string “string of string”:

$ cat text
string of string

To get the length of each line in the file text, the length() function can be used:

$ awk '{ i = length($0); print i }' text
16
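
length() works on any string, not just the whole line. As a rough sketch (the function name field_lengths is my own, not something awk provides), this prints the length of the line followed by the length of each whitespace-separated field:

```shell
# field_lengths is a hypothetical helper name used only for this example.
field_lengths() {
    awk '{
        print "line:", length($0)        # length of the whole record
        for (i = 1; i <= NF; i++)        # NF is the number of fields
            print $i ":", length($i)     # length of each field
    }'
}

printf 'string of string\n' | field_lengths
```

For the sample line this should report 16 for the line, then 6, 2 and 6 for the three words.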

To see if the string “ing” is present in the file text, the index() function can be used:

$ awk '{ i = index($0,"ing"); print i }' text
4

index() returns the position of the first occurrence of “ing” (positions start at 1, and 0 means the string was not found), which can then be used to drive further string processing. To retrieve a range of characters from a string, a starting position and a length can be passed to the awk substr() function:

$ awk '{ i = substr($0,5,10); print i }' text
ng of stri
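
index() and substr() combine nicely. Here is a minimal sketch, assuming you want everything that follows the first occurrence of some marker string; the helper name after_match and the "no match" message are my own inventions for illustration:

```shell
# after_match is a hypothetical helper: print what follows the first
# occurrence of the string given as $1 on each input line.
after_match() {
    awk -v pat="$1" '{
        pos = index($0, pat)                      # 0 when pat is absent
        if (pos > 0)
            print substr($0, pos + length(pat))   # no length: rest of line
        else
            print "no match"
    }'
}

printf 'string of string\n' | after_match "of "
```

Leaving off the third argument to substr() returns everything from the starting position to the end of the string. Note that awk interprets backslash escapes in values passed with -v, so this sketch is best kept to plain marker strings.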

And finally, to tokenize a string (split it into word-sized pieces), the split() function can be used. It fills an array with the pieces and returns how many there are:

$ awk '{ n = split($0,a," "); for (i = 1; i <= n; i++) print a[i] }' text
string
of
string
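
split() is not limited to whitespace. As a sketch, assuming colon-separated input such as a passwd-style record (the helper name split_fields is hypothetical), the separator can be passed in as the third argument:

```shell
# split_fields is a hypothetical helper: split each line on the
# separator given as $1 and print the numbered pieces.
split_fields() {
    awk -v sep="$1" '{
        n = split($0, parts, sep)      # n is the number of pieces
        for (i = 1; i <= n; i++)
            print i ": " parts[i]
    }'
}

printf 'root:x:0:0\n' | split_fields ":"
```

A separator of a single space is special: it splits on runs of blanks, the same way awk splits fields by default. Separators longer than one character are treated as regular expressions.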

I dig awk!

This article was posted by Matty on 2005-11-16 23:36:00 -0400