Processing files with awk


I have used awk(1) for years to tokenize strings and to extract specific lines from files. To tokenize a string, you can use awk(1)'s field variables when processing a file:

$ awk '{ print $1, $2 }' /etc/services | head -10
#ident "@(#)services
#
#
# Copyright
# All
#
# Network
#
tcpmux 1/tcp
echo 7/tcp
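
To experiment with field splitting without touching a real file, here is a minimal sketch. The sample lines and the variable names (sample, fields, parts) are made up for illustration, and split() is used to break the second field apart on the slash:

```shell
# Hypothetical sample lines standing in for /etc/services.
sample='ssh 22/tcp # Secure Shell
smtp 25/tcp mail'

# Default splitting on runs of blanks: print the first two fields.
fields=$(printf '%s\n' "$sample" | awk '{ print $1, $2 }')
printf '%s\n' "$fields"

# Break the second field into port and protocol with split().
parts=$(printf '%s\n' "$sample" | awk '{ split($2, a, "/"); print $1, a[1], a[2] }')
printf '%s\n' "$parts"
```

The first command should print the service name and port/protocol pair for each line, and the second should print the name, port, and protocol as three separate words.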

When awk(1) processes the file /etc/services, each line is split into fields based on the value of the FS variable (runs of blanks by default), and the fields are stored in $1 … $NF. The awk(1) print statement then prints each of the values passed to it as arguments. You can also process ranges of lines by invoking awk(1) with two patterns separated by a comma:

$ awk ' /ssh/ , /smtp/ { print $0 }' /etc/services

ssh 22/tcp # Secure Shell
telnet 23/tcp
smtp 25/tcp mail
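
A range starts at the first line matching the first pattern and runs through the next line matching the second one, inclusive. Since an unanchored pattern such as /ssh/ matches anywhere on a line, anchoring with ^ can make the range more predictable. A small sketch, assuming made-up sample data and variable names:

```shell
# Hypothetical sample lines standing in for /etc/services.
sample='ftp 21/tcp
ssh 22/tcp # Secure Shell
telnet 23/tcp
smtp 25/tcp mail
time 37/tcp timserver'

# Print from the line starting with "ssh" through the line starting with "smtp".
range=$(printf '%s\n' "$sample" | awk '/^ssh/, /^smtp/ { print $0 }')
printf '%s\n' "$range"
```

Only the ssh, telnet, and smtp lines should come out; the ftp and time lines fall outside the range.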

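If you need to grab all lines from a starting point to the end of the file, you can give the range an end pattern that never matches. awk(1) has no EOF keyword; /EOF/ is an ordinary regular expression, and as long as no line contains the string EOF, the range stays open through the last line: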

$ awk ' /dtspc/ , /EOF/ { print $0 }' /etc/services

dtspc 6112/tcp # CDE subprocess control
fs 7100/tcp # Font server
apocd 38900/udp
snmpd 161/udp snmp # SMA snmp daemon
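
If the file might contain the string EOF somewhere, two alternatives avoid relying on a regular expression that happens not to match. The sketch below uses made-up sample data and variable names; the first form uses the constant 0 as an end pattern that is always false, and the second sets a flag once the start line is seen:

```shell
# Hypothetical sample lines standing in for /etc/services.
sample='time 37/tcp timserver
dtspc 6112/tcp # CDE subprocess control
fs 7100/tcp # Font server'

# Range whose end pattern is the constant 0, which is never true.
tail_a=$(printf '%s\n' "$sample" | awk '/^dtspc/, 0')

# Flag variant: set found at the start line, then print while it is set.
tail_b=$(printf '%s\n' "$sample" | awk '/^dtspc/ { found = 1 } found')

printf '%s\n' "$tail_a"
```

Both forms rely on awk(1) printing the line by default when a pattern has no action, and both should print the dtspc line and everything after it.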

I find myself using awk(1) pretty regularly, and really dig some of the cool stuff that is built-in!

This article was posted by Matty on 2005-10-15 12:07:00 -0400