awk(1) cheatsheet by Ypnose

This page targets nawk; most of it should also work with other awk variants.

Builtin variables

      ARGC  |  Number of arguments on command line
      ARGV  |  An array containing the command-line arguments, indexed from 0 to ARGC - 1
   CONVFMT  |  String conversion format for numbers (%.6g) (POSIX)
   ENVIRON  |  An associative array of environment variables
  FILENAME  |  Current filename
       FNR  |  Like NR, but relative to the current file
        FS  |  Input field separator (a space)
        NF  |  Number of fields in current record
        NR  |  Number of the current record
      OFMT  |  Output format for numbers (%.6g)
       OFS  |  Output field separator (a space)
       ORS  |  Output record separator (a newline)
   RLENGTH  |  Length of the string matched by match() function
        RS  |  Record separator (a newline)
    RSTART  |  First position in the string matched by match() function
    SUBSEP  |  Separator character for array subscripts (\034)
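
A minimal sketch of a few of these variables working together, assuming a small colon-separated input invented for the illustration. FS and OFS are set in BEGIN, then NR, NF and $NF are printed for each record. The shell variable out is only there to hold the result.

```shell
out=$(printf "root:x:0\ndaemon:x:1:extra\n" | awk 'BEGIN { FS = ":"; OFS = " -> " } {
	# NR: record number, NF: number of fields, $NF: last field
	print NR, NF, $1, $NF
}')
echo "$out"
```

Each output line holds the record number, the field count, the first field and the last field, joined by the OFS value.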

printf() and sprintf() formats

  -  |  Left justify
  c  |  ASCII character
  d  |  Decimal integer
  i  |  Decimal integer. Added in POSIX
  e  |  Floating-point format ([-]d.precisione[+-]dd)
  E  |  Floating-point format ([-]d.precisionE[+-]dd)
  f  |  Floating-point format ([-]ddd.precision)
  g  |  e or f conversion, whichever is shorter, with trailing zeros removed
  G  |  E or f conversion, whichever is shorter, with trailing zeros removed
  o  |  Unsigned octal value
  s  |  String
  x  |  Unsigned hexadecimal number. Uses a-f for 10 to 15
  X  |  Unsigned hexadecimal number. Uses A-F for 10 to 15
  %  |  Literal %
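
A short sketch exercising several of these conversions on one input line. The sample values are arbitrary, and the brackets are only there to make the padding visible.

```shell
out=$(echo "255 3.14159 awk" | awk '{
	# integer conversions: decimal, octal, lowercase and uppercase hexadecimal
	printf("[%d] [%o] [%x] [%X]\n", $1, $1, $1, $1)
	# floating-point conversions, with and without an explicit precision
	printf("[%.2f] [%e] [%g]\n", $2, $2, $2)
	# left and right justified strings, first character, literal percent sign
	printf("[%-6s] [%6s] [%c] [100%%]\n", $3, $3, $3)
}')
echo "$out"
```

With a string argument, %c prints only its first character.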

Commands and functions

break

Exit from a loop (for or while).
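
A minimal sketch, assuming a line of numbers as input: the loop prints fields until it meets a negative value, then break leaves the loop and the script continues with the statement after it.

```shell
out=$(echo "3 8 -1 5 9" | awk '{
	for (i = 1; i <= NF; i++) {
		if ($i < 0)
			break	# leave the loop at the first negative field
		print $i
	}
	# i still holds the index at which the loop stopped
	print "stopped at field", i
}')
echo "$out"
```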

continue

Jump to the next iteration in a loop (for or while).
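
A minimal sketch, assuming a line of integers as input: odd values are skipped with continue, so only the even ones are printed.

```shell
out=$(echo "1 2 3 4 5 6" | awk '{
	for (i = 1; i <= NF; i++) {
		if ($i % 2)
			continue	# odd value: go straight to the next iteration
		print $i
	}
}')
echo "$out"
```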

exit

Exit from the awk script. An exit status can be specified. If an END rule exists, it is executed before awk terminates (unless exit is called from the END rule itself).
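
A minimal sketch, assuming that a line containing "error" should stop the processing. The variable names status and count are arbitrary. The END rule still runs after exit, and it sets the final exit status, which the shell collects in rc.

```shell
rc=0
out=$(printf "ok\nok\nerror\nok\n" | awk '
	/error/ { status = 1; exit }	# stop reading input, jump to END
	{ count++ }
	END { print count, "line(s) read before stopping"; exit status }
') || rc=$?
echo "$out"
echo "exit status: $rc"
```

Only the two lines before "error" are counted, and awk exits with status 1.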

for

C-style loop. The first expression (i=1) sets the initial value, the second (i<=c) is the condition tested before each iteration, and the third (i++) increments (or decrements) the variable i after each iteration.

echo "/usr/bin/,/usr/sbin/,/usr/local/bin/,/bin/,/sbin/" | awk '{
	c = split($0,path,",")
	for (i=1; i<=c; i++)
		print path[i]
}'

The for (i in array) form is very practical for arrays, but it must be used carefully: the iteration order is unspecified and depends on the awk variant.

echo "foo bro unix bsd solaris" | awk '{
	split($0,word," ")
	for (i in word)
		printf("Array number: %d --> %s\n", i, word[i])
}'

With nawk (the first array item is often in the last position)

Array number: 2 --> bro
Array number: 3 --> unix
Array number: 4 --> bsd
Array number: 5 --> solaris
Array number: 1 --> foo

With gawk

Array number: 1 --> foo
Array number: 2 --> bro
Array number: 3 --> unix
Array number: 4 --> bsd
Array number: 5 --> solaris

function

Define a function that applies statements to its parameters.

printf "Karen\nTom\nCharles\nJohn\nJuliet\n" | awk '
	# w is declared as an extra parameter so that it stays local to who()
	function who(PERSON,    w) {
		if (PERSON ~ /^(John|Karen)/)
			w = "adult"
		else
			w = "child"
		printf("%s is: %s\n", PERSON, w)
	}
	{
		who($0)
	}'

getline

Read next line of input.
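
A minimal sketch of two common forms, assuming an input where keys and values alternate line by line. "getline value" reads the next input record into the variable value, and "cmd | getline first" reads one line from the output of a command. The names key, value, cmd and first are arbitrary.

```shell
out=$(printf "name\nJohn\nage\n28\n" | awk '
	BEGIN {
		# read one line from the output of a command, then close the pipe
		cmd = "echo header"
		cmd | getline first
		close(cmd)
		print first
	}
	{
		key = $0
		# getline returns 1 on success, 0 at end of input, -1 on error
		if ((getline value) > 0)
			print key "=" value
	}')
echo "$out"
```

Without an argument, getline replaces $0 (and updates NF, NR and FNR) instead of filling a variable.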

gsub

Replace all occurrences of foo with bar. If the target string (here line) is not specified, it defaults to $0. It returns the number of replacements made.

echo "foo foo foo foo foo" | awk '{
	line = $0
	gsub(/foo/,"bar",line)
	print line
}'

if

Run different statements depending on whether a condition is true or false.

printf "coco\nbar\nfoocow\n" | awk '{
	if ($0 ~ /^foo/)
		printf("%15s: BAD\n", $0)
	else if ($0 == "bar")
		printf("%-10s: OK!\n", $0)
	else
		printf("the line is <%s>\n", $0)
}'

length

Return the length of the string, or the length of $0 if no argument is given.

echo "AveryLONGstringtoTestAWK" | awk '{
	size = length($0)
	print size
}'

match

Search the string for the specified regex. If a match is found, match() returns its position and sets two variables, RSTART and RLENGTH, to the start and the length of the matched substring. Otherwise it returns 0, RSTART is 0 and RLENGTH is -1. It is often combined with substr.

echo "The path is: PATH=/usr/local/sbin:/usr/local/bin: DONE" | awk '{
	match($0,/PATH=.*:/)
	strt = substr($0,1,RSTART-1)
	path = substr($0,RSTART,RLENGTH)
	end = substr($0,RSTART+RLENGTH+1)
	printf("%s\n%s\n%s\n", strt, path, end)
}'

next

Stop processing the current record, read the next one and restart from the first rule.
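
A minimal sketch, assuming that lines starting with # and empty lines should be ignored: next skips them before the second rule is reached, while NR keeps counting every record.

```shell
out=$(printf "# comment\nfoo\n\nbar\n" | awk '
	/^#/ || /^$/ { next }	# skip comments and empty lines
	{ print NR ": " $0 }
')
echo "$out"
```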

print

Print expressions (variables, fields, strings). Items separated by commas are printed with the OFS value between them. The output ends with ORS.

echo "John 28yo Elen 25yo" | awk '{
	print $1, $3
	print $2, $4
}'

echo "John 28yo Elen 25yo" | awk 'BEGIN { OFS = "|"; ORS = "\n\n===\n\n" } {
	print $1, $3
	print $2, $4
}'

printf

Print formatted output, like printf() in C. ORS is not appended, so the newline must be written explicitly.

echo "Super Man 666" | awk '{
	printf("%-18s %s %d\n", $1, $2, $3)
}'

return

Exit from a user-defined function. A return value can be specified.
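
A minimal sketch, using a hypothetical function max() that is not part of awk: return hands a value back to the caller and ends the function immediately.

```shell
out=$(echo "3 9 4" | awk '
	function max(a, b) {
		if (a > b)
			return a	# the function stops here when a is larger
		return b
	}
	{
		# adding 0 forces a numeric comparison
		m = $1 + 0
		for (i = 2; i <= NF; i++)
			m = max(m, $i + 0)
		print "max:", m
	}')
echo "$out"
```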

split

Split a string into an array of elements. The string is split at each occurrence of the separator, or of FS if no separator is given. It returns the number of elements.

echo "The/night/is/scary" | awk '{
	split($0,feel,"/")
	printf("%s-%s-%s-%s\n", feel[1], feel[2], feel[3], feel[4])
}'

sprintf

Format a string like printf, but return the result instead of printing it. The result can be assigned to a variable.

echo "word1 word2 word3" | awk '{
	str = sprintf("%-10s|%10s -> %s", $1, $2, $3)
	print str
}'

sub

Same as gsub, except it applies only to the first match.

echo "foo foo foo" | awk '{
	sub(/foo/,"bar")
	print
}'

substr

Extract a substring. Here, the substring of $0 starting at position 7 (positions start at 1) with a length of 13. If the length is omitted, the substring runs to the end of the string.

echo "Fact: Ypnose rocks! Yeah." | awk '{
	str = substr($0,7,13)
	print str
}'

system

Run the specified command and return only its exit status; the output of the command is not captured. The command must be a quoted string.

echo "return code was:" | awk '{
	cmd1 = system("true")
	cmd2 = system("false")
	printf("%s %d\n%s %d\n", $0, cmd1, $0, cmd2)
}'

tolower

Translate all uppercase characters to lowercase. Despite what people think, it is POSIX.

echo "QWERTYUIOP" | awk '{
	print tolower($0)
}'

toupper

Translate all lowercase characters to uppercase. Despite what people think, it is POSIX.

echo "qwertyuiop" | awk '{
	print toupper($0)
}'

while

While condition is true, run the statements.

echo "cheers:" | awk '{
	i=1
	while (i <= 4) {
		print $0, i
		i++
	}
}'