awk(1) cheatsheet by Ypnose
The following page matches nawk only (it might work with other variants).
Builtin variables
ARGC     | Number of arguments on command line
ARGV     | An array containing the command-line arguments, indexed from 0 to ARGC - 1
CONVFMT  | String conversion format for numbers (%.6g) (POSIX)
ENVIRON  | An associative array of environment variables
FILENAME | Current filename
FNR      | Like NR, but relative to the current file
FS       | Field separator (a space)
NF       | Number of fields in current record
NR       | Number of the current record
OFMT     | Output format for numbers (%.6g)
OFS      | Output field separator (a space)
ORS      | Output record separator (a newline)
RLENGTH  | Length of the string matched by match() function
RS       | Record separator (a newline)
RSTART   | First position in the string matched by match() function
SUBSEP   | Separator character for array subscripts (\034)
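A minimal sketch of how a few of these variables work together. The sample input and the shell variable out are made up for illustration.

```shell
out=$(printf 'root:x:0\nbin:x:1\n' | awk '
BEGIN { FS = ":"; OFS = " -> " }    # split input on ":", join output with " -> "
{ print NR, NF, $1 }                # record number, field count, first field
END { print "records", NR }         # in END, NR holds the total number of records
')
printf '%s\n' "$out"
```

Each input line should print as "record number -> field count -> first field", followed by a final "records -> 2" line.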
printf() and sprintf() formats
- | Left justify
c | ASCII character
d | Decimal integer
i | Decimal integer. Added in POSIX
e | Floating-point format ([-]d.precisione[+-]dd)
E | Floating-point format ([-]d.precisionE[+-]dd)
f | Floating-point format ([-]ddd.precision)
g | e or f conversion, whichever is shortest, with trailing zeros removed
G | E or f conversion, whichever is shortest, with trailing zeros removed
o | Unsigned octal value
s | String
x | Unsigned hexadecimal number. Uses a-f for 10 to 15
X | Unsigned hexadecimal number. Uses A-F for 10 to 15
% | Literal %
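A short sketch exercising several of the conversions above. The values are arbitrary and the shell variable out is only there for illustration.

```shell
out=$(awk 'BEGIN {
    printf("%d|%o|%x|%X\n", 255, 255, 255, 255)            # decimal, octal, hex (a-f), hex (A-F)
    printf("%e|%.2f|%g\n", 1234.5, 1234.5, 1234.5)         # exponent, fixed point, shortest form
    printf("[%-6s]|[%6s]|%c|100%%\n", "ab", "ab", "awk")   # left and right justify, first character, literal %
}')
printf '%s\n' "$out"
```

With a string argument, %c prints only its first character.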
Commands and functions
break
Exit from a loop (for or while).
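A possible illustration of break: scan the fields of a line and stop at the first value above 10. The input values and the shell variable out are assumptions for the example.

```shell
out=$(echo "3 8 15 4 23" | awk '{
    for (i = 1; i <= NF; i++)
        if ($i > 10) {
            print "first value above 10:", $i
            break               # stop scanning the remaining fields
        }
}')
printf '%s\n' "$out"
```

Only 15 is reported; 23 is never examined because the loop has already ended.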
continue
Jump to the next iteration in a loop (for or while).
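A small sketch of continue: print only the even fields of a line. The input and the shell variable out are made up for the example.

```shell
out=$(echo "1 2 3 4 5 6" | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i % 2)
            continue            # odd value: skip to the next iteration
        print $i
    }
}')
printf '%s\n' "$out"
```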
exit
Exit from the awk script. An exit status can be specified. If an END rule exists, it is executed before awk terminates (unless exit is called from within END itself).
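A sketch of exit used together with an END rule. The awk variable status and the shell variables out and rc are names chosen for this example, and the value 3 is arbitrary.

```shell
rc=0
out=$(printf 'ok\nok\nfail\nok\n' | awk '
/fail/ { status = 3; exit }     # stop reading input; the END rule still runs
END {
    print "lines read:", NR
    exit status                 # pass the saved status back to the shell
}') || rc=$?
printf '%s\nawk exit status: %d\n' "$out" "$rc"
```

The fourth line is never read: awk stops at the third record, runs END, then exits with status 3.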
for
C-style loop. The first expression (i=1) sets the initial value, the second is a condition evaluated before each iteration, and the last (i++) increments (or decrements) the variable i after each iteration.
echo "/usr/bin/,/usr/sbin/,/usr/local/bin/,/bin/,/sbin/" | awk '{
c = split($0,path,",")
for (i=1; i<=c; i++)
print path[i]
}'
The for (i in array) form is very practical for arrays, but it must be used carefully: the iteration order is unspecified and depends on the awk variant.
echo "foo bro unix bsd solaris" | awk '{
split($0,word," ")
for (i in word)
printf("Array number: %d --> %s\n", i, word[i])
}'
With nawk (the first array item is often in the last position)
Array number: 2 --> bro
Array number: 3 --> unix
Array number: 4 --> bsd
Array number: 5 --> solaris
Array number: 1 --> foo
With gawk
Array number: 1 --> foo
Array number: 2 --> bro
Array number: 3 --> unix
Array number: 4 --> bsd
Array number: 5 --> solaris
function
Define a function of your own. Its statements operate on the parameters passed by the caller.
printf "Karen\nTom\nCharles\nJohn\nJuliet" | awk '
function who(PERSON,    w) {   # w is an extra parameter, used as a local variable
if (PERSON ~ /^(John|Karen)/)
w = "adult"
else
w = "child"
printf("%s is: %s\n", PERSON, w)
}
{
who($0)
}'
getline
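Read the next record of input into $0, or into a variable if one is given. It returns 1 on success, 0 at end of input and -1 on error.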
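A minimal sketch of getline with a variable: when a line is exactly "name", the following line is read into value. The input layout and the names value and out are assumptions for the example.

```shell
out=$(printf 'name\nJohn\nname\nElen\n' | awk '
$0 == "name" {
    if ((getline value) > 0)    # read the following record into value
        print "name is", value
}')
printf '%s\n' "$out"
```

The parentheses around getline value keep the comparison with 0 unambiguous.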
gsub
Replace all occurrences of foo with bar. If the target string (here line) is not specified, it defaults to $0. The return value is the number of replacements made.
echo "foo foo foo foo foo" | awk '{
line = $0
gsub(/foo/,"bar",line)
print line
}'
if
Run statements depending on whether a condition is true. The else if and else branches are optional.
printf "coco\nbar\nfoocow\n" | awk '{
if ($0 ~ /^foo/)
printf("%15s: BAD\n", $0)
else if ($0 == "bar")
printf("%-10s: OK!\n", $0)
else
printf("the line is <%s>\n", $0)
}'
length
Return the length of the string, or the length of $0 if unspecified.
echo "AveryLONGstringtoTestAWK" | awk '{
size = length($0)
print size
}'
match
Search a string for a regex. If a match is found, it sets two variables, RSTART and RLENGTH, to the start position and the length of the matched text (otherwise RSTART is 0 and RLENGTH is -1). It is often combined with substr.
echo "The path is: PATH=/usr/local/sbin:/usr/local/bin: DONE" | awk '{
match($0,/PATH=.*:/)
strt = substr($0,1,RSTART-1)
path = substr($0,RSTART,RLENGTH)
end = substr($0,RSTART+RLENGTH+1)
printf("%s\n%s\n%s\n", strt, path, end)
}'
next
Stop processing the current record, read the next one and start again from the first rule.
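A sketch of next used to skip comments and blank lines before the main rule runs. The input and the shell variable out are made up for the example.

```shell
out=$(printf '# comment\nfoo\n\nbar\n' | awk '
/^#/ || NF == 0 { next }        # comment or blank line: skip the rules below
{ print NR ": " $0 }
')
printf '%s\n' "$out"
```

NR still counts the skipped records, so foo and bar are reported as records 2 and 4.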
print
Print variables or expressions. Items separated by a comma are joined with the OFS value. The output ends with ORS.
echo "John 28yo Elen 25yo" | awk '{
print $1, $3
print $2, $4
}'
echo "John 28yo Elen 25yo" | awk 'BEGIN { OFS = "|"; ORS = "\n\n===\n\n" } {
print $1, $3
print $2, $4
}'
printf
Print formatted output, as printf does in C.
echo "Super Man 666" | awk '{
printf("%-18s %s %d\n", $1, $2, $3)
}'
return
Exit a user-defined function. A return value can be specified.
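A small sketch of return. The function name max is hypothetical and chosen for this example; the + 0 forces a numeric comparison whatever the awk variant.

```shell
out=$(echo "7 12" | awk '
function max(a, b) {
    if (a > b)
        return a                # leave the function early with a value
    return b
}
{ print "max:", max($1 + 0, $2 + 0) }')
printf '%s\n' "$out"
```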
split
Split a string into an array of elements and return the number of elements. The string is split at each occurrence of the separator, or of FS if no separator is given.
echo "The/night/is/scary" | awk '{
split($0,feel,"/")
printf("%s-%s-%s-%s\n", feel[1], feel[2], feel[3], feel[4])
}'
sprintf
Format data like printf, but return the resulting string instead of printing it. The string can be assigned to a variable.
echo "word1 word2 word3" | awk '{
str = sprintf("%-10s|%10s -> %s", $1, $2, $3)
print str
}'
sub
Same as gsub, except it applies only to the first match.
echo "foo foo foo" | awk '{
sub(/foo/,"bar")
print
}'
substr
Extract a substring of the string $0, starting at position 7 and with a length of 13.
echo "Fact: Ypnose rocks! Yeah." | awk '{
str = substr($0,7,13)
print str
}'
system
Run the specified command and return only its exit status (the output of the command is not captured). The command is a string, so use quotes!
echo "return code was:" | awk '{
cmd1 = system("true")
cmd2 = system("false")
printf("%s %d\n%s %d\n", $0, cmd1, $0, cmd2)
}'
tolower
Convert all uppercase characters to lowercase. Despite what people think, it is POSIX.
echo "QWERTYUIOP" | awk '{
print tolower($0)
}'
toupper
Convert all lowercase characters to uppercase. Despite what people think, it is POSIX.
echo "qwertyuiop" | awk '{
print toupper($0)
}'
while
While condition is true, run the statements.
echo "cheers:" | awk '{
i=1
while (i <= 4) {
print $0, i
i++
}
}'