-
For small structured text files, it is suggested that you use the q command to manipulate it.
For complicated logic, it is suggested that you use a scripting language (e.g., Python) instead. I personally discourage using of
awk
unless you have a large file (that q cannot handle) and the operations you want do are simple. -
Basic syntax of
awk
awk 'BEGIN {start_action} {action} END {stop_action}' file_name
-
Whether to user single or double quote depends on whether you use column variables in the expression. This is consistent with shell variable substitution.
-
awk
ignorecase when working on files make unnecessary redundant output very annoying, not sure why -
awk
does not recognize escaped characters in CSV formatted. Make sure that the fileawk
works on is in simple format.
Field Delimiter
-
The delimiter must be quoted. For example, if the field delimiter is tab, you must use
awk -F'\t'
rather thanawk -F\t
. -
The filed delimiter of AWK supports can be a regular expression.
awk -F'[/=]' '{print $3 "\t" $5 "\t" $8}' file_name
Column/Field Filtering/Manipulation
-
Select 1st and 3rd column (seprated by tab)
awk '{print $1 "\t" $3}' file_name
-
Sum of the 5th filed.
awk 'BEGIN {s=0} {s=s+$5} END {print s}' file_name
Rows Filtering/Manipulation
-
Print rows of the file with the first field greater than 3.
awk '{ if($1 > 3) print }' file_name
-
Print Docker image IDs that has no repositories names.
docker images | awk '{ if ($1 == "<none>") print $3 }'
-
Print Docker image IDs whose name contains
che
using regular expression match.docker images | awk '{if ($1 ~ "che") print $3}'
-
Print rows with 2 fileds.
awk 'NF == 2' file_name
Or more verbosally (and more portable)
awk 'NF == 2 {print} {}' file_name
-
Count the number of fields in each line.
awk '{print NF}' file_name
References
https://stackoverflow.com/questions/15386632/awk-4th-column-everything-matching-wildcard-before-the