- For small structured text files, use the q command to manipulate them. For complicated logic, use a scripting language (e.g., Python) instead. I personally discourage using awk unless you have a large file (one that q cannot handle) and the operations you want to do are simple.
- Basic syntax of awk:
    awk 'BEGIN {start_action} {action} END {stop_action}' file_name
- Whether to use single or double quotes depends on whether you use column variables in the expression. Inside double quotes the shell expands $1, $2, etc. before awk sees them, so quote the program with single quotes when it refers to columns. This is consistent with shell variable substitution.
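- Using ignorecase with awk when working on files produces unnecessary, redundant output, which is very annoying. I am not sure why.
- awk does not recognize escaped or quoted characters in CSV-formatted files; for example, a quoted field that contains the delimiter is still split. Make sure that the file awk works on is in a simple format.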
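As a minimal sketch of the quoting rule and the CSV caveat above, the following runs awk on a made-up two-line file. The temporary file, its contents, and the variable names are assumptions for illustration only.

```shell
# Made-up sample file: the second row has a quoted field that contains a comma.
tmp=$(mktemp)
printf '%s\n' 'id,name,score' '1,"Doe, Jane",90' > "$tmp"

# Single quotes: the shell passes $2 through untouched, so awk reads its 2nd field.
single=$(awk -F',' 'NR == 1 {print $2}' "$tmp")

# Double quotes: the shell substitutes $col first; \$ keeps a literal $ for awk,
# so awk receives the program: NR == 1 {print $3}
col=3
double=$(awk -F',' "NR == 1 {print \$$col}" "$tmp")

# CSV caveat: the comma inside the quoted name still splits the field.
nfields=$(awk -F',' 'NR == 2 {print NF}' "$tmp")

echo "$single"    # name
echo "$double"    # score
echo "$nfields"   # 4
rm -f "$tmp"
```

The last result is 4 rather than 3 because the comma inside the quoted name is treated as a delimiter.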
Field Delimiter
- The delimiter must be quoted. For example, if the field delimiter is a tab, you must use awk -F'\t' rather than awk -F\t.
- The field delimiter of awk can be a regular expression.
    awk -F'[/=]' '{print $3 "\t" $5 "\t" $8}' file_name
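To see how a regular-expression delimiter numbers the fields, here is a small sketch on a hypothetical path-like record (the record and the variable names are invented for illustration). Note that a leading delimiter produces an empty first field.

```shell
# Hypothetical path-like record with key=value segments.
line='/data/year=2020/month=05/day=17/part.csv'

# Splitting on either "/" or "=" yields:
# $1="" $2=data $3=year $4=2020 $5=month $6=05 $7=day $8=17 $9=part.csv
ymd=$(printf '%s\n' "$line" | awk -F'[/=]' '{print $4 "\t" $6 "\t" $8}')

# Quoted tab delimiter: spaces inside a field are preserved.
second=$(printf 'a b\tc d\n' | awk -F'\t' '{print $2}')

printf '%s\n' "$ymd" "$second"
```

The first line printed holds 2020, 05 and 17 separated by tabs; the second is c d.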
Column/Field Filtering/Manipulation
- Select the 1st and 3rd columns (separated by a tab).
    awk '{print $1 "\t" $3}' file_name
- Sum the 5th field.
    awk 'BEGIN {s=0} {s=s+$5} END {print s}' file_name
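The two commands above can be tried on a small made-up file, as sketched below (the data and variable names are assumptions for illustration). The sketch also shows that the BEGIN {s=0} initialization is optional, since an unset awk variable counts as 0 in arithmetic.

```shell
# Made-up whitespace-separated data with five columns.
tmp=$(mktemp)
printf '%s\n' 'a 1 x 10 2.5' 'b 2 y 20 4' 'c 3 z 30 3.5' > "$tmp"

# Select the 1st and 3rd columns, joined by a tab.
cols=$(awk '{print $1 "\t" $3}' "$tmp")

# Sum the 5th column, with and without explicit initialization.
total=$(awk 'BEGIN {s=0} {s=s+$5} END {print s}' "$tmp")
total_short=$(awk '{s += $5} END {print s}' "$tmp")

printf '%s\n' "$cols"
echo "$total"         # 10
echo "$total_short"   # 10
rm -f "$tmp"
```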
Rows Filtering/Manipulation
- Print rows of the file whose first field is greater than 3.
    awk '{ if($1 > 3) print }' file_name
- Print the IDs of Docker images that have no repository name.
    docker images | awk '{ if ($1 == "<none>") print $3 }'
- Print the IDs of Docker images whose name contains che, using a regular expression match.
    docker images | awk '{if ($1 ~ "che") print $3}'
- Print rows with 2 fields.
    awk 'NF == 2' file_name
  Or more verbosely (and more portably):
    awk 'NF == 2 {print} {}' file_name
- Count the number of fields in each line.
    awk '{print NF}' file_name
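The row filters above can be checked against a small made-up file, as in the sketch below (the data and variable names are invented for illustration; no Docker installation is needed). It also shows the shorter pattern form, $1 > 3, which behaves like the if-statement version here.

```shell
# Made-up data: rows with 2, 3, 3 and 1 fields.
tmp=$(mktemp)
printf '%s\n' '5 apple' '2 cherry pie' '7 peach tart' '1' > "$tmp"

# Rows whose first field is greater than 3: if-statement form and pattern form.
gt3_if=$(awk '{ if ($1 > 3) print }' "$tmp")
gt3_pattern=$(awk '$1 > 3' "$tmp")

# First field of rows whose second field matches the regular expression che.
che=$(awk '{ if ($2 ~ "che") print $1 }' "$tmp")

# Rows with exactly 2 fields, and the field count of every row.
two=$(awk 'NF == 2' "$tmp")
counts=$(awk '{print NF}' "$tmp")

printf '%s\n' "$gt3_pattern" "$che" "$two" "$counts"
rm -f "$tmp"
```

Only cherry matches che; peach does not, because the match needs the three characters c, h, e in a row.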
References
https://stackoverflow.com/questions/15386632/awk-4th-column-everything-matching-wildcard-before-the