Python pandas
My first choice is pandas in Python. However, below are some command-line tools for quick-and-dirty solutions.
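For comparison with the one-liners, the pandas route for the same task (keeping the first and third columns of a tab-separated file) might look like the sketch below. The function name `select_columns`, the in-memory sample data, and the assumption that the file has a header row are illustrative and not part of the original.

```python
import io

import pandas as pd


def select_columns(source, positions=(0, 2)):
    """Return the columns at the given 0-based positions of a TSV source."""
    # sep="\t" reads tab-separated data; the first row is taken as the header.
    # Integer values in usecols select columns by position (0 and 2 = 1st and 3rd).
    return pd.read_csv(source, sep="\t", usecols=list(positions))


# Small in-memory sample standing in for file.txt (made-up data).
sample = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\n")
df = select_columns(sample)

# Write the selection back out as tab-separated text.
print(df.to_csv(sep="\t", index=False), end="")
```

To read a real file, pass its path instead of the `StringIO` object, e.g. `select_columns("file.txt")`.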
q
q -t -H 'select c1, c3 from file.txt'
cut
cut -d$'\t' -f1,3 file.txt
awk
awk -F'\t' '{print $1 "\t" $3}' file.tsv
Note: neither cut …
NOTE: the article talks about sampling "lines" rather than "records".
If a record can occupy multiple lines,
e.g., if any field contains a newline character (\n),
the approach in this tutorial does not work
and you have to fall back to more powerful tools such as Python or R.
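As a rough illustration of that fallback, the sketch below samples records rather than lines using Python's standard `csv` module, which keeps a quoted field containing a newline inside a single record. The function name `sample_records`, the comma-separated sample data, and the assumption that the data fits in memory and has a header row are all illustrative, not taken from the article.

```python
import csv
import io
import random


def sample_records(source, k, seed=None):
    """Return the header and k randomly chosen records from a CSV source."""
    # csv.reader parses quoted fields, so a newline inside a quoted field
    # stays part of one record instead of starting a new one.
    reader = csv.reader(source)
    header = next(reader)
    records = list(reader)  # assumes the data fits in memory
    rng = random.Random(seed)
    return header, rng.sample(records, k)


# Made-up data: the second record spans two physical lines.
data = 'id,comment\n1,ok\n2,"line one\nline two"\n3,fine\n'
header, picked = sample_records(io.StringIO(data, newline=""), k=2, seed=0)
print(header)
print(picked)
```

The sample has five physical lines but only three records after the header, so a line-based sampler could return half of the second record. When reading from disk, open the file with `newline=""` as the `csv` documentation recommends.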
Let's say …