Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Select Columns from Structured Text Files

Python pandas

My first choice is pandas in Python. However, below are some tools for quick and dirty solutions.

q

q -t -H 'select c1, c3 from file.txt'

cut

cut -d\t -f1,3 file.txt

awk

awk -F'\t' '{print $1 "\t" $3}' file.tsv 

Note: neither cut …

Sample Lines from a File Using Command Line

NOTE: the article talks about sampling "lines" rather than "records". If a records can occupy multiple lines, e.g., if any field contains a new line (\n), the following tutorial does not work and you have to fall back to more powerful tools such as Python or R.

Let's say …