Ben Chuanlong Du's Blog

And let it direct your passion with reason.

How Long Does It Take to Observe a Sequence?

Statistics has many problems that are both interesting and tricky. One famous question: if we flip a fair coin, how many flips does it take on average to observe a given sequence (e.g., THTH or TTHH)?

This problem can be solved using (delayed) renewal theory …
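
For a fair coin, the answer can also be checked with a well-known overlap identity, which is a different route from the renewal-theory argument mentioned above: the expected number of flips is the sum of 2^k over every length k at which the pattern's prefix equals its suffix. Below is a minimal sketch in Python; the function name `expected_flips` is only an illustrative choice.

```python
def expected_flips(pattern: str) -> int:
    """Expected number of fair-coin flips until `pattern` first appears.

    Uses the overlap identity: add 2**k for every k such that the first
    k symbols of the pattern equal its last k symbols.
    """
    n = len(pattern)
    total = 0
    for k in range(1, n + 1):
        if pattern[:k] == pattern[n - k:]:
            total += 2 ** k
    return total


print(expected_flips("THTH"))  # 20 (overlaps at k=2 and k=4)
print(expected_flips("TTHH"))  # 16 (overlap only at k=4)
```

Although both sequences have length 4, THTH takes longer on average because it overlaps with itself.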

Select Columns from Structured Text Files

Python pandas

My first choice is pandas in Python. However, below are some tools for quick and dirty solutions.
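
In pandas, `read_csv` with its `usecols` parameter covers this use case. As a dependency-free sketch of the same idea, the standard `csv` module can also pick columns by position. The helper name `select_columns` and the sample data are assumptions made for illustration.

```python
import csv
import io


def select_columns(lines, indices, delimiter="\t"):
    """Return the fields at the given 0-based positions for every row."""
    reader = csv.reader(lines, delimiter=delimiter)
    return [[row[i] for i in indices] for row in reader]


# Hypothetical tab-separated content standing in for a real file.
sample = io.StringIO("a\tb\tc\n1\t2\t3\n4\t5\t6\n")
for row in select_columns(sample, [0, 2]):
    print("\t".join(row))
```

For a file on disk, pass an open file object (opened with `newline=""`) instead of the `StringIO` buffer.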

q

q -t -H 'select c1, c3 from file.txt'

cut

cut -d$'\t' -f1,3 file.txt

awk

awk -F'\t' '{print $1 "\t" $3}' file.tsv 

Note: neither cut …

Sample Lines from a File Using Command Line

NOTE: this article is about sampling "lines" rather than "records". If a record can span multiple lines, e.g., if any field contains a newline (\n), the following tutorial does not work and you have to fall back on more powerful tools such as Python or R.
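
For the multi-line case, one possible fallback is sketched below. It assumes the file is CSV with a header row and quoted fields, and it loads all records into memory, so it only suits files that fit in RAM. The function name `sample_records` and the sample data are hypothetical.

```python
import csv
import io
import random


def sample_records(stream, k, seed=None):
    """Return the header and up to k records sampled without replacement."""
    reader = csv.reader(stream)  # handles quoted fields containing newlines
    header = next(reader)
    records = list(reader)
    rng = random.Random(seed)
    return header, rng.sample(records, min(k, len(records)))


# The first record spans two physical lines but is a single record.
data = io.StringIO('id,comment\n1,"first line\nsecond line"\n2,short\n3,plain\n')
header, picked = sample_records(data, k=2, seed=0)
print(header)
print(picked)
```

Because the parser works on records rather than physical lines, a field with an embedded newline is never split in half by the sampling step.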

Let's say …

Account Management in Linux

** Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement! **

Create a User

Both adduser and useradd can be used to create a new user. adduser is interactive while useradd is non-interactive. It is suggested that you use useradd in batch …

Get Group Names on Linux/Unix

Linux

  1. Get information of the staff group.

    $ getent group staff
    staff:x:20:
    
  2. Get group ID of the staff group.

    $ getent group staff | cut -d: -f3
    20
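
The same lookup can be sketched in Python with the standard `grp` module, which is available on Unix-like systems. The helper name `group_id` is an illustrative choice, and whether a `staff` group exists depends on the machine.

```python
import grp


def group_id(name):
    """Return the numeric ID of group `name`, or None if it does not exist."""
    try:
        return grp.getgrnam(name).gr_gid
    except KeyError:
        # grp.getgrnam raises KeyError for unknown group names.
        return None


print(group_id("staff"))
```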
    

Mac

  1. Get information of the staff group.

    $ dscl . -read /Groups/staff
    
  2. Get group ID of the staff group.

    $ dscl . -read /Groups/staff | awk …

List Running Jupyter Notebook Servers

You can list running Jupyter Notebook servers using the following command.

jupyter notebook list

It works well most of the time. However, if the servers are launched using the root account (e.g., in a Docker container), you might encounter issues. In this case, a better alternative is to list …