Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Runtime Paths in Python

__file__ is the path of the Python script. Note that if you make a symbolic link to a Python script and run the symbolic link, then __file__ is the path of the symbolic link. You can use os.path.realpath to get the real path of the file.
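As a minimal sketch of the difference, the helper below (describe_paths is a hypothetical name, not part of any library) returns a path both as given and with symbolic links resolved. Inside a script you would pass it __file__.

```python
import os


def describe_paths(script_path):
    """Return the path as given (made absolute) and with symlinks resolved."""
    return {
        # abspath only normalizes the path; a symbolic link stays a link.
        "as_given": os.path.abspath(script_path),
        # realpath follows symbolic links to the file they point to.
        "resolved": os.path.realpath(script_path),
    }


if __name__ == "__main__":
    # When run through a symbolic link, the two values are expected to differ.
    print(describe_paths(__file__))
```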

pathlib.Path …

Good Ways to Do Scientific Computing

  1. Break the work down into smaller modules and build pipelines (consisting of modules) for it. Be sure to save important intermediate results so that you can resume failed modules without rerunning the ones that succeeded.

  2. Manage your project in GitHub and use issues to manage tasks to do and their priorities …
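One possible way to save intermediate results, sketched here under the assumption that each step returns JSON-serializable data. The names run_step and cache_dir are hypothetical and only serve the illustration.

```python
import json
from pathlib import Path


def run_step(name, func, cache_dir):
    """Run func() unless a saved result for this step already exists."""
    cache_file = Path(cache_dir) / f"{name}.json"
    if cache_file.exists():
        # The step succeeded in an earlier run: reuse its saved result.
        return json.loads(cache_file.read_text())
    result = func()
    # Save the result only after the step finished without raising.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result
```

If a later step fails, rerunning the pipeline skips every step whose result file is already present. Deleting a result file forces that step to run again.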

Compare Two Directories on Linux

On the Same Machine

If the two directories are on the same machine, you can use either colordiff (preferred over diff) or git diff to find the differences between them.

colordiff -qr dir_1 dir_2
git diff --no-index dir_1 dir_2
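If you would rather do the comparison from Python, the standard library module filecmp offers something similar. The sketch below is not equivalent to colordiff: it only reports which files differ, not how, and the function name diff_dirs is hypothetical.

```python
import filecmp
import os


def diff_dirs(dir_1, dir_2):
    """Recursively collect names that exist on one side only or that differ."""
    report = {"only_in_dir_1": [], "only_in_dir_2": [], "differing": []}

    def walk(cmp, prefix):
        report["only_in_dir_1"] += [os.path.join(prefix, n) for n in cmp.left_only]
        report["only_in_dir_2"] += [os.path.join(prefix, n) for n in cmp.right_only]
        # By default dircmp treats files with the same type, size and
        # modification time as equal without reading their contents.
        report["differing"] += [os.path.join(prefix, n) for n in cmp.diff_files]
        # Descend into subdirectories that exist on both sides.
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(dir_1, dir_2), "")
    return report
```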

On Different Machines

It is a little bit tricky when the …

Advanced Use of "ls" in Linux

List Files Sorted by Time

You can list files sorted by time (newest first) using the -t option. Note that the -t option is also supported by hdfs dfs -ls.

ls -lht
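The same ordering can be approximated in Python when you need it inside a script. This is a rough sketch, not a reimplementation of ls: it returns names only, and the function name list_newest_first is hypothetical.

```python
import os


def list_newest_first(directory="."):
    """Return entry names sorted by modification time, newest first."""
    with os.scandir(directory) as entries:
        items = [
            # follow_symlinks=False uses the link's own timestamp,
            # which is what ls does unless -L is given.
            (entry.stat(follow_symlinks=False).st_mtime, entry.name)
            for entry in entries
            # Skip dotfiles, as ls does without -a.
            if not entry.name.startswith(".")
        ]
    items.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in items]
```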

Ignore Files

  1. You have to either enclose the pattern in quotes or escape the wildcard in patterns.

  2. Equivalent …

Proxy for `sudo`

You can set up a proxy in a terminal by exporting the environment variables http_proxy and https_proxy.

export http_proxy='proxy_server:port'
export https_proxy='proxy_server:port'

However, you might find that the exported environment variables are not visible to sudo. This can be resolved by simply adding the -E (preserve environment) option to sudo.

sudo …