Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Concurrency and Parallel Computing in Python

The GIL is controversial because it prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations. Note that potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL. Therefore it is only in multithreaded programs that spend …
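As a rough illustration of the point about blocking operations, the sketch below runs several blocking calls in threads. It uses `time.sleep` as a stand-in for I/O (it releases the GIL while waiting); the helper names `blocking_task` and `run_in_threads` are made up for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(seconds):
    """Stand-in for a blocking call such as I/O; time.sleep releases the GIL."""
    time.sleep(seconds)
    return seconds


def run_in_threads(n_tasks, seconds):
    """Run n_tasks blocking calls concurrently and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(blocking_task, [seconds] * n_tasks))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = run_in_threads(4, 0.2)
print(f"4 blocking tasks of 0.2s each finished in {elapsed:.2f}s")
```

Because the threads wait outside the GIL, the total time should be close to that of a single task rather than the sum of all four. A CPU-bound pure-Python function would not be expected to show the same speedup with threads.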

Boolean Values in C++

  1. Boolean expressions are evaluated from left to right with short-circuiting (the same as in Java), so it is safe to write code like

    if(a < x.size() && x[a]){
        ...
    }
    

    where x is a vector.

  2. There are no &&= and ||= operators in C++; instead, you can use &= and |=. Though &= and |= are not specially for …

Parallel Computing Using Multithreading

  1. Not all jobs are suitable for parallel computing. The more communication the threads have to do, the more dependent the jobs are and the less efficient parallel computing is.

  2. Generally speaking, commercial software (Mathematica, MATLAB, Revolution R, etc.) has very good support for parallel computing.

Python

Please refer …

Runtime Paths in Python

__file__ is the path of the Python script. Note that if you make a symbolic link to a Python script and run the symbolic link, then __file__ is the path of the symbolic link. Of course, you can use os.path.realpath to get the real path of a file.
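A small sketch of the symbolic link case, assuming a POSIX system where creating symlinks is allowed. The file names `script.py` and `link.py` and the helper `real_script_path` are hypothetical; inside a real script you would pass `__file__` instead of `link`.

```python
import os
import tempfile


def real_script_path(path):
    """Resolve symbolic links in path, as one would do with __file__."""
    return os.path.realpath(path)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "script.py")  # the "real" script
    link = os.path.join(tmp, "link.py")      # a symbolic link pointing to it
    open(target, "w").close()
    os.symlink(target, link)

    resolved = real_script_path(link)
    # Compare against the resolved target, since the temp dir itself may be a symlink.
    points_to_target = resolved == os.path.realpath(target)
    print(os.path.basename(link), "->", os.path.basename(resolved))
```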

pathlib.Path …

Select Columns from Structured Text Files

Python pandas

My first choice is pandas in Python. However, below are some tools for quick and dirty solutions.
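For comparison, a minimal pandas sketch that selects the first and third columns of a tab-separated file. The in-memory sample data, the column names `c1`, `c2`, `c3`, and the helper `select_columns` are assumptions for illustration; a file path can be passed in place of the `StringIO` object.

```python
import io

import pandas as pd

# Hypothetical tab-separated input with a header row.
TSV_TEXT = "c1\tc2\tc3\n1\ta\t1.5\n2\tb\t2.5\n"


def select_columns(source, columns):
    """Read a tab-separated source and keep only the requested columns."""
    return pd.read_csv(source, sep="\t", usecols=columns)


selected = select_columns(io.StringIO(TSV_TEXT), ["c1", "c3"])
print(selected.to_csv(sep="\t", index=False))
```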

q

q -t -H 'select c1, c3 from file.txt'

cut

cut -d$'\t' -f1,3 file.txt

awk

awk -F'\t' '{print $1 "\t" $3}' file.tsv 

Note: neither cut …