Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Date Functions in Spark

Tips and Traps

  1. An HDFS table might contain invalid data (I'm not clear about the reasons at this time) with respect to the column types (e.g., Date and Timestamp). This will cause issues when Spark tries to load the data. For more discussion, please refer to Unrecognized column type:TIMESTAMP_TYP.
  2. datetime.datetime or datetime.date

Best Filesystem Format for Cross-platform Data Exchanging

FAT32

FAT32 is an outdated filesystem. The maximum size for a single file is 4 GB. You should use exFAT instead of FAT32 where possible.
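To see whether the single-file limit matters for a given folder before copying it to a FAT32 drive, a quick check along the following lines may help. This is only a minimal sketch: the function names `fits_on_fat32` and `oversized_files` are made up for illustration, and the limit is taken as 2**32 - 1 bytes (just under 4 GiB), which is the largest value FAT32's 32-bit file size field can hold.

```python
import os

# FAT32 stores a file's size in a 32-bit field, so a single file can be
# at most 2**32 - 1 bytes (just under 4 GiB).
FAT32_MAX_FILE_SIZE = 2**32 - 1


def fits_on_fat32(size_in_bytes):
    """Return True if a file of the given size can be stored on FAT32."""
    return 0 <= size_in_bytes <= FAT32_MAX_FILE_SIZE


def oversized_files(root):
    """Return the paths of regular files under root that exceed the FAT32 limit."""
    too_big = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and other non-regular entries.
            if os.path.isfile(path) and not fits_on_fat32(os.path.getsize(path)):
                too_big.append(path)
    return too_big
```

Calling `oversized_files` on the folder you plan to copy returns the files that would have to be split, compressed, or moved to an exFAT drive instead.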

exFAT

exFAT is a great cross-platform filesystem that is supported out of the box by Windows, Linux and macOS. There is practically no limit (big enough for average users) on …

Fonts for Linux

  1. ttf-arphic-uming, ttf-wqy-microhei, ttf-wqy-zenhei, xfonts-wqy and ttf-opensymbol are some packages related to Chinese fonts.

  2. If you have Adobe Reader installed on your computer, you can use Adobe Chinese fonts for free.

  3. To check Chinese fonts installed on your computer, you can use the command fc-list :lang=zh-cn | sort.

  4. To install extra …

Synchronize Files Using Dropbox

  1. Dropbox won't sync files for which you don't have read permission.

  2. You'd better not merge an old Dropbox folder while installing/configuring Dropbox.

  3. You'd better not store symbolic links in the Dropbox folder, because the symbolic links will be replaced by the real files/folders when synchronized to other computers.

  4. It …
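Since symbolic links get replaced by the real files/folders on other computers, it can be handy to list the links in a Dropbox folder before syncing. The following is a rough sketch rather than anything Dropbox provides; `find_symlinks` is a hypothetical helper name, and `~/Dropbox` in the usage note is only an assumed folder location.

```python
import os


def find_symlinks(root):
    """Return the sorted paths of all symbolic links found under root."""
    links = []
    # os.walk does not descend into symlinked directories by default
    # (followlinks=False), but it still reports them in dirnames.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                links.append(path)
    return sorted(links)
```

For example, `find_symlinks(os.path.expanduser("~/Dropbox"))` would return every link under that folder, assuming that is where your Dropbox folder lives.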