Tips and Traps
- Make sure to use the mode rb/wb when reading/writing pickle files.
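As a minimal sketch of the tip above, the helpers below (save_pickle and load_pickle are illustrative names, not part of any library) open the file in binary mode for both writing and reading. The file name data.pkl and the sample dictionary are made up for the example.

```python
import pickle
import tempfile
from pathlib import Path


def save_pickle(obj, path):
    # "wb": pickle.dump writes bytes, so the file must be opened in binary mode.
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    # "rb": pickle.load expects a binary file object.
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical sample data, round-tripped through a temporary file.
data = {"name": "example", "values": [1, 2, 3]}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data.pkl"
    save_pickle(data, path)
    restored = load_pickle(path)

print(restored == data)
```

Opening the file in text mode ("w") instead makes pickle.dump raise a TypeError, because it writes bytes rather than str.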
Date Functions in Spark
Tips and Traps
- An HDFS table might contain data that is invalid with respect to the column types (e.g., Date and Timestamp); I'm not clear about the reasons at this time. This causes issues when Spark tries to load the data. For more discussion, please refer to Unrecognized column type:TIMESTAMP_TYP.
datetime.datetime or datetime.date
Best Filesystem Format for Cross-platform Data Exchanging
FAT32
FAT32 is an outdated filesystem. The maximum size for a single file is 4 GB. You should use exFAT instead of FAT32 where possible.
exFAT
exFAT is a great cross-platform filesystem that is supported out of the box by Windows, Linux and macOS. There is practically no limit (big enough for average users) on …
Fonts for Linux
- ttf-arphic-uming, ttf-wqy-microhei, ttf-wqy-zenhei, xfonts-wqy and ttf-opensymbol are some packages related to Chinese fonts.
- If you have Adobe Reader installed on your computer, you can use Adobe Chinese fonts for free.
- To check Chinese fonts installed on your computer, you can use the command fc-list :lang=zh-cn | sort.
- To install extra …
Search for Files in Command-line Using grep
The article 14 Practical examples of the grep command has some good examples on how to use the grep command.
- The Perl style (option -P) regular expression is more powerful than the basic (default) and extended (option -E) regular expression. It is suggested that you use the perl style as …
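To illustrate what the Perl style adds, here is a small sketch of a lookahead, a construct that grep -P accepts but the basic and extended syntaxes do not. It uses Python's re module, which is not the same engine as grep -P but supports the same lookahead syntax; the sample lines are made up.

```python
import re

# Hypothetical log lines, invented for this example.
lines = [
    "disk usage: 120 MB",
    "memory usage: 512 KB",
    "cache usage: 64 MB",
]

# (?= MB) is a lookahead: match the digits only when " MB" follows,
# without including " MB" in the match itself.
# Roughly the same idea as: grep -oP '\d+(?= MB)' file
pattern = re.compile(r"\d+(?= MB)")

matches = [m.group() for line in lines for m in pattern.finditer(line)]
print(matches)  # ['120', '64']
```

The line ending in KB produces no match, because the lookahead fails at every position in it.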
Synchronize Files Using Dropbox
- Dropbox won't sync files for which you don't have read permission.
- Avoid merging with an old Dropbox folder while installing/configuring Dropbox.
- Avoid storing symbolic links in the Dropbox folder, because the symbolic links will be replaced by the real files/folders when synchronized to other computers.
- It …
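The first and third tips above can be checked before syncing. The sketch below walks a folder and reports symbolic links and entries the current user cannot read. find_sync_problems is a hypothetical helper, not part of Dropbox or any library, and the demo assumes a system where creating symlinks is allowed (e.g., Linux or macOS).

```python
import os
import tempfile
from pathlib import Path


def find_sync_problems(root):
    """Return (symlinks, unreadable) lists of paths found under root."""
    symlinks, unreadable = [], []
    # followlinks=False: report symlinked directories but do not descend into them.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                symlinks.append(path)
            elif not os.access(path, os.R_OK):
                unreadable.append(path)
    return sorted(symlinks), sorted(unreadable)


# Demo on a temporary folder standing in for the Dropbox folder.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "notes.txt"
    target.write_text("hello")
    link = Path(tmp) / "notes-link.txt"
    os.symlink(target, link)

    symlinks, unreadable = find_sync_problems(tmp)
    print(symlinks)
    print(unreadable)
```

In real use, pass the path of your own Dropbox folder as root. Note that os.access reports readability for the user running the script, which may differ from the user running the Dropbox client.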