Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Editing PDF Files

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Type Name Comments
Web Tools Parseur - AI-based PDF parser
DocuSign - Great for converting PDF files to MS Office files, etc.
- non-free: 1 file per 30 minutes …

Use pdftk to Manipulate PDF Files

It is suggested that you use Python modules instead of pdftk to manipulate PDFs, for several reasons. First, even though pdftk is a great command-line tool, its syntax is hard to remember. On the contrary, Python code is easy to read and understand (even though it is more verbose …

Tips on the find command in Linux

It is suggested that you use Python (the pathlib module), ripgrep, fselect or osquery (which currently has some bugs) instead of find to locate files.

  • The Python module pathlib is the most suitable one for relatively complex jobs.
  • ripgrep is a more user-friendly alternative to find.
  • Both fselect and osquery support …
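
As a rough illustration of the pathlib approach mentioned above, here is a minimal sketch of a recursive file search with a few filters. The function name find_files and its parameters (pattern, min_size, newer_than_days) are made up for this example and are not part of pathlib itself; only Path.rglob, Path.is_file and Path.stat come from the standard library.

```python
from pathlib import Path
import time


def find_files(root, pattern="*", min_size=0, newer_than_days=None):
    """Recursively yield regular files under root that pass all filters.

    This helper and its parameters are hypothetical, for illustration only.
    """
    cutoff = None
    if newer_than_days is not None:
        # Modification times older than this timestamp are filtered out.
        cutoff = time.time() - newer_than_days * 86400
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue  # skip directories and other non-regular entries
        stat = path.stat()
        if stat.st_size < min_size:
            continue  # smaller than the requested size in bytes
        if cutoff is not None and stat.st_mtime < cutoff:
            continue  # not modified recently enough
        yield path


if __name__ == "__main__":
    # List Python files under the current directory of at least 1 KiB.
    for p in find_files(".", "*.py", min_size=1024):
        print(p)
```

The call in the main block is roughly comparable to a find invocation that combines -type f, -name and a size test, but each condition is an ordinary Python expression, so more complex filters are easy to add.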

My Docker Images

Most of my Docker images have different variants (corresponding to tags latest, next, etc.) for different use cases. Each tag might also have historical versions with the pattern mmddhh (mm, dd and hh stand for the month, day and hour) for fallback if a tag …
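
To make the mmddhh pattern concrete, here is a minimal sketch of how such a timestamp string could be produced and split back into its parts. The helper names mmddhh and split_mmddhh are hypothetical, and how the string is combined with a tag name is not covered here, since the excerpt above does not say.

```python
from datetime import datetime


def mmddhh(when):
    """Format a datetime as a zero-padded month-day-hour string (mmddhh)."""
    return when.strftime("%m%d%H")


def split_mmddhh(text):
    """Split an mmddhh string into (month, day, hour) integers."""
    if len(text) != 6 or not text.isdigit():
        raise ValueError(f"not an mmddhh string: {text!r}")
    return int(text[0:2]), int(text[2:4]), int(text[4:6])


if __name__ == "__main__":
    print(mmddhh(datetime.now()))
```

Since the pattern carries no year, two versions built a year apart at the same month, day and hour would get the same string.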