Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Use pdftk to Manipulate PDF Files

  1. It is suggested that you use Python modules or Stirling-PDF instead of pdftk to manipulate PDFs, for several reasons. First, even though pdftk is a great command-line tool, its syntax is hard to remember. On the contrary, Python code is easy to read and understand (even though it is …

Editing PDF Files

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Type | Name | Comments
Web Tools | Stirling-PDF | robust; locally hosted; Docker container based
 | Parseur | AI-based PDF parser
 | DocuSign | Great for converting PDF files to MS Office files …

Tips on the find command in Linux

It is suggested that you use Python (the pathlib module), ripgrep, fselect or osquery (which currently has some bugs) instead of find to locate files.

  • The Python module pathlib is the most suitable one for relatively complex jobs.
  • ripgrep is a more user-friendly alternative to find.
  • Both fselect and osquery support …
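As a rough illustration of the pathlib approach mentioned above, here is a minimal sketch that walks a directory tree and keeps regular files with a given suffix and a minimum size. The function name find_files and its parameters are made up for this example and do not come from the original post.

```python
from pathlib import Path


def find_files(root, suffix, min_size=0):
    """Return files under root with the given suffix and at least min_size bytes.

    Hypothetical helper for illustration only.
    """
    matches = []
    for path in Path(root).rglob(f"*{suffix}"):
        # rglob also yields directories whose names match, so keep regular files only.
        if path.is_file() and path.stat().st_size >= min_size:
            matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # List PDF files of at least 1 KiB below the current directory.
    for path in find_files(".", ".pdf", min_size=1024):
        print(path)
```

Since the filter is ordinary Python, extra conditions (modification time, owner, name patterns) can be added as plain if-clauses rather than as extra find flags.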

My Docker Images

Most of my Docker images have different variants (corresponding to the tags latest, next, etc.) for different use cases. Each tag might have historical versions with the pattern mmddhh (mm, dd and hh stand for the month, day and hour) for fallback if a tag …
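To make the mmddhh pattern concrete, here is a small sketch that formats a timestamp that way. The helper name version_stamp is hypothetical, and the use of a 24-hour clock is an assumption. How the stamp is combined with a tag name is not shown in the excerpt, so it is left out here.

```python
from datetime import datetime


def version_stamp(moment):
    """Format a datetime as mmddhh: two-digit month, day and hour.

    Assumption: the hour is on a 24-hour clock.
    """
    return moment.strftime("%m%d%H")


if __name__ == "__main__":
    # March 7th at 09:00 becomes 030709.
    print(version_stamp(datetime(2024, 3, 7, 9)))
```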