Ben Chuanlong Du's Blog

And let it direct your passion with reason.

Use pdftk to Manipulate PDF Files

  1. It is suggested that you use Python modules or Stirling-PDF instead of pdftk to manipulate PDFs, for several reasons. First, even though pdftk is a great command-line tool, its syntax is hard to remember. In contrast, Python code is easy to read and understand (even though it is …

Editing PDF Files

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

| Type | Name | Comments |
| --- | --- | --- |
| Web Tools | Stirling-PDF | robust; locally hosted; Docker container based |
| | Parseur | AI-based PDF parser |
| | DocuSign | Great for converting PDF files to MS Office files … |

Working with Spreadsheet in Python

It is suggested that you avoid using Excel files (or other spreadsheet formats) for storing data. Parquet is currently the best format for storing table-like data. If you do want to interact with and manipulate your data using Excel (or other spreadsheet tools), dump your data into CSV files and …
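
As a rough illustration of the "dump your data into CSV files" step, here is a minimal sketch using only Python's standard `csv` module. The helper names `dump_csv` and `load_csv`, the sample records, and the file name `scores.csv` are all hypothetical and chosen for this example; the original post may well use a different library.

```python
import csv
import os
import tempfile


def dump_csv(records, path):
    """Write a non-empty list of dicts to a CSV file, header row first."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        # Column order follows the keys of the first record (an assumption).
        writer = csv.DictWriter(f, fieldnames=list(records[0]))
        writer.writeheader()
        writer.writerows(records)


def load_csv(path):
    """Read a CSV file back as a list of dicts; every value is a string."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# Hypothetical sample data, for illustration only.
rows = [
    {"name": "alice", "score": 90},
    {"name": "bob", "score": 85},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "scores.csv")
    dump_csv(rows, csv_path)
    loaded = load_csv(csv_path)
```

Note that values read back from CSV are plain strings (the integer scores come back as text), because CSV carries no type information. That loss of types is one reason a typed format may be preferable for long-term storage.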