Convert Web Pages to PDF Using Python

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Stirling-PDF is a robust, locally hosted, web-based PDF manipulation tool that runs in Docker.

python-pdfkit

python-pdfkit is a Python wrapper for the wkhtmltopdf utility, which converts HTML to PDF using WebKit.

!pip install pdfkit
!sudo apt-get install wkhtmltopdf

import pdfkit 
pdfkit.from_url('https://www.google.co.in/', 'shaurya.pdf')
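When converting more than one page, it can help to pass wkhtmltopdf options and to derive the output file name from the URL. The following is a minimal sketch, not the only way to do it: `pdf_name_for` and `url_to_pdf` are hypothetical helper names introduced here, and the option values are just example choices. The keys in `PDF_OPTIONS` are ordinary wkhtmltopdf command-line options written without the leading `--`.

```python
from urllib.parse import urlparse

# Example wkhtmltopdf options (values are assumptions; adjust as needed).
PDF_OPTIONS = {
    "page-size": "A4",
    "margin-top": "10mm",
    "margin-bottom": "10mm",
    "encoding": "UTF-8",
}


def pdf_name_for(url):
    """Derive an output file name from a URL (hypothetical helper)."""
    parsed = urlparse(url)
    # Join host and path, drop outer slashes, flatten inner ones.
    stem = (parsed.netloc + parsed.path).strip("/").replace("/", "_")
    return (stem or "page") + ".pdf"


def url_to_pdf(url, output_path=None, options=None):
    """Convert a URL to PDF with pdfkit and return the output path."""
    import pdfkit  # imported lazily; requires pdfkit and wkhtmltopdf

    output_path = output_path or pdf_name_for(url)
    pdfkit.from_url(url, output_path, options=options or PDF_OPTIONS)
    return output_path


# Usage (needs network access and wkhtmltopdf installed):
# url_to_pdf("https://www.google.co.in/")
```

If wkhtmltopdf is not on the PATH, pdfkit also accepts a `configuration=pdfkit.configuration(wkhtmltopdf="...")` argument pointing at the binary.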

WeasyPrint

!pip3 install weasyprint


import weasyprint
pdf = weasyprint.HTML('http://www.google.com').write_pdf()
with open('google.pdf', 'wb') as f:  # write the PDF bytes to disk
    f.write(pdf)

Selenium

https://stackoverflow.com/questions/31136581/automate-print-save-web-page-as-pdf-in-chrome-python-2-7


from selenium import webdriver

# Chrome download preferences are passed through the "prefs" experimental option
prefs = {
    "download.default_directory": "C:",
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "plugins.plugins_disabled": ["Chrome PDF Viewer"],
}
options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome(options=options)

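Alternatively, add the --kiosk-printing argument to the Chrome options, e.g. options.add_argument("--kiosk-printing"), so that Chrome automatically confirms the print preview dialog instead of waiting for a click on the Print button.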
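The kiosk-printing approach can be sketched in Python roughly as follows. This is an untested outline based on the Stack Overflow answers linked in this section, not a definitive recipe: `build_print_prefs` and `print_page_to_pdf` are hypothetical helper names, and the assumption is that Chrome honours the `printing.print_preview_sticky_settings.appState` and `savefile.default_directory` preferences as described there (behaviour may differ between Chrome versions).

```python
import json


def build_print_prefs(output_dir):
    """Chrome prefs that preselect 'Save as PDF' in the print preview."""
    app_state = {
        "recentDestinations": [
            {"id": "Save as PDF", "origin": "local", "account": ""}
        ],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
    }
    return {
        # Chrome expects the print-preview state as a JSON string.
        "printing.print_preview_sticky_settings.appState": json.dumps(app_state),
        # Assumed to set the folder the PDF is saved into.
        "savefile.default_directory": output_dir,
    }


def print_page_to_pdf(url, output_dir):
    """Open url in Chrome and trigger window.print() in kiosk-printing mode."""
    from selenium import webdriver  # imported lazily; requires selenium + Chrome

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_print_prefs(output_dir))
    options.add_argument("--kiosk-printing")  # auto-confirm the print dialog
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        driver.execute_script("window.print();")
    finally:
        driver.quit()


# Usage (needs Chrome and a matching chromedriver):
# print_page_to_pdf("https://www.google.com", "/tmp/pdfs")
```

Depending on page size, it may be necessary to wait briefly before `driver.quit()` so that Chrome has time to finish writing the file.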

https://stackoverflow.com/questions/30452395/selenium-pdf-automatic-download-not-working