chrisclements/html2pdf

HTML to PDF Converter

This project uses Puppeteer to convert a web page into a PDF file with selectable text rather than a flattened image.

When you use a browser's "Print to PDF" feature, a print driver interprets the rendered page. Sometimes, especially with complex layouts, fonts, or CSS, the driver may "flatten" the page into an image so that the output matches what you see on screen. This process is called rasterization, and it produces text that cannot be selected.

Puppeteer, however, doesn't simulate the print dialog. Its page.pdf() method calls Chromium's internal PDF rendering engine directly. This engine is designed to translate the web page's structure (the DOM) into a structured PDF document made of text objects, vector shapes, and images. Because the underlying text is preserved, the final PDF is selectable, searchable, and accessible.
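
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the contents of convert_to_pdf.js: the function name convertToPdf and its optional launch parameter are hypothetical, and the 'networkidle0' wait condition and the PDF settings shown are assumptions.

```javascript
// Sketch: render a URL to a PDF through Puppeteer's page.pdf().
// `convertToPdf` and the `launch` parameter are hypothetical names.
async function convertToPdf(url, outputPath, launch) {
  // Load Puppeteer lazily so a custom launcher can be supplied instead.
  const launchBrowser = launch || ((opts) => require('puppeteer').launch(opts));
  const browser = await launchBrowser({ headless: true });
  try {
    const page = await browser.newPage();
    // Assumption: wait for network activity to settle before printing.
    await page.goto(url, { waitUntil: 'networkidle0' });
    // page.pdf() writes the file at `path` using Chromium's PDF backend.
    await page.pdf({ path: outputPath, format: 'A4', printBackground: true });
  } finally {
    // Always release the browser, even if navigation or printing fails.
    await browser.close();
  }
}
```

With Puppeteer installed, calling convertToPdf('https://example.com', 'example.pdf') inside an async context would write example.pdf to the working directory.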

Setup

  1. Ensure you have Node.js installed.
  2. Install dependencies:
    npm install

Usage

Run the script from your command line, providing the URL to convert and the desired output file name.

npm start -- --url <your-url> --output <output-filename.pdf>

Options

  • --url, -u: The URL to convert to PDF. (Required)
  • --output, -o: The output PDF file name. (Required)
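
For illustration, the two options above could be parsed with Node's built-in util.parseArgs. This is a sketch under assumptions: the README does not say which argument parser the project uses, and parseCliArgs is a hypothetical helper name.

```javascript
const { parseArgs } = require('node:util');

// Sketch: parse --url/-u and --output/-o, treating both as required.
// `parseCliArgs` is a hypothetical helper; the project may use another parser.
function parseCliArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string', short: 'u' },
      output: { type: 'string', short: 'o' },
    },
  });
  // parseArgs has no "required" flag, so check for missing values by hand.
  if (!values.url || !values.output) {
    throw new Error('Usage: npm start -- --url <your-url> --output <output-filename.pdf>');
  }
  return { url: values.url, output: values.output };
}
```

A script would typically call parseCliArgs(process.argv.slice(2)) to skip the node executable and script path.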

Example

npm start -- --url "https://example.com" --output "example.pdf"

Configuration

You can adjust the PDF output settings in convert_to_pdf.js, such as:

  • format: The paper size, such as 'A4' or 'Letter'.
  • margin: The page margins for the PDF.
  • printBackground: Whether to print background graphics.
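
As a rough sketch, these settings map onto the options object passed to Puppeteer's page.pdf(). The default values below are assumptions for illustration, not the project's actual configuration, and buildPdfOptions is a hypothetical helper name.

```javascript
// Sketch: default page.pdf() settings with per-call overrides.
// These defaults are assumed values, not the project's actual settings.
const DEFAULT_PDF_OPTIONS = {
  format: 'A4',
  margin: { top: '1cm', right: '1cm', bottom: '1cm', left: '1cm' },
  printBackground: true,
};

// `buildPdfOptions` is a hypothetical helper name.
function buildPdfOptions(overrides = {}) {
  return {
    ...DEFAULT_PDF_OPTIONS,
    ...overrides,
    // Merge margins per side so overriding one side keeps the others.
    margin: { ...DEFAULT_PDF_OPTIONS.margin, ...overrides.margin },
  };
}
```

For example, buildPdfOptions({ format: 'Letter', margin: { top: '2cm' } }) changes the paper size and the top margin while leaving the other margins and printBackground at their defaults.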
