Skip to content

UnicodeDecodeError: 'ascii' codec can't decode . . .  #42

@quietlyconfident

Description

@quietlyconfident

When I try to use python-pdfkit with certain HTML content that has certain characters in it, it fails with one of these errors if the html content is loaded into memory:

File ". . . /pdfkit.py", line 100, in to_pdf
    input = self.source.to_s().encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 64: ordinal not in range(128)

or

File ". . ./pdfkit.py", line 102, in to_pdf
    input = self.source.source.read().encode('utf-8')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 64: ordinal not in range(128)

But, python pdfkit works just fine if it is provided with just a filename, and so does wkhtmltopdf.

I think that python pdfkit is doing something unsafe with strings; perhaps it should assume that the input is just bytes.

python-pdfkit error demo.zip

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions