fix: make parseXMLAsync and parseXML have the same defaults for encoding #668

cranberryofdoom · 2025-03-13T17:30:06Z

Context

We're currently using libxmljs to parse XML for SAML IDP metadata. When we fetch the metadata from the service, the response.body may sometimes come back as a string with the following invisible first character: \uFEFF.

This character is the Unicode Byte Order Mark (BOM). It’s an invisible character placed at the beginning of a text stream. In UTF-16 and UTF-32 it indicates the byte order (endianness); in UTF-8, which has no byte-order ambiguity, it only acts as an encoding signature, but it still commonly appears at the start of UTF-8 encoded files.
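To make the invisible character concrete, here is a minimal sketch in plain Node.js (no libxmljs involved). The `body` value and the `startsWithBom` helper are hypothetical, made up for illustration; they are not part of libxmljs or of this PR.

```javascript
// Hypothetical response body whose first character is the BOM (U+FEFF).
const body = '\uFEFF<EntityDescriptor/>';

// Hypothetical helper: the BOM prints as nothing, but it is still
// the first UTF-16 code unit of the string.
function startsWithBom(text) {
  return text.charCodeAt(0) === 0xfeff;
}

// The BOM counts toward the string length even though it is invisible.
const visibleLength = '<EntityDescriptor/>'.length;
const actualLength = body.length;

// Encoded as UTF-8, the BOM becomes the three bytes EF BB BF.
const leadingBytes = Buffer.from(body, 'utf8').subarray(0, 3).toString('hex');

console.log(startsWithBom(body), actualLength - visibleLength, leadingBytes);
```

A check like this is only a way to confirm that a fetched body carries the BOM; it is not the fix proposed here.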

I noticed that libxmljs.parseXml and libxmljs.parseXmlAsync handle this invisible character differently.

libxmljs.parseXml succeeds with no issue because if the buffer is a string, it will default to an encoding of UTF-8.

However, libxmljs.parseXmlAsync fails because it only uses DEFAULT_XML_PARSE_OPTIONS.encoding. This is not defined, so parsing throws the following error: Error: Start tag expected, '<' not found.

The Change

libxmljs.parseXml and libxmljs.parseXmlAsync now both use the following as their encoding argument: options.encoding || DEFAULT_XML_PARSE_OPTIONS.encoding || (typeof buffer === "string" ? "UTF-8" : null), which should handle this invisible character case.
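The fallback order in that expression can be sketched in isolation. This is not the library's source: `resolveEncoding` is a hypothetical helper name, and the empty `DEFAULT_XML_PARSE_OPTIONS` object below is a stand-in that assumes, as described above, that no default encoding is defined.

```javascript
// Stand-in for the library's defaults; assumed to have no `encoding` set.
const DEFAULT_XML_PARSE_OPTIONS = {};

// Hypothetical helper mirroring the expression from the PR description:
// explicit option first, then the library default, then UTF-8 for strings.
function resolveEncoding(buffer, options = {}) {
  return (
    options.encoding ||
    DEFAULT_XML_PARSE_OPTIONS.encoding ||
    (typeof buffer === 'string' ? 'UTF-8' : null)
  );
}

// String input (with or without a leading BOM) falls back to UTF-8.
console.log(resolveEncoding('\uFEFF<EntityDescriptor/>'));
// Binary input gets no forced encoding.
console.log(resolveEncoding(Buffer.from('<EntityDescriptor/>')));
// An explicit caller-supplied encoding always wins.
console.log(resolveEncoding('<EntityDescriptor/>', { encoding: 'UTF-16' }));
```

Because both functions would evaluate the same expression, a string body is treated identically whether it is parsed synchronously or asynchronously.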

rchipka · 2025-04-08T16:51:19Z

Nice work, thanks!!

elb-notion · 2025-04-17T17:43:19Z

Hey @rchipka, will there be a new release any time soon? Would love to get some of the recent updates in without having to import the GitHub repo instead of the package.

fix: make parseXMLAsync and parseXML have the same defaults for encoding

bd662b5

rchipka merged commit 39092a9 into libxmljs:master Apr 8, 2025
0 of 3 checks passed

axel-capodaglio pushed a commit to axel-capodaglio/libxmljs that referenced this pull request Apr 11, 2025

fix: make parseXMLAsync and parseXML have the same defaults for encod…

29a4b55

…ing (libxmljs#668) (cherry picked from commit 39092a9)
