pixeebot bot commented Aug 2, 2025

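This codemod sets the parser parameter in calls to lxml.etree.parse and lxml.etree.fromstring when it is omitted or explicitly set to None (the default value). With parser=None, lxml falls back to its default parser, which is not hardened against malicious input, so code that parses untrusted XML is potentially vulnerable to entity expansion attacks and XML external entity (XXE) attacks. Passing an XMLParser created with resolve_entities=False tells lxml not to expand entity references.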

The changes look as follows:

  import lxml.etree
- lxml.etree.parse("path_to_file")
- lxml.etree.fromstring("xml_str")
+ lxml.etree.parse("path_to_file", parser=lxml.etree.XMLParser(resolve_entities=False))
+ lxml.etree.fromstring("xml_str", parser=lxml.etree.XMLParser(resolve_entities=False))
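To illustrate the effect of the change, here is a minimal sketch that parses the same document with the default parser and with the parser the codemod inserts. The payload `XML` and the helper name `parse_hardened` are hypothetical and not part of this pull request, and the sketch only exercises a harmless internal entity rather than a real attack.

```python
from lxml import etree

# Hypothetical payload: a harmless internal entity declared in the DOCTYPE.
XML = b'<!DOCTYPE r [<!ENTITY e "expanded">]><r>&e;</r>'


def parse_hardened(data: bytes):
    """Parse XML with entity resolution disabled, as in the codemod's output."""
    parser = etree.XMLParser(resolve_entities=False)
    return etree.fromstring(data, parser=parser)


# Default parser (parser omitted): the entity reference is expanded into text.
default_root = etree.fromstring(XML)

# Hardened parser: the entity reference is kept as a node and not expanded.
hardened_root = parse_hardened(XML)

print("default text: ", default_root.text)
print("hardened text:", hardened_root.text)
```

The same parser object can be passed to lxml.etree.parse for file input. Whether additional options are appropriate depends on how the project uses DTDs and network access, which this sketch does not assume.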

🧚🤖 Powered by Pixeebot

Feedback | Community | Docs | Codemod ID: pixee:python/safe-lxml-parsing

pixeebot bot commented Aug 10, 2025

I'm confident in this change, but I'm not a maintainer of this project. Do you see any reason not to merge it?

If this change was not helpful, or you have suggestions for improvements, please let me know!

pixeebot bot commented Aug 11, 2025

Just a friendly ping to remind you about this change. If there are concerns about it, we'd love to hear about them!