[CVE-2025-68463] Bio.Entrez.DataHandler, Bio.Entrez.read, Bio.Entrez.parse are vulnerable to XXE and SSRF #5109

@hartwork

Hi!

The code in class Bio.Entrez.DataHandler, which also powers the API functions Bio.Entrez.read and Bio.Entrez.parse, parses arbitrary XML content in such a way that any DTD and XSD URLs it contains are requested via an HTTP GET request through the stdlib function urllib.request.urlopen, unless the URL's basename (its last path component) matches a local file in the filesystem-based DTD cache (or XSD cache, respectively):

handle = self.open_xsd_file(os.path.basename(schema))
# if there is no local xsd file grab the url and parse the file
if not handle:
    handle = urlopen(schema)

biopython/Bio/Entrez/Parser.py, lines 1126 to 1131 in d07dde9:

handle = self.open_dtd_file(filename)
if not handle:
    # DTD is not available as a local file. Try accessing it through
    # the internet instead.
    try:
        handle = urlopen(url)
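
The lookup order in the two excerpts can be modelled without any network access. The sketch below is not Biopython's code: `plan_lookup`, the cache contents, and the URLs are hypothetical names chosen for illustration. It assumes only what is described above, namely that the basename of the URL is checked against a local cache and that a miss falls through to a remote fetch.

```python
import os


def plan_lookup(url, cached_names):
    """Return ("local", name) on a cache hit, else ("remote", url).

    Hypothetical model of the decision described above; it performs no I/O.
    """
    name = os.path.basename(url)  # last path component of the URL
    if name in cached_names:
        return ("local", name)
    # Cache miss: this is the branch where the real code calls urlopen().
    return ("remote", url)


# Illustrative cache contents and URLs (assumptions, not real data).
cache = {"esearch.dtd"}
print(plan_lookup("http://attacker.example/esearch.dtd", cache))
print(plan_lookup("http://10.0.0.1/admin/anything.dtd", cache))
```

In this model the second call is the SSRF case: any URL whose basename is not cached would be fetched exactly as given. The first call shows that only the basename, not the host, decides whether the cache is used.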

That call to urlopen lets an attacker craft XML that, when parsed, makes the parser issue arbitrary(!) HTTP GET requests, which can be used to, for example, access internal network resources or cause denial of service. In other words, the code is vulnerable to server-side request forgery (SSRF) via a form of "doctype XXE" (XML external entity attack).
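
To illustrate how an attacker-chosen URL reaches the handler at all, here is a minimal sketch using the stdlib expat bindings directly. It is not the attached demo and not Biopython's code: the document, the URL, and the function name `parse_recording_dtd_requests` are assumptions made for illustration, and the handler only records the system ID instead of fetching it. It also assumes an expat build with DTD support, which is what enables parameter entity parsing.

```python
import xml.parsers.expat as expat

# Hypothetical attacker-controlled document; the URL is illustrative only.
CRAFTED_XML = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE eSearchResult SYSTEM "http://internal.example/secret.dtd">\n'
    b"<eSearchResult/>\n"
)


def parse_recording_dtd_requests(data):
    """Parse data and return the external system IDs expat asked for."""
    requested = []

    def external_entity_ref_handler(context, base, system_id, public_id):
        # A vulnerable handler would call urlopen(system_id) at this point.
        # This one only records the URL and fetches nothing.
        requested.append(system_id)
        return 1  # non-zero tells expat to continue parsing

    parser = expat.ParserCreate()
    # The external DTD subset is only handed to the handler when
    # parameter entity parsing is enabled.
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)
    parser.ExternalEntityRefHandler = external_entity_ref_handler
    parser.Parse(data, True)
    return requested


print(parse_recording_dtd_requests(CRAFTED_XML))
```

The recorded list contains whatever URL the document author put into the doctype, which is why passing that value unchecked to urlopen amounts to SSRF.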

Demo: biopython_doctype_xxe_demo.py

The very same code in Biopython is also vulnerable to MITM (due to not enforcing TLS) and cache poisoning (..), but that is secondary and likely goes away for free once the SSRF issue is plugged properly.
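
One possible direction, sketched under stated assumptions and not a description of any actual fix, is to refuse every URL that is not HTTPS on an explicitly allowlisted host before fetching anything. The function name `is_fetch_allowed` and the contents of `ALLOWED_HOSTS` are hypothetical; a real allowlist would need review by the maintainers.

```python
from urllib.parse import urlsplit

# Assumption: an illustrative allowlist, not a vetted one.
ALLOWED_HOSTS = frozenset({"www.ncbi.nlm.nih.gov", "eutils.ncbi.nlm.nih.gov"})


def is_fetch_allowed(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for https:// URLs whose host is allowlisted."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URL (for example a broken IPv6 literal): refuse.
        return False
    if parts.scheme != "https":
        return False  # plain HTTP would leave the fetch open to MITM
    # hostname is lowercased and excludes any userinfo and port
    return parts.hostname in allowed_hosts


print(is_fetch_allowed("https://eutils.ncbi.nlm.nih.gov/eutils/dtd/20090526/esearch.dtd"))
print(is_fetch_allowed("http://10.0.0.1/admin/anything.dtd"))
```

A check like this covers only the initial URL. urlopen follows redirects by default, so a complete fix would also have to deal with redirects, or avoid network access for DTDs and schemas altogether.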

The official Python docs (>=3.15) now warn about problems like these: https://docs.python.org/3.15/library/pyexpat.html#xml.parsers.expat.xmlparser.ExternalEntityRefHandler

I hope that you will find a complete fix for this problem, so that users are safe from attacks through Biopython's XML parser.

Thanks and best, Sebastian

CC @mdehoon @peterjc
