Bug report
Bug description:
In short
_read_until_null (added a few days ago in dd94457) appends every byte it reads from a stream to a bytearray with no upper bound. Any gzip stream that sets FNAME or FCOMMENT together with any other flag (bypassing the early exit at the flag == FNAME fast path) and never emits a NUL byte will cause the header bytearray to grow until memory is exhausted. This happens before any decompression takes place, so existing mitigations for zip-bomb-like inputs that bound decompressed output size do not apply.
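To make the trigger condition concrete, here is a minimal sketch of such a header. The flag bits and the 10-byte fixed header layout come from RFC 1952; `build_header_without_nul` and `read_until_null_unbounded` are hypothetical names, and the loop is a reconstruction of the behaviour described above, not the actual CPython source. The stream is kept finite so the example terminates.

```python
import io
import struct

# Flag bits as defined in RFC 1952.
FHCRC, FNAME = 0x02, 0x08


def build_header_without_nul(name_len: int) -> bytes:
    """Hypothetical helper: a gzip header whose FNAME field is never terminated."""
    # Magic (1f 8b), CM=8 (deflate), FLG, MTIME (4 bytes), XFL, OS=255 (unknown).
    # FNAME is combined with FHCRC so that flag != FNAME.
    fixed = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, FNAME | FHCRC, 0, 0, 255)
    return fixed + b"A" * name_len


def read_until_null_unbounded(fp, append_to: bytearray) -> None:
    """Assumed shape of the unbounded loop: stops only on NUL or EOF."""
    while True:
        s = fp.read(1)
        append_to += s
        if not s or s == b"\x00":
            return


fp = io.BytesIO(build_header_without_nul(100_000))
header = bytearray(fp.read(10))        # fixed-size part of the header
read_until_null_unbounded(fp, header)  # buffers every remaining byte
print(len(header))
```

With a real attacker-controlled stream (for example a socket or pipe that never ends), the same loop would keep appending until memory runs out.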
Proposed fix
Cap _read_until_null at a reasonable per-field limit and raise BadGzipFile if exceeded:
    _MAX_GZIP_HEADER_FIELD_SIZE = 65536  # consistent with max FEXTRA length (16-bit)

    def _read_until_null(fp, append_to: bytearray) -> None:
        for _ in range(_MAX_GZIP_HEADER_FIELD_SIZE + 1):
            s = fp.read(1)
            append_to += s
            if not s or s == b'\000':
                return
        raise BadGzipFile('Header field exceeds maximum size '
                          f'({_MAX_GZIP_HEADER_FIELD_SIZE} bytes)')
(The limit was picked to approximate the maximum FEXTRA field size, which is 65535 bytes because of its 16-bit length prefix, so that all variable-length header fields have comparable bounds.)
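As a quick check of how the cap behaves at the boundary, the following standalone sketch copies the proposed loop into a function with a configurable limit. `read_until_null_capped`, `MAX_FIELD` and `rejects` are hypothetical names used only for illustration; `BadGzipFile` is the existing exception from the `gzip` module.

```python
import io
from gzip import BadGzipFile

MAX_FIELD = 65536  # mirrors the proposed _MAX_GZIP_HEADER_FIELD_SIZE


def read_until_null_capped(fp, append_to: bytearray, limit: int = MAX_FIELD) -> None:
    """Standalone copy of the proposed loop, with the limit as a parameter."""
    # limit + 1 reads: up to `limit` field bytes plus the terminating NUL.
    for _ in range(limit + 1):
        s = fp.read(1)
        append_to += s
        if not s or s == b"\x00":
            return
    raise BadGzipFile(f"Header field exceeds maximum size ({limit} bytes)")


def rejects(payload: bytes, limit: int) -> bool:
    """Return True if the capped reader raises BadGzipFile for this payload."""
    try:
        read_until_null_capped(io.BytesIO(payload), bytearray(), limit)
    except BadGzipFile:
        return True
    return False


# A normally terminated field is read up to and including its NUL byte.
ok = bytearray()
read_until_null_capped(io.BytesIO(b"name.txt\x00rest"), ok)

# With a small limit, a field of exactly `limit` bytes passes; one more fails.
at_limit = rejects(b"A" * 16 + b"\x00", 16)
over_limit = rejects(b"A" * 17 + b"\x00", 16)
```

Note that on rejection the buffer has grown by at most `limit + 1` bytes, which is the bound the fix is meant to guarantee.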
I have a branch in my fork that adds this mitigation.
CPython versions tested on:
CPython main branch
Operating systems tested on:
macOS
Linked PRs