Skip to content

gh-151295: Fix use-after-free in bytes.join()/bytearray.join() via re-entrant __buffer__#151296

Merged
serhiy-storchaka merged 3 commits into
python:mainfrom
tonghuaroot:fix-bytesjoin-reentrant-buffer
Jun 11, 2026
Merged

gh-151295: Fix use-after-free in bytes.join()/bytearray.join() via re-entrant __buffer__#151296
serhiy-storchaka merged 3 commits into
python:mainfrom
tonghuaroot:fix-bytesjoin-reentrant-buffer

Conversation

@tonghuaroot

Copy link
Copy Markdown
Contributor

What

bytes.join() and bytearray.join() could crash with a use-after-free when a
non-bytes item's __buffer__() dropped the last reference to that item by
mutating the sequence being joined.

STRINGLIB(bytes_join) (Objects/stringlib/join.h) borrows each item from the
PySequence_Fast view. The exact-bytes fast path keeps the item alive with
Py_NewRef(item), but the general buffer path called
PyObject_GetBuffer(item, ...) on the borrowed item without owning a
reference. PyObject_GetBuffer() runs item.__buffer__() (PEP 688), which can
execute arbitrary Python; if that callback drops the item's last reference
(e.g. L.clear()), the item is freed while the buffer machinery is still
dereferencing it.

Why

This is the same class of re-entrant __buffer__ mutation bug as gh-143988
(sendmsg/recvmsg_into). The end-of-loop "sequence changed size" recheck
does not catch it because the use-after-free happens inside the current
iteration's PyObject_GetBuffer() call.

Unlike the sendmsg fix, this does not snapshot the whole sequence up front.
bytes_join already has a "sequence changed size during iteration" recheck
after each item, so the minimal fix is just to keep the current item alive
across its own PyObject_GetBuffer() call (Py_INCREF/Py_DECREF); the
existing recheck then reports the mutation as a RuntimeError. Snapshotting
the whole sequence would be a larger change and would alter that error
behaviour, so the targeted reference fix is preferred.

This is a crash-robustness fix, not a security vulnerability: the caller has to
pass an object with a malicious __buffer__ into its own join() call, so it
crashes the caller's own process with no trust boundary crossed.

Fix

Hold a reference to item across PyObject_GetBuffer() in the buffer path
(Py_INCREF before the call, Py_DECREF on both the success and the failure
path), mirroring the Py_NewRef(item) already used by the exact-bytes fast
path.

Testing

BaseBytesTest.test_join_reentrant_buffer_mutation (runs for both bytes and
bytearray) exercises a __buffer__ that mutates the joined list, for empty
and non-empty separators. One case clears the list (changing its length) and
asserts a RuntimeError is raised instead of crashing. A second case replaces
the item in place with an object: the list length is unchanged, so the
"sequence changed size" recheck cannot fire and only keeping the item alive
across __buffer__() avoids the use-after-free; it asserts the join still
returns the expected result. A benign __buffer__ negative control (also run
for both types and separators) verifies normal joins still produce the correct
result.

Verified locally on macOS arm64 against a fresh build (both --with-pydebug
and a release build): without the patch the new test fatally crashes the
interpreter (on a debug build the fault is in bufferwrapper_releasebuf,
confirming a use-after-free); with the patch python -m test test_bytes
passes (312 tests) with no reference leaks under -R 3:3.

This affects 3.12+, since PEP 688 made __buffer__ overridable in pure
Python, so the fix should be backported to the maintained 3.13 and 3.14
branches.

AI usage disclosure

I used Claude to help draft this change (the C fix, the regression test, and
this description). I reviewed every line, ran the reproducer and the full
test_bytes suite locally (debug and release builds), and confirmed the fix
and the test behave as described.

Comment thread Lib/test/test_bytes.py Outdated
seq = make_seq(replace)
self.assertEqual(sep.join(seq), sep.join([b'a', b'x', b'c']))

# A benign __buffer__() that does not mutate joins normally.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to the issue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, removed it.

Comment thread Lib/test/test_bytes.py Outdated
with self.assertRaises(TypeError):
dot_join([memoryview(b"ab"), "cd", b"ef"])

def test_join_reentrant_buffer_mutation(self):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is not reentrant mutation, but concurrent mutation. The GIL can be released due to execution of __buffer__, and the sequence can be concurrently mutated in other thread. This is more probably to happen in real world than intentional reentrant mutation. Reentrant mutation is just a simple way to imitate concurrent mutation. I suggest to focus on concurrent mutation in comments and the test name.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — renamed to test_join_concurrent_buffer_mutation and reworded the comment accordingly. Thanks.

@serhiy-storchaka serhiy-storchaka left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. 👍

@serhiy-storchaka serhiy-storchaka enabled auto-merge (squash) June 11, 2026 07:26
@serhiy-storchaka serhiy-storchaka added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes needs backport to 3.15 pre-release feature fixes, bugs and security fixes labels Jun 11, 2026
@serhiy-storchaka serhiy-storchaka merged commit 84a322a into python:main Jun 11, 2026
64 checks passed
@miss-islington-app

Copy link
Copy Markdown

Thanks @tonghuaroot for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14, 3.15.
🐍🍒⛏🤖

@bedevere-app

bedevere-app Bot commented Jun 11, 2026

Copy link
Copy Markdown

GH-151304 is a backport of this pull request to the 3.15 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label Jun 11, 2026
@bedevere-app

bedevere-app Bot commented Jun 11, 2026

Copy link
Copy Markdown

GH-151305 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.14 bugs and security fixes label Jun 11, 2026
@bedevere-app

bedevere-app Bot commented Jun 11, 2026

Copy link
Copy Markdown

GH-151306 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.13 bugs and security fixes label Jun 11, 2026
serhiy-storchaka pushed a commit that referenced this pull request Jun 11, 2026
… via re-entrant __buffer__ (GH-151296) (GH-151304)

(cherry picked from commit 84a322a)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
serhiy-storchaka pushed a commit that referenced this pull request Jun 11, 2026
… via re-entrant __buffer__ (GH-151296) (GH-151305)

(cherry picked from commit 84a322a)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
serhiy-storchaka pushed a commit that referenced this pull request Jun 11, 2026
… via re-entrant __buffer__ (GH-151296) (GH-151306)

(cherry picked from commit 84a322a)

Co-authored-by: tonghuaroot (童话) <tonghuaroot@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants