Commit 9e2e5b6

fix(tests): handle bytes_iterator + never leave an exhausted body
Follow-up to 8e08272. The previous attempt at coalescing iterable request bodies bailed out (``return`` without writing ``request.body``) whenever it could not classify the chunk type. That was the wrong failure mode for one critical case: vcrpy sometimes presents the body as ``iter(some_bytes)``, whose Python type is ``bytes_iterator`` and which yields ``int`` byte values (0-255), not byte chunks. The old code saw an ``int`` chunk, hit the ``else: return`` branch, and left ``request.body`` pointing at the now-exhausted iterator.

The post-fix diagnostic run made this loud:

    [vcr-safe-body-matcher] request body mismatch
    body[a]: type='bytes_iterator' length=unknown sha256=N/A
    body[b]: type='bytes_iterator' length=unknown sha256=N/A

Every async image-edit test then ballooned from entries=2 to entries=10 in that single CI run. The exhausted iterator meant the live multipart upload went out as an empty body, OpenAI returned 400, the SDK and flaky retries fired, each retry got a fresh iterator that my hook exhausted again, and ``new_episodes`` recorded each failed attempt as a new cassette episode.

This patch:

* Recognizes ``bytes_iterator`` (chunks are ``int``) and reconstructs the buffer via ``bytes(chunks)``.
* Keeps the existing ``list_iterator``-over-bytes-chunks handling via ``b"".join(...)``.
* **Always writes a bytes value back to ``request.body`` after consuming the iterator.** If the chunk shape is unrecognized, ``request.body`` is set to ``b""`` rather than left as an exhausted iterator. That is wrong in the sense of "we lost the body" but right in the sense of "the failure mode is now visible (live API call sends empty body and fails fast) instead of invisible (corrupt cassette grows silently)". Combined with the matcher diagnostic, any future regression in this code path will surface in the CI log immediately.
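For reference, the two iterator shapes named above can be reproduced with the standard library alone. This is a minimal sketch, not code from the patch; the ``payload`` value and variable names are illustrative.

```python
# Illustrative payload; any bytes object behaves the same way.
payload = b"\x89PNG"

# iter() over a bytes object gives a ``bytes_iterator`` that yields ints.
byte_iter = iter(payload)
byte_iter_type = type(byte_iter).__name__
int_chunks = list(byte_iter)

# bytes() over a list of ints in range(256) is the inverse operation.
rebuilt_from_ints = bytes(int_chunks)

# iter() over a list of bytes chunks gives a ``list_iterator`` that yields bytes.
chunk_iter = iter([b"\x89P", b"NG"])
chunk_iter_type = type(chunk_iter).__name__
rebuilt_from_chunks = b"".join(chunk_iter)

print(byte_iter_type, int_chunks, rebuilt_from_ints == payload)
print(chunk_iter_type, rebuilt_from_chunks == payload)
```

The chunk type of the first element is enough to tell the two shapes apart, which is the check the patch relies on.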
Local verification covers ``bytes_iterator``, ``list_iterator`` over bytes chunks, generator over bytes chunks, empty iterator, already-bytes (idempotent), identical-content iterator equality in the matcher (now matches), and differing-content iterator inequality (still raises).
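The body-shape cases in that list could be exercised roughly as follows. This is a hedged sketch: ``materialize`` is a hypothetical stand-in that mirrors the logic in the diff, the early return for bodies that are already bytes is an assumption (that part of the real helper is outside the hunk), and ``SimpleNamespace`` stands in for the vcrpy request object. The matcher equality checks are not reproduced here.

```python
from types import SimpleNamespace


def materialize(request):
    """Hypothetical stand-in mirroring the patched logic, not the real helper."""
    body = request.body
    # Assumption: already-materialized bodies are left untouched.
    if body is None or isinstance(body, (bytes, bytearray, str)):
        return
    try:
        chunks = list(body)
    except TypeError:
        return
    # From here on the iterator is exhausted, so always write bytes back.
    out = b""
    if chunks:
        first = chunks[0]
        try:
            if isinstance(first, int):
                out = bytes(chunks)
            elif isinstance(first, (bytes, bytearray)):
                out = b"".join(bytes(c) for c in chunks)
            elif isinstance(first, str):
                out = "".join(chunks).encode("utf-8")
        except (TypeError, ValueError):
            out = b""
    request.body = out


# Each case maps a label to (input body, expected body after materializing).
cases = {
    "bytes_iterator": (iter(b"abc"), b"abc"),
    "list_iterator": (iter([b"ab", b"c"]), b"abc"),
    "generator": ((c for c in [b"ab", b"c"]), b"abc"),
    "empty iterator": (iter([]), b""),
    "already bytes": (b"abc", b"abc"),
    "unrecognized chunk": (iter([1.5]), b""),
}

results = {}
for name, (body, expected) in cases.items():
    req = SimpleNamespace(body=body)
    materialize(req)
    results[name] = req.body
    assert req.body == expected, name
```

The last case is the deliberate trade-off described above: an unrecognized chunk shape yields ``b""`` rather than an exhausted iterator.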
1 parent 8e08272 commit 9e2e5b6

1 file changed

Lines changed: 35 additions & 11 deletions

tests/_vcr_conftest_common.py

@@ -638,18 +638,42 @@ def _materialize_iterable_body(request) -> None:
         chunks = list(body)
     except TypeError:
         return
-    out = bytearray()
-    for chunk in chunks:
-        if isinstance(chunk, (bytes, bytearray)):
-            out.extend(chunk)
-        elif isinstance(chunk, str):
-            out.extend(chunk.encode("utf-8"))
-        else:
-            # Heterogeneous, non-text/binary chunk - bail rather than
-            # silently corrupt the body.
-            return
+
+    # IMPORTANT: ``list(body)`` has already exhausted the original
+    # iterator. From this point we MUST write something bytes-shaped
+    # back to ``request.body`` -- bailing out and leaving the body as
+    # an exhausted iterator makes the next access (cassette
+    # serialization, retry replay, or the actual httpx send) see an
+    # empty stream. In a previous attempt at this fix the bail path
+    # was taken for ``bytes_iterator`` bodies (chunks were ints) and
+    # the live send ended up with an empty multipart upload, which
+    # the SDK retried until the cassette ballooned to ~10 episodes
+    # per test. Fall through to ``out = b""`` rather than ``return``
+    # so an unrecognized chunk shape still leaves a stable body.
+    out = b""
+    if chunks:
+        first = chunks[0]
+        if isinstance(first, int):
+            # ``iter(b"...")`` yields integer byte values (its type
+            # name is ``bytes_iterator``). ``bytes(list_of_ints)`` is
+            # the inverse and reconstructs the original buffer.
+            try:
+                out = bytes(chunks)
+            except (TypeError, ValueError):
+                out = b""
+        elif isinstance(first, (bytes, bytearray)):
+            try:
+                out = b"".join(c if isinstance(c, bytes) else bytes(c) for c in chunks)
+            except (TypeError, ValueError):
+                out = b""
+        elif isinstance(first, str):
+            try:
+                out = "".join(chunks).encode("utf-8")
+            except (TypeError, ValueError):
+                out = b""
+
     try:
-        request.body = bytes(out)
+        request.body = out
     except (AttributeError, TypeError):
         pass
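The exhaustion hazard that the new comment describes can be shown in isolation. This is a minimal sketch under stated assumptions: ``SimpleNamespace`` is a stand-in for the real request object and the payload is illustrative.

```python
from types import SimpleNamespace

# Stand-in request whose body is a one-shot ``bytes_iterator``.
request = SimpleNamespace(body=iter(b"multipart-data"))

# Consuming the iterator once, as ``list(body)`` does in the helper.
chunks = list(request.body)

# Without a write-back, any later reader sees an empty stream.
leftover = bytes(request.body)

# Writing a bytes value back makes the body stable across repeated reads.
request.body = bytes(chunks)
first_read = bytes(request.body)
second_read = bytes(request.body)

print(leftover, first_read, second_read)
```

Here ``leftover`` is empty, which corresponds to the empty multipart upload described in the commit message, while both reads after the write-back return the full payload.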
