Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

@ShaharNaveh ShaharNaveh commented Dec 16, 2025

Closes #2391

Summary by CodeRabbit

  • Tests
    • Added test coverage for decoding operations with surrogateescape error handling using system filesystem encoding.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

Walkthrough

A test case is added to verify that decoding bytes with the 'surrogateescape' error handler functions correctly when using the system filesystem encoding, validating a bug fix for the previously unsupported error handler.

Changes

Cohort / File(s) Change Summary
Test Addition
extra_tests/snippets/builtin_bytes.py
Adds test assertion validating decoding of b"-\xff" with system filesystem encoding and 'surrogateescape' error handler, expecting result "-\udcff"

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

  • Single file modification
  • Straightforward test assertion addition
  • No complex logic or control flow changes
  • No new public APIs or exported entities

Poem

🐰 A surrogate escape, now caught in our test,
Bytes decode where they once would protest,
-\xff finds its way through the handler so fine,
To -\udcff—a fix worth this line! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR adds a regression test for the surrogateescape error handler during bytes decoding, but does not implement the handler itself—only tests it. The linked issue requires both implementation and regression tests. Implement the 'surrogateescape' error handler in RustPython's codec system to match CPython/PyPy semantics, then add the regression test.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add regression test' directly describes the main change—adding a test case for the surrogateescape decoding behavior. It is concise and clearly related to the changeset.
Out of Scope Changes check ✅ Passed The test addition is directly scoped to the linked issue #2391, which explicitly calls for regression tests to cover the surrogateescape decoding behavior.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between adc2b0d and 5f8b97a.

📒 Files selected for processing (1)
  • extra_tests/snippets/builtin_bytes.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • extra_tests/snippets/builtin_bytes.py
🧬 Code graph analysis (1)
extra_tests/snippets/builtin_bytes.py (1)
crates/vm/src/stdlib/sys.rs (1)
  • getfilesystemencoding (494-496)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (2)
extra_tests/snippets/builtin_bytes.py (2)

1-1: Import is necessary and appropriate.

The sys import is required for accessing sys.getfilesystemencoding() in the regression test below.


616-617: Regression test for gh-2391 is appropriate and well-placed.

The test directly validates that decoding bytes with the 'surrogateescape' error handler produces the expected surrogate code point (\udcff for \xff), matching CPython and PyPy semantics. The test is properly positioned alongside other decode assertions and follows the existing test pattern.

Verify that the Rust-side implementation of the 'surrogateescape' error handler is complete and functional, as this Python test alone will not fix the underlying issue. The implementation should ensure that invalid UTF-8 sequences are replaced with surrogate code points in the Unicode Private Use Area when this error handler is used.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

Code has been automatically formatted

The code in this PR has been formatted using cargo fmt --all.
Please pull the latest changes before pushing again:

git pull origin regr-tests-bytes-decode

@ShaharNaveh ShaharNaveh mentioned this pull request Dec 16, 2025
@youknowone youknowone merged commit 9d2dd17 into RustPython:main Dec 17, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] decoding bytes error

2 participants