Bytecode parity by youknowone · Pull Request #7536 · RustPython/RustPython

youknowone · 2026-03-30T05:02:47Z

Summary by CodeRabbit

Bug Fixes
- Fixed assertion message compilation behavior
- Improved error propagation in constant folding
- Enhanced exception handler control flow
Refactor
- Optimized f-string compilation with merged literal handling
- Improved chained comparison bytecode generation
- Enhanced stack depth and control flow analysis
Tests
- Added bytecode structure and control flow validation tests

… genexpr StopIteration wrapper - split_blocks_at_jumps: split blocks at branch points so each has one exit - jump_threading: thread jumps through single-jump blocks (flowgraph.c jump_thread) - Backward conditional jump normalization: invert and create NOT_TAKEN+JUMP block - Follow empty blocks in jump-to-return optimization (next_nonempty_block) - Add PEP 479 StopIteration handler to compile_comprehension for generators

- Add Slice variant to ConstantData and BorrowedConstant - Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...)) - Marshal serialization/deserialization for Slice type - Box::leak in borrow_obj_constant for PySlice roundtrip

Prepare infrastructure for exit block duplication optimization. Currently disabled pending stackdepth integration.

coderabbitai · 2026-03-30T05:02:56Z

📝 Walkthrough

Walkthrough

This PR refactors bytecode compilation in the code generator, optimizing assertion message handling, rewriting f-string compilation to merge adjacent literals, improving boolean jump code generation for chained comparisons, strengthening constant folding with error propagation, guarding super-call optimizations for extended-call syntax, and revising IR-level control flow normalization for line-number resolution and small-block inlining.

Changes

Cohort / File(s)	Summary
Assertion and Call Compilation `crates/codegen/src/compile.rs`	Optimized assertion message compilation to emit `PushNull` and argument count selectively based on message presence; added `call_uses_ex_call` guard to disable super-call optimization when arguments resemble `CALL_FUNCTION_EX` usage.
Boolean Jump and Comparison Compilation `crates/codegen/src/compile.rs`	Enhanced `compile_jump_if` with source-range tracking; introduced `compile_jump_if_compare` for bytecode generation of chained comparisons using conditional jumps; refactored match arms to ensure consistent `CompileResult` returns and source-range restoration.
Slice Constant Folding `crates/codegen/src/compile.rs`	Refactored `try_fold_constant_slice` to return `CompileResult<bool>`, propagating integer-conversion errors and updating call sites to use error operator (`?`).
F-String Compilation Rewrite `crates/codegen/src/compile.rs`	Replaced element-count/build-string strategy with "pending literal merging" where adjacent literal segments accumulate into a single `Wtf8Buf`, emitted when interpolation occurs or completion; added `compile_fstring_literal_value`, `compile_fstring_part_literal_value`, `compile_fstring_part_into`, `compile_fstring_elements_into`, and `finish_fstring` helpers to implement new control flow and preserve surrogate behavior.
Constant Folding Refactor and Tests `crates/codegen/src/compile.rs`	Added `try_fold_constant_expr` helper for expression-level constant folding; expanded test coverage for scope-exit line numbers, `CALL_FUNCTION_EX` attribute loading, assertion call shaping, chained comparison cleanup, constant slice bounds, exception cleanup inlining, and merged f-string literals.
IR Line-Number Resolution and Block Inlining `crates/codegen/src/ir.rs`	Introduced `inline_small_or_no_lineno_blocks` and `resolve_line_numbers` phases to inline small blocks and propagate line numbers post-jump-normalization; refactored `max_stackdepth` stack-start computation and jump-normalization return inlining conditions; adjusted implicit-return duplication to compute fallthrough via `next_nonempty_block`.
Test Enum Extensions `crates/jit/tests/common.rs`	Extended `StackValue` enum with `Slice` and `Frozenset` variants; updated `From<ConstantData>` conversion to destructure slice elements and convert frozenset element collections.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Bytecode parity #7535: Overlapping changes in compile.rs (slice constant-folding) and ir.rs (jump/line-number normalization and CFG transformations).
Compiler parity: docstring dedent, StopIteration wrapper, constant folding #7530: Related compiler jump and constant-folding logic modifications in compile.rs and ir.rs affecting jump-compare and constant-folding code paths.
Bytecode parity phase 3 #7514: Overlapping CALL_FUNCTION_EX-style argument handling and unconditional PushNull removal in call-emission paths.

Suggested reviewers

ShaharNaveh

Poem

🐰 Hops with glee through bytecode streams,
F-strings merge in pending dreams,
Jumps grow wise with comparisons chained,
Constants fold as errors explained—
Line numbers bloom where blocks align! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 79.66% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Bytecode parity' is vague and generic, using a non-descriptive term that does not clearly convey the specific changes made in the changeset.	Consider a more specific title that highlights the main changes, such as 'Improve bytecode compilation for assertions, f-strings, and jumps' or 'Refactor bytecode compiler with assertion/f-string optimizations and jump improvements'.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2026-03-30T06:07:35Z

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

[x] test: cpython/Lib/test/test_compile.py (TODO: 27)
[x] test: cpython/Lib/test/test_compiler_assemble.py
[x] test: cpython/Lib/test/test_compiler_codegen.py
[x] test: cpython/Lib/test/test_peepholer.py (TODO: 24)

dependencies:

dependent tests: (no tests depend on compile)

Legend:

[+] path exists in CPython
[x] up-to-date, [ ] outdated

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

crates/codegen/src/compile.rs (1)
6786-6867: Extract the condition-dependent jump emission.

Lines 6804-6818 and 6837-6851 choose the same PopJumpIf* opcode twice. A tiny helper keeps the bytecode shape in one place and makes future parity tweaks less error-prone.

As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 6786 - 6867, The
compile_jump_if_compare function duplicates emission of PopJumpIfTrue vs
PopJumpIfFalse in two places; refactor by computing the appropriate instruction
once based on the boolean condition (e.g., determine an opcode or small helper
like emit_pop_jump_if(condition, delta)) and then call emit! with that chosen
instruction and target_block in both spots (replace the two conditional emit!
blocks around the final compare and the single-mid-cleanup branch). Ensure the
helper/variable is used in both locations inside compile_jump_if_compare so the
bytecode shape is controlled from one place.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 9052-9059: The current compile_fstring_part_literal_value reparses
the entire ast::StringLiteral range (via source_file.slice(string.range)) which
can span multiple adjacent literal tokens and thus misdecode in the
surrogate-preservation path; change this to reparse each constituent literal
token individually and concatenate their parsed results: when
string.value.contains(char::REPLACEMENT_CHARACTER) iterate the literal tokens
that form this ast::StringLiteral (use each token's range / slice from
source_file.slice(token.range)), call crate::string_parser::parse_string_literal
on each token slice with string.flags.into(), collect/concatenate the resulting
buffers, and return the combined Wtf8Buf instead of reparsing the whole range at
once.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 6786-6867: The compile_jump_if_compare function duplicates
emission of PopJumpIfTrue vs PopJumpIfFalse in two places; refactor by computing
the appropriate instruction once based on the boolean condition (e.g., determine
an opcode or small helper like emit_pop_jump_if(condition, delta)) and then call
emit! with that chosen instruction and target_block in both spots (replace the
two conditional emit! blocks around the final compare and the single-mid-cleanup
branch). Ensure the helper/variable is used in both locations inside
compile_jump_if_compare so the bytecode shape is controlled from one place.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9842e9fe-d3c7-43f6-91ff-60550c5cff91

📥 Commits

Reviewing files that changed from the base of the PR and between 3706c53 and c717101.

⛔ Files ignored due to path filters (1)

Lib/test/test_peepholer.py is excluded by !Lib/**

📒 Files selected for processing (3)

crates/codegen/src/compile.rs
crates/codegen/src/ir.rs
crates/jit/tests/common.rs

coderabbitai · 2026-03-30T07:22:57Z

+    fn compile_fstring_part_literal_value(&self, string: &ast::StringLiteral) -> Wtf8Buf {
+        if string.value.contains(char::REPLACEMENT_CHARACTER) {
+            let source = self.source_file.slice(string.range);
+            crate::string_parser::parse_string_literal(source, string.flags.into()).into()
+        } else {
+            string.value.to_string().into()
+        }
+    }


⚠️ Potential issue | 🟡 Minor

Reparse each constituent literal token here.

Line 9054 reparses the combined ast::StringLiteral range with one flag set, but this node can span multiple adjacent literals. In the surrogate-preservation path that can misdecode or drop later pieces, so split plain-string parts next to an f-string can produce the wrong merged literal.

💡 Proposed fix

fn compile_fstring_part_literal_value(&self, string: &ast::StringLiteral) -> Wtf8Buf { if string.value.contains(char::REPLACEMENT_CHARACTER) { - let source = self.source_file.slice(string.range); - crate::string_parser::parse_string_literal(source, string.flags.into()).into() + string + .value + .iter() + .map(|lit| { + let source = self.source_file.slice(lit.range); + crate::string_parser::parse_string_literal(source, lit.flags.into()) + }) + .collect() } else { string.value.to_string().into() } }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 9052 - 9059, The current compile_fstring_part_literal_value reparses the entire ast::StringLiteral range (via source_file.slice(string.range)) which can span multiple adjacent literal tokens and thus misdecode in the surrogate-preservation path; change this to reparse each constituent literal token individually and concatenate their parsed results: when string.value.contains(char::REPLACEMENT_CHARACTER) iterate the literal tokens that form this ast::StringLiteral (use each token's range / slice from source_file.slice(token.range)), call crate::string_parser::parse_string_literal on each token slice with string.flags.into(), collect/concatenate the resulting buffers, and return the combined Wtf8Buf instead of reparsing the whole range at once.

youknowone added 3 commits March 30, 2026 13:50

Add ConstantData::Slice and constant slice folding

8041f86

- Add Slice variant to ConstantData and BorrowedConstant - Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...)) - Marshal serialization/deserialization for Slice type - Box::leak in borrow_obj_constant for PySlice roundtrip

Add duplicate_exits_without_lineno (disabled) and Block: Clone

3f9307f

Prepare infrastructure for exit block duplication optimization. Currently disabled pending stackdepth integration.

Improve codegen bytecode parity

c717101

youknowone force-pushed the bytecode-parity branch from 688d815 to c717101 Compare March 30, 2026 06:06

youknowone marked this pull request as ready for review March 30, 2026 07:09

coderabbitai Bot reviewed Mar 30, 2026

View reviewed changes

youknowone merged commit 1c39fdb into RustPython:main Mar 30, 2026
16 checks passed

youknowone deleted the bytecode-parity branch March 30, 2026 09:51

coderabbitai Bot mentioned this pull request Mar 30, 2026

Bytecode parity #7541

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bytecode parity#7536

Bytecode parity#7536
youknowone merged 4 commits into
RustPython:mainfrom
youknowone:bytecode-parity

youknowone commented Mar 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Mar 30, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

github-actions Bot commented Mar 30, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Mar 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

youknowone commented Mar 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Mar 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

github-actions Bot commented Mar 30, 2026

📦 Library Dependencies

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Mar 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

youknowone commented Mar 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Mar 30, 2026 •

edited

Loading