Conversation
… genexpr StopIteration wrapper - split_blocks_at_jumps: split blocks at branch points so each has one exit - jump_threading: thread jumps through single-jump blocks (flowgraph.c jump_thread) - Backward conditional jump normalization: invert and create NOT_TAKEN+JUMP block - Follow empty blocks in jump-to-return optimization (next_nonempty_block) - Add PEP 479 StopIteration handler to compile_comprehension for generators
- Add Slice variant to ConstantData and BorrowedConstant - Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...)) - Marshal serialization/deserialization for Slice type - Box::leak in borrow_obj_constant for PySlice roundtrip
Prepare infrastructure for exit block duplication optimization. Currently disabled pending stackdepth integration.
📝 WalkthroughWalkthroughThis PR refactors bytecode compilation in the code generator, optimizing assertion message handling, rewriting f-string compilation to merge adjacent literals, improving boolean jump code generation for chained comparisons, strengthening constant folding with error propagation, guarding super-call optimizations for extended-call syntax, and revising IR-level control flow normalization for line-number resolution and small-block inlining. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
688d815 to
c717101
Compare
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] test: cpython/Lib/test/test_compile.py (TODO: 27) dependencies: dependent tests: (no tests depend on compile) Legend:
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)
6786-6867: Extract thecondition-dependent jump emission.Lines 6804-6818 and 6837-6851 choose the same
PopJumpIf*opcode twice. A tiny helper keeps the bytecode shape in one place and makes future parity tweaks less error-prone.As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 6786 - 6867, The compile_jump_if_compare function duplicates emission of PopJumpIfTrue vs PopJumpIfFalse in two places; refactor by computing the appropriate instruction once based on the boolean condition (e.g., determine an opcode or small helper like emit_pop_jump_if(condition, delta)) and then call emit! with that chosen instruction and target_block in both spots (replace the two conditional emit! blocks around the final compare and the single-mid-cleanup branch). Ensure the helper/variable is used in both locations inside compile_jump_if_compare so the bytecode shape is controlled from one place.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 9052-9059: The current compile_fstring_part_literal_value reparses
the entire ast::StringLiteral range (via source_file.slice(string.range)) which
can span multiple adjacent literal tokens and thus misdecode in the
surrogate-preservation path; change this to reparse each constituent literal
token individually and concatenate their parsed results: when
string.value.contains(char::REPLACEMENT_CHARACTER) iterate the literal tokens
that form this ast::StringLiteral (use each token's range / slice from
source_file.slice(token.range)), call crate::string_parser::parse_string_literal
on each token slice with string.flags.into(), collect/concatenate the resulting
buffers, and return the combined Wtf8Buf instead of reparsing the whole range at
once.
---
Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 6786-6867: The compile_jump_if_compare function duplicates
emission of PopJumpIfTrue vs PopJumpIfFalse in two places; refactor by computing
the appropriate instruction once based on the boolean condition (e.g., determine
an opcode or small helper like emit_pop_jump_if(condition, delta)) and then call
emit! with that chosen instruction and target_block in both spots (replace the
two conditional emit! blocks around the final compare and the single-mid-cleanup
branch). Ensure the helper/variable is used in both locations inside
compile_jump_if_compare so the bytecode shape is controlled from one place.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 9842e9fe-d3c7-43f6-91ff-60550c5cff91
⛔ Files ignored due to path filters (1)
Lib/test/test_peepholer.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/jit/tests/common.rs
| fn compile_fstring_part_literal_value(&self, string: &ast::StringLiteral) -> Wtf8Buf { | ||
| if string.value.contains(char::REPLACEMENT_CHARACTER) { | ||
| let source = self.source_file.slice(string.range); | ||
| crate::string_parser::parse_string_literal(source, string.flags.into()).into() | ||
| } else { | ||
| string.value.to_string().into() | ||
| } | ||
| } |
There was a problem hiding this comment.
Reparse each constituent literal token here.
Line 9054 reparses the combined ast::StringLiteral range with one flag set, but this node can span multiple adjacent literals. In the surrogate-preservation path that can misdecode or drop later pieces, so split plain-string parts next to an f-string can produce the wrong merged literal.
💡 Proposed fix
fn compile_fstring_part_literal_value(&self, string: &ast::StringLiteral) -> Wtf8Buf {
if string.value.contains(char::REPLACEMENT_CHARACTER) {
- let source = self.source_file.slice(string.range);
- crate::string_parser::parse_string_literal(source, string.flags.into()).into()
+ string
+ .value
+ .iter()
+ .map(|lit| {
+ let source = self.source_file.slice(lit.range);
+ crate::string_parser::parse_string_literal(source, lit.flags.into())
+ })
+ .collect()
} else {
string.value.to_string().into()
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/compile.rs` around lines 9052 - 9059, The current
compile_fstring_part_literal_value reparses the entire ast::StringLiteral range
(via source_file.slice(string.range)) which can span multiple adjacent literal
tokens and thus misdecode in the surrogate-preservation path; change this to
reparse each constituent literal token individually and concatenate their parsed
results: when string.value.contains(char::REPLACEMENT_CHARACTER) iterate the
literal tokens that form this ast::StringLiteral (use each token's range / slice
from source_file.slice(token.range)), call
crate::string_parser::parse_string_literal on each token slice with
string.flags.into(), collect/concatenate the resulting buffers, and return the
combined Wtf8Buf instead of reparsing the whole range at once.
Summary by CodeRabbit
Bug Fixes
Refactor
Tests