Conversation
… genexpr StopIteration wrapper - split_blocks_at_jumps: split blocks at branch points so each has one exit - jump_threading: thread jumps through single-jump blocks (flowgraph.c jump_thread) - Backward conditional jump normalization: invert and create NOT_TAKEN+JUMP block - Follow empty blocks in jump-to-return optimization (next_nonempty_block) - Add PEP 479 StopIteration handler to compile_comprehension for generators
- Add Slice variant to ConstantData and BorrowedConstant - Fold constant slices (x[:3], x[1:4]) into LOAD_CONST(slice(...)) - Marshal serialization/deserialization for Slice type - Box::leak in borrow_obj_constant for PySlice roundtrip
Add Frozenset to ConstantData, BorrowedConstant, and marshal support. Actual frozenset folding (BUILD_SET + CONTAINS_OP → LOAD_CONST frozenset) requires VirtualMachine for element hashing and is deferred.
📝 WalkthroughWalkthroughThis PR extends the compiler's constant handling to support slice and frozenset literals at compile-time, implements PEP 479-style StopIteration exception handling for generators and coroutines, and optimizes the control flow graph with new passes for block splitting, jump threading, and unreachable code elimination. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Prepare infrastructure for exit block duplication optimization. Currently disabled pending stackdepth integration.
0cd225f to
42b24cb
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
crates/codegen/src/compile.rs (1)
8416-8431: Extract the StopIteration wrapper into a shared helper.This setup/teardown sequence is now effectively duplicated from
Compiler::compile_function_bodyat Lines 3877-3892 and Lines 3924-3939. Any later tweak to handler depth, location suppression, or block layout now has to stay in sync in two places.♻️ Suggested shape
- let stop_iteration_block = if is_gen_scope { - let handler_block = self.new_block(); - emit!( - self, - PseudoInstruction::SetupCleanup { - delta: handler_block - } - ); - self.set_no_location(); - self.push_fblock(FBlockType::StopIteration, handler_block, handler_block)?; - Some(handler_block) - } else { - None - }; + let stop_iteration_block = self.enter_stop_iteration_wrapper(is_gen_scope)?; ... - if let Some(handler_block) = stop_iteration_block { - emit!(self, PseudoInstruction::PopBlock); - self.set_no_location(); - self.pop_fblock(FBlockType::StopIteration); - self.switch_to_block(handler_block); - emit!( - self, - Instruction::CallIntrinsic1 { - func: oparg::IntrinsicFunction1::StopIterationError - } - ); - self.set_no_location(); - emit!(self, Instruction::Reraise { depth: 1u32 }); - self.set_no_location(); - } + self.exit_stop_iteration_wrapper(stop_iteration_block);Also applies to: 8522-8538
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 8416 - 8431, Extract the duplicated StopIteration setup/teardown into a shared helper (e.g., a method on the compiler like setup_stop_iteration_handler or wrap_with_stopiteration) that encapsulates the logic currently using is_gen_scope, self.new_block(), emitting PseudoInstruction::SetupCleanup { delta: handler_block }, self.set_no_location(), and self.push_fblock(FBlockType::StopIteration, handler_block, handler_block) and returns Option<handler_block>; then replace the duplicated blocks that create stop_iteration_block (the sequences in compile_function_body and the other location around is_gen_scope/is_async) with a call to that helper so both sites share the same behavior and return value.crates/codegen/src/ir.rs (1)
228-229: Consider adding a TODO comment explaining whyduplicate_exits_without_linenois disabled.The comment "pending stackdepth fix" is helpful, but linking to a tracking issue would make it easier to follow up.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/ir.rs` around lines 228 - 229, Update the existing TODO comment above the disabled call to duplicate_exits_without_lineno(&mut self.blocks) to include a reference to the tracking issue or bug ID and a short timestamp/owner tag (e.g., "pending stackdepth fix - see ISSUE-1234 `@2026-03-30`") so future readers can find the follow-up; leave the call commented out but expand the comment to mention the exact reason ("stackdepth calculation bug") and the issue link/number and optionally who filed it.crates/vm/src/builtins/code.rs (1)
319-326: Theunimplemented!panic is appropriately documented.The comment clearly explains that
PyObjBaglacksVirtualMachineaccess needed for frozenset hashing, and thatPyMarshalBaghandles this case. Consider adding a link to wherePyMarshalBagimplements this, or logging a warning if this path is ever hit unexpectedly.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/code.rs` around lines 319 - 326, In the BorrowedConstant::Frozenset arm inside PyObjBag::make_constant, replace the bare unimplemented!() with a logged warning that clearly states this path should be handled by PyMarshalBag (reference PyMarshalBag's frozenset handling in the message) and include a link or code reference to where PyMarshalBag implements it; after logging, either return an appropriate Err or keep the unimplemented panic if you want to preserve current behavior, but ensure the warning is emitted first so unexpected hits are observable.crates/compiler-core/src/bytecode.rs (1)
934-935: Note:Frozensethash implementation doesn't account for set semantics.The current hash implementation hashes elements in order, but frozensets are unordered. If two
Frozensetconstants have the same elements in different orders, they would have different hashes but should compare equal. However, since this is used for constant pool deduplication and constants are typically created in a deterministic order, this may be acceptable.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/compiler-core/src/bytecode.rs` around lines 934 - 935, The Frozenset arm currently calls elements.hash(state) which is order-dependent; change it to an order-independent hashing strategy so frozensets with the same elements in different orders produce identical hashes. In the Frozenset match arm (the Frozenset { elements } branch inside the Hash impl for constants), compute each element's hash into a temporary value, sort those element-hash values (or otherwise combine them with a commutative operation like XOR/add on u64 hashes), then feed the sorted/combined result into the provided state so hashing is deterministic and order-independent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 7577-7581: try_fold_constant_slice currently only recognizes
numbers, booleans and None and limits integer folding to i64, which is narrower
than ExprExt::is_constant_slice; update try_fold_constant_slice to accept the
same constant kinds that ExprExt::is_constant_slice treats constant (include
string/bytes/ellipsis bounds and arbitrary-size integers, not just i64) so that
patterns like obj["a":"z"] and large integer literals are folded to
LOAD_CONST(slice) and hit the BINARY_SLICE fast path; make the same expansion in
the adjacent constant-folding block that has the i64 restriction so both places
have parity with ExprExt::is_constant_slice.
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 854-887: The From<ConstantData> for StackValue implementation
currently uses a wildcard arm and will panic for the new ConstantData::Slice and
ConstantData::Frozenset variants; update the match in From<ConstantData> for
StackValue to explicitly handle ConstantData::Slice (map it to an appropriate
StackValue variant representing a slice object, constructing start/stop/step
from the elements array) and ConstantData::Frozenset (convert elements
Vec<ConstantData> into the corresponding StackValue frozenset representation),
ensuring you pattern-match on ConstantData::Slice and ConstantData::Frozenset
instead of relying on the unimplemented wildcard arm.
In `@crates/compiler-core/src/marshal.rs`:
- Around line 471-478: The make_frozenset implementation collects elements into
a Vec and passes them straight to make_constant, which preserves duplicates;
update make_frozenset to deduplicate elements before constructing the
BorrowedConstant::Frozenset so its behavior matches CPython. Concretely, in
make_frozenset use a set (e.g., HashSet or BTreeSet) to track seen values and
build a new Vec of unique elements (preserving a deterministic order such as
first-seen or sorted if needed), then call
self.make_constant::<Bag::Constant>(BorrowedConstant::Frozenset { elements:
&unique_elements }) instead of passing the original elements Vec. Ensure you
reference the existing function make_frozenset and the constant constructor
BorrowedConstant::Frozenset when applying the change.
---
Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 8416-8431: Extract the duplicated StopIteration setup/teardown
into a shared helper (e.g., a method on the compiler like
setup_stop_iteration_handler or wrap_with_stopiteration) that encapsulates the
logic currently using is_gen_scope, self.new_block(), emitting
PseudoInstruction::SetupCleanup { delta: handler_block },
self.set_no_location(), and self.push_fblock(FBlockType::StopIteration,
handler_block, handler_block) and returns Option<handler_block>; then replace
the duplicated blocks that create stop_iteration_block (the sequences in
compile_function_body and the other location around is_gen_scope/is_async) with
a call to that helper so both sites share the same behavior and return value.
In `@crates/codegen/src/ir.rs`:
- Around line 228-229: Update the existing TODO comment above the disabled call
to duplicate_exits_without_lineno(&mut self.blocks) to include a reference to
the tracking issue or bug ID and a short timestamp/owner tag (e.g., "pending
stackdepth fix - see ISSUE-1234 `@2026-03-30`") so future readers can find the
follow-up; leave the call commented out but expand the comment to mention the
exact reason ("stackdepth calculation bug") and the issue link/number and
optionally who filed it.
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 934-935: The Frozenset arm currently calls elements.hash(state)
which is order-dependent; change it to an order-independent hashing strategy so
frozensets with the same elements in different orders produce identical hashes.
In the Frozenset match arm (the Frozenset { elements } branch inside the Hash
impl for constants), compute each element's hash into a temporary value, sort
those element-hash values (or otherwise combine them with a commutative
operation like XOR/add on u64 hashes), then feed the sorted/combined result into
the provided state so hashing is deterministic and order-independent.
In `@crates/vm/src/builtins/code.rs`:
- Around line 319-326: In the BorrowedConstant::Frozenset arm inside
PyObjBag::make_constant, replace the bare unimplemented!() with a logged warning
that clearly states this path should be handled by PyMarshalBag (reference
PyMarshalBag's frozenset handling in the message) and include a link or code
reference to where PyMarshalBag implements it; after logging, either return an
appropriate Err or keep the unimplemented panic if you want to preserve current
behavior, but ensure the warning is emitted first so unexpected hits are
observable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 54e2036a-fe74-4e78-9275-4b95f948e8d1
⛔ Files ignored due to path filters (1)
crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snapis excluded by!**/*.snap
📒 Files selected for processing (5)
crates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/compiler-core/src/bytecode.rscrates/compiler-core/src/marshal.rscrates/vm/src/builtins/code.rs
| // Try constant slice folding first | ||
| if self.try_fold_constant_slice(lower.as_deref(), upper.as_deref(), step.as_deref()) | ||
| { | ||
| return Ok(()); | ||
| } |
There was a problem hiding this comment.
try_fold_constant_slice is narrower than the rest of the compiler’s constant logic.
Lines 8999-9016 only fold numbers, booleans, and None, and Line 9005 also limits integer folding to i64. But ExprExt::is_constant_slice() already treats string/bytes/ellipsis bounds as constant, so cases like obj["a":"z"] skip the BINARY_SLICE fast path and still never reach the new LOAD_CONST(slice) path. Large integer literals have the same gap. That leaves valid constant slices on the old bytecode path and misses the parity target for this feature.
💡 Suggested fix
- if self.try_fold_constant_slice(lower.as_deref(), upper.as_deref(), step.as_deref())
- {
+ if self.try_fold_constant_slice(
+ lower.as_deref(),
+ upper.as_deref(),
+ step.as_deref(),
+ )? {
return Ok(());
}- fn try_fold_constant_slice(
- &mut self,
- lower: Option<&ast::Expr>,
- upper: Option<&ast::Expr>,
- step: Option<&ast::Expr>,
- ) -> bool {
- fn to_const(expr: Option<&ast::Expr>) -> Option<ConstantData> {
- match expr {
- None | Some(ast::Expr::NoneLiteral(_)) => Some(ConstantData::None),
- Some(ast::Expr::NumberLiteral(ast::ExprNumberLiteral { value, .. })) => match value
- {
- ast::Number::Int(i) => Some(ConstantData::Integer {
- value: i.as_i64()?.into(),
- }),
- ast::Number::Float(f) => Some(ConstantData::Float { value: *f }),
- ast::Number::Complex { real, imag } => Some(ConstantData::Complex {
- value: num_complex::Complex64::new(*real, *imag),
- }),
- },
- Some(ast::Expr::BooleanLiteral(ast::ExprBooleanLiteral { value, .. })) => {
- Some(ConstantData::Boolean { value: *value })
- }
- // Only match Constant_kind nodes (no UnaryOp)
- _ => None,
- }
- }
+ fn try_fold_constant_slice(
+ &mut self,
+ lower: Option<&ast::Expr>,
+ upper: Option<&ast::Expr>,
+ step: Option<&ast::Expr>,
+ ) -> CompileResult<bool> {
+ let mut to_const = |expr: Option<&ast::Expr>| -> CompileResult<Option<ConstantData>> {
+ Ok(match expr {
+ None | Some(ast::Expr::NoneLiteral(_)) => Some(ConstantData::None),
+ Some(ast::Expr::NumberLiteral(ast::ExprNumberLiteral { value, .. })) => {
+ Some(match value {
+ ast::Number::Int(i) => ConstantData::Integer {
+ value: ruff_int_to_bigint(i).map_err(|e| self.error(e))?,
+ },
+ ast::Number::Float(f) => ConstantData::Float { value: *f },
+ ast::Number::Complex { real, imag } => ConstantData::Complex {
+ value: Complex::new(*real, *imag),
+ },
+ })
+ }
+ Some(ast::Expr::StringLiteral(s)) => Some(ConstantData::Str {
+ value: self.compile_string_value(s),
+ }),
+ Some(ast::Expr::BytesLiteral(b)) => Some(ConstantData::Bytes {
+ value: b.value.bytes().collect(),
+ }),
+ Some(ast::Expr::BooleanLiteral(ast::ExprBooleanLiteral { value, .. })) => {
+ Some(ConstantData::Boolean { value: *value })
+ }
+ Some(ast::Expr::EllipsisLiteral(_)) => Some(ConstantData::Ellipsis),
+ _ => None,
+ })
+ };
- let (Some(start), Some(stop), Some(step_val)) =
- (to_const(lower), to_const(upper), to_const(step))
+ let (Some(start), Some(stop), Some(step_val)) =
+ (to_const(lower)?, to_const(upper)?, to_const(step)?)
else {
- return false;
+ return Ok(false);
};
self.emit_load_const(ConstantData::Slice {
elements: Box::new([start, stop, step_val]),
});
- true
+ Ok(true)
}Also applies to: 8992-9030
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/codegen/src/compile.rs` around lines 7577 - 7581,
try_fold_constant_slice currently only recognizes numbers, booleans and None and
limits integer folding to i64, which is narrower than
ExprExt::is_constant_slice; update try_fold_constant_slice to accept the same
constant kinds that ExprExt::is_constant_slice treats constant (include
string/bytes/ellipsis bounds and arbitrary-size integers, not just i64) so that
patterns like obj["a":"z"] and large integer literals are folded to
LOAD_CONST(slice) and hit the BINARY_SLICE fast path; make the same expansion in
the adjacent constant-folding block that has the i64 restriction so both places
have parity with ExprExt::is_constant_slice.
| Tuple { | ||
| elements: Vec<ConstantData>, | ||
| }, | ||
| Integer { | ||
| value: BigInt, | ||
| }, | ||
| Float { | ||
| value: f64, | ||
| }, | ||
| Complex { | ||
| value: Complex64, | ||
| }, | ||
| Boolean { | ||
| value: bool, | ||
| }, | ||
| Str { | ||
| value: Wtf8Buf, | ||
| }, | ||
| Bytes { | ||
| value: Vec<u8>, | ||
| }, | ||
| Code { | ||
| code: Box<CodeObject>, | ||
| }, | ||
| /// Constant slice(start, stop, step) | ||
| Slice { | ||
| elements: Box<[ConstantData; 3]>, | ||
| }, | ||
| Frozenset { | ||
| elements: Vec<ConstantData>, | ||
| }, | ||
| None, | ||
| Ellipsis, | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if there are other match statements on ConstantData that may need updating
rg -n "match.*ConstantData" --type=rust -A10 | head -100Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
cat crates/jit/tests/common.rsRepository: RustPython/RustPython
Length of output: 14312
🏁 Script executed:
rg "ConstantData::" --type=rust -B2 -A5 | head -150Repository: RustPython/RustPython
Length of output: 9976
🏁 Script executed:
fd . crates/jit/tests --type f -name "*.rs"Repository: RustPython/RustPython
Length of output: 295
🏁 Script executed:
rg "match.*\{" --type=rust -B3 -A15 | grep -A15 "ConstantData"Repository: RustPython/RustPython
Length of output: 41022
🏁 Script executed:
rg "Slice\s*\{|Frozenset\s*\{" --type=rust -B2 -A2Repository: RustPython/RustPython
Length of output: 16928
🏁 Script executed:
fd . crates/jit/tests --type f -name "*.rs" -exec grep -l "slice\|frozenset" {} \;Repository: RustPython/RustPython
Length of output: 295
Update crates/jit/tests/common.rs to handle ConstantData::Slice and ConstantData::Frozenset variants.
The From<ConstantData> for StackValue implementation uses a wildcard match arm that will panic at runtime if these new constant types are encountered. Since the compiler actively generates ConstantData::Slice constants, any JIT test using slice expressions will fail with an unimplemented! panic. Add handling for both variants in the match statement.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/compiler-core/src/bytecode.rs` around lines 854 - 887, The
From<ConstantData> for StackValue implementation currently uses a wildcard arm
and will panic for the new ConstantData::Slice and ConstantData::Frozenset
variants; update the match in From<ConstantData> for StackValue to explicitly
handle ConstantData::Slice (map it to an appropriate StackValue variant
representing a slice object, constructing start/stop/step from the elements
array) and ConstantData::Frozenset (convert elements Vec<ConstantData> into the
corresponding StackValue frozenset representation), ensuring you pattern-match
on ConstantData::Slice and ConstantData::Frozenset instead of relying on the
unimplemented wildcard arm.
| fn make_frozenset(&self, it: impl Iterator<Item = Self::Value>) -> Result<Self::Value> { | ||
| let elements: Vec<Self::Value> = it.collect(); | ||
| Ok( | ||
| self.make_constant::<Bag::Constant>(BorrowedConstant::Frozenset { | ||
| elements: &elements, | ||
| }), | ||
| ) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n crates/compiler-core/src/marshal.rs | sed -n '460,490p'Repository: RustPython/RustPython
Length of output: 1125
🏁 Script executed:
# Search for context around frozenset handling
rg -A 20 -B 5 "make_frozenset" crates/compiler-core/src/marshal.rsRepository: RustPython/RustPython
Length of output: 2488
🏁 Script executed:
# Check if there's decoding logic that calls make_frozenset
rg -B 10 "TYPE_FROZENSET|frozenset" crates/compiler-core/src/marshal.rs | head -100Repository: RustPython/RustPython
Length of output: 1368
🏁 Script executed:
# Find BorrowedConstant definition and Frozenset variant
rg -A 3 "enum BorrowedConstant|Frozenset {" crates/compiler-core/src/marshal.rs | head -50Repository: RustPython/RustPython
Length of output: 194
🏁 Script executed:
# Search for BorrowedConstant definition in related files
fd -e rs | xargs rg "BorrowedConstant" | grep -E "(enum|struct)" | head -20Repository: RustPython/RustPython
Length of output: 1031
🏁 Script executed:
# Check what makes up a constant and how frozensets are handled in constant bag
rg -B 5 -A 10 "struct.*Constant|enum.*Constant" crates/compiler-core/src/ | grep -A 10 -B 5 frozensetRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Find the BorrowedConstant enum definition
rg -A 40 "pub enum BorrowedConstant" crates/compiler-core/src/Repository: RustPython/RustPython
Length of output: 3104
🏁 Script executed:
# Check if there's any deduplication logic in set/frozenset handling
rg -i "dedup\|duplicate\|hashset" crates/compiler-core/src/marshal.rsRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look at the make_constant implementation to understand how frozensets are stored
rg -B 5 -A 15 "fn make_constant" crates/compiler-core/src/marshal.rsRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Search for the make_constant method implementation more broadly
fd -e rs -path "*/compiler-core/src/*" | xargs rg -A 10 "fn make_constant"Repository: RustPython/RustPython
Length of output: 2908
🏁 Script executed:
# Look at how frozensets are used in the codebase
rg "Frozenset" crates/compiler-core/src/ -A 2 -B 2Repository: RustPython/RustPython
Length of output: 4297
🏁 Script executed:
# Check if there's a ConstantBag trait that make_constant is part of
rg -B 5 -A 20 "trait ConstantBag" crates/compiler-core/src/Repository: RustPython/RustPython
Length of output: 1662
🏁 Script executed:
# Check the marshal deserialization code for frozenset more carefully
rg -B 10 -A 5 "Type::FrozenSet" crates/compiler-core/src/marshal.rsRepository: RustPython/RustPython
Length of output: 1290
🏁 Script executed:
# Look at how frozensets are handled in VM when executing constants
rg "Frozenset" crates/vm/src/ -A 3 -B 3 | head -80Repository: RustPython/RustPython
Length of output: 3098
🏁 Script executed:
# Check if there are any tests for frozenset marshaling
fd -e rs -path "*/tests/*" -o -path "*test*" | xargs rg -l "frozenset\|Frozenset" | head -10Repository: RustPython/RustPython
Length of output: 233
🏁 Script executed:
# Find test files for marshal
find crates -name "*test*" -o -name "*tests*" | grep -E "\.rs$" | xargs rg -l "frozenset\|Frozenset" 2>/dev/null | head -10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check the VM's marshal implementation for frozenset handling
rg -B 5 -A 20 "PyMarshalBag\|impl.*MarshalBag" crates/vm/src/stdlib/marshal.rs | head -100Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look for how frozenset constants are created in the VM context
rg -B 3 -A 10 "make_frozenset" crates/vm/src/stdlib/marshal.rsRepository: RustPython/RustPython
Length of output: 501
🏁 Script executed:
# Find PyFrozenSet::from_iter implementation
rg -B 5 -A 20 "impl.*PyFrozenSet|fn from_iter" crates/vm/src/builtins/frozenset.rs | head -150Repository: RustPython/RustPython
Length of output: 140
🏁 Script executed:
# Look at PyFrozenSet implementation
cat crates/vm/src/builtins/frozenset.rs | head -150Repository: RustPython/RustPython
Length of output: 132
🏁 Script executed:
# Find PyFrozenSet definition
fd -e rs | xargs rg "struct PyFrozenSet|impl PyFrozenSet" | head -20Repository: RustPython/RustPython
Length of output: 335
🏁 Script executed:
# Search for frozenset implementation in builtins
fd -e rs crates/vm/src/builtins/ | xargs rg -l "PyFrozenSet\|frozenset"Repository: RustPython/RustPython
Length of output: 464
🏁 Script executed:
# Check PyFrozenSet implementation and from_iter method
rg -B 5 -A 30 "impl PyFrozenSet" crates/vm/src/builtins/set.rs | head -200Repository: RustPython/RustPython
Length of output: 1897
🏁 Script executed:
# Search for from_iter implementation
rg -B 3 -A 15 "from_iter" crates/vm/src/builtins/set.rsRepository: RustPython/RustPython
Length of output: 3361
Deduplicate frozenset elements before constructing the constant.
When deserializing TYPE_FROZENSET, the marshal code collects items directly into a Vec without deduplication, preserving duplicates from the marshal data. CPython's equivalent (PySet_Add in marshal.c) automatically deduplicates. This means crafted marshal data with duplicate elements will produce a different constant representation in the bytecode than CPython does, affecting constant equality and hashing. Normalize the elements before passing to make_constant.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/compiler-core/src/marshal.rs` around lines 471 - 478, The
make_frozenset implementation collects elements into a Vec and passes them
straight to make_constant, which preserves duplicates; update make_frozenset to
deduplicate elements before constructing the BorrowedConstant::Frozenset so its
behavior matches CPython. Concretely, in make_frozenset use a set (e.g., HashSet
or BTreeSet) to track seen values and build a new Vec of unique elements
(preserving a deterministic order such as first-seen or sorted if needed), then
call self.make_constant::<Bag::Constant>(BorrowedConstant::Frozenset { elements:
&unique_elements }) instead of passing the original elements Vec. Ensure you
reference the existing function make_frozenset and the constant constructor
BorrowedConstant::Frozenset when applying the change.
Summary by CodeRabbit
New Features
Bug Fixes
Performance