-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Fix annotation scope, deadlock, MRO, and HEAPTYPE issues #7087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Annotation scope: - Remove module-level annotation re-scan that created phantom sub_tables, breaking annotation closure for comprehensions - Add async comprehension check in symbol table with is_in_async_context(); annotation/type-params scopes are always non-async - Save/restore CompileContext in enter/exit_annotation_scope to reset in_async_scope Deadlock prevention: - Fix TypeVar/ParamSpec/TypeVarTuple __default__ and evaluate_default by cloning lock contents before acquiring a second lock or calling Python Other fixes: - Add HEAPTYPE flag to Generic for correct pickle behavior - Guard heaptype_ext access in name_inner/set___name__/ set___qualname__ with safe checks instead of unwrap - Fix MRO error message to include base class names - Add "format" to varnames in TypeAlias annotation scopes - Fix single-element tuple repr to include trailing comma
📝 WalkthroughWalkthroughThis PR refactors annotation scope management in the compiler to explicitly carry and restore CompileContext, strengthens type immutability error handling, improves TypeVar locking semantics, and adjusts tuple type representation formatting. Changes span compilation, type resolution, and stdlib implementations. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] lib: cpython/Lib/ctypes dependencies:
dependent tests: (1 tests)
[x] lib: cpython/Lib/dataclasses.py dependencies:
dependent tests: (6 tests)
[x] lib: cpython/Lib/typing.py dependencies:
dependent tests: (12 tests)
Legend:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/typevar.rs (1)
95-122:⚠️ Potential issue | 🟠 Major
__bound__and__constraints__still hold their locks while calling into Python — same deadlock risk you fixed in__default__.Lines 101 and 116 call
evaluate_constraints.call()/evaluate_bound.call()while theconstraints/boundmutex guard is alive. If the evaluator callback triggers re-entrant access to__constraints__or__bound__on the sameTypeVar, it will deadlock. Apply the same scoped-lock + clone-first pattern used in__default__.Proposed fix
#[pygetset] fn __constraints__(&self, vm: &VirtualMachine) -> PyResult { - let mut constraints = self.constraints.lock(); - if !vm.is_none(&constraints) { - return Ok(constraints.clone()); + { + let constraints = self.constraints.lock(); + if !vm.is_none(&constraints) { + return Ok(constraints.clone()); + } } let r = if !vm.is_none(&self.evaluate_constraints) { - *constraints = self.evaluate_constraints.call((1i32,), vm)?; - constraints.clone() + let result = self.evaluate_constraints.call((1i32,), vm)?; + *self.constraints.lock() = result.clone(); + result } else { vm.ctx.empty_tuple.clone().into() }; Ok(r) } #[pygetset] fn __bound__(&self, vm: &VirtualMachine) -> PyResult { - let mut bound = self.bound.lock(); - if !vm.is_none(&bound) { - return Ok(bound.clone()); + { + let bound = self.bound.lock(); + if !vm.is_none(&bound) { + return Ok(bound.clone()); + } } let r = if !vm.is_none(&self.evaluate_bound) { - *bound = self.evaluate_bound.call((1i32,), vm)?; - bound.clone() + let result = self.evaluate_bound.call((1i32,), vm)?; + *self.bound.lock() = result.clone(); + result } else { vm.ctx.none() }; Ok(r) }
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 2494-2497: The code currently inserts "format" into
current_code_info().metadata.varnames and later assumes its local slot is 0;
update emit_format_validation() (and the similar spots at the other occurrences)
to resolve the local slot via get_local_var_index("format") instead of
hardcoding 0: use get_local_var_index("format") to obtain the parameter index
and emit loads/validation against that index, leaving the varnames insertion
as-is but making the load robust to symbol ordering changes.
🧹 Nitpick comments (2)
crates/vm/src/builtins/type.rs (1)
850-877: Inconsistent name formatting in heaptype_ext error vs. sibling setter.Line 865 uses
self.name()(which strips the module prefix viarsplit_once('.')), while the analogous guard inset___name__at line 1162 usesself.slot_name()(which preserves the full dotted name). Thecheck_set_special_type_attrhelper at line 1138 also usesslot_name(). For static types whereheaptype_extisNone, these yield different output (e.g."typing.TypeAliasType"vs"TypeAliasType").Consider using
self.slot_name()here for consistency:Suggested fix
let heap_type = self.heaptype_ext.as_ref().ok_or_else(|| { vm.new_type_error(format!( "cannot set '__qualname__' attribute of immutable type '{}'", - self.name() + self.slot_name() )) })?;crates/vm/src/stdlib/typevar.rs (1)
182-192: Consider extracting the duplicated__default__/evaluate_defaultlogic into shared helpers.The
__default__body (lines 141–155, 506–520, 704–718) andevaluate_defaultbody (lines 183–192, 524–533, 722–731) are identical acrossTypeVar,ParamSpec, andTypeVarTuple. A free function taking(&PyMutex<PyObjectRef>, &PyMutex<PyObjectRef>, &VirtualMachine)would eliminate the triplication and ensure future lock-pattern fixes only need to be applied once.As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code."
Also applies to: 523-533, 721-731
| self.current_code_info() | ||
| .metadata | ||
| .varnames | ||
| .insert("format".to_owned()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the relevant code sections to understand the context
cd crates/codegen/src && cat -n compile.rs | sed -n '2490,2500p'Repository: RustPython/RustPython
Length of output: 787
🏁 Script executed:
# Check the other mentioned locations
cd crates/codegen/src && cat -n compile.rs | sed -n '2530,2545p'Repository: RustPython/RustPython
Length of output: 1028
🏁 Script executed:
# And the third location
cd crates/codegen/src && cat -n compile.rs | sed -n '2680,2695p'Repository: RustPython/RustPython
Length of output: 703
🏁 Script executed:
# Now let's find the emit_format_validation function
cd crates/codegen/src && rg -A 20 "fn emit_format_validation"Repository: RustPython/RustPython
Length of output: 1004
🏁 Script executed:
# Check what IndexSet::insert does - search for IndexSet usage and documentation
rg "IndexSet" crates/codegen/src/compile.rs | head -20Repository: RustPython/RustPython
Length of output: 822
🏁 Script executed:
# Let's check the type definition of varnames to confirm it's an IndexSet
cd crates/codegen/src && rg "varnames.*IndexSet" -A 2 -B 2Repository: RustPython/RustPython
Length of output: 672
🏁 Script executed:
# Let's also check if there's a method to get local var index by name
cd crates/codegen/src && rg "get_local_var_index" -A 5Repository: RustPython/RustPython
Length of output: 738
🏁 Script executed:
# Let's check the enter_scope implementation to see if varnames is initialized fresh
cd crates/codegen/src && rg "fn enter_scope" -A 30Repository: RustPython/RustPython
Length of output: 4782
🏁 Script executed:
# Let's also check if CodeInfo is initialized fresh when entering a scope
cd crates/codegen/src && rg "struct CodeInfo" -A 20Repository: RustPython/RustPython
Length of output: 834
🏁 Script executed:
# Check what happens to metadata when entering a new scope - look for initialization
cd crates/codegen/src && rg "metadata.*CodeMetadata" -B 2 -A 2Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's look at the complete context around the first occurrence to see the scope entry
cd crates/codegen/src && cat -n compile.rs | sed -n '2485,2505p'Repository: RustPython/RustPython
Length of output: 1376
🏁 Script executed:
# Let's check if there are any other variables added to varnames in these scopes before format
cd crates/codegen/src && cat -n compile.rs | sed -n '2680,2710p'Repository: RustPython/RustPython
Length of output: 1296
🏁 Script executed:
# Let's check how the symbol table is built for TypeParams scopes and if "format" is already in the symbol table
cd crates/codegen/src && rg "TypeParams" symboltable.rs -B 5 -A 10Repository: RustPython/RustPython
Length of output: 7690
🏁 Script executed:
# Let's look at how varnames are populated in the symbol table during the first pass
cd crates/codegen/src && rg "fn scan_type_param" symboltable.rs -A 30Repository: RustPython/RustPython
Length of output: 2926
🏁 Script executed:
# Check for "format" parameter being added to symbol table during analysis phase
rg "\"format\"" crates/codegen/src/symboltable.rs -B 3 -A 3Repository: RustPython/RustPython
Length of output: 1573
🏁 Script executed:
# Let's trace through the complete enter_scope to see how metadata.varnames is initialized
cd crates/codegen/src && cat -n compile.rs | sed -n '2200,2320p'Repository: RustPython/RustPython
Length of output: 5711
🏁 Script executed:
# Check what happens to varnames from the symbol table when CodeInfo is created
cd crates/codegen/src && rg "CodeUnitMetadata" -A 15 | head -60Repository: RustPython/RustPython
Length of output: 2709
🏁 Script executed:
# Let's see the complete enter_scope implementation to find where varnames is initialized
cd crates/codegen/src && cat -n compile.rs | sed -n '655,780p'Repository: RustPython/RustPython
Length of output: 5474
🏁 Script executed:
# Look for where CodeInfo is created with the varname_cache
cd crates/codegen/src && rg "varname_cache" compile.rs -B 5 -A 5Repository: RustPython/RustPython
Length of output: 949
🏁 Script executed:
# Let's check if there's code that pushes the CodeInfo with initialized metadata
cd crates/codegen/src && cat -n compile.rs | sed -n '780,850p'Repository: RustPython/RustPython
Length of output: 3090
🏁 Script executed:
# Let's look at the SymbolTableEntry structure to understand what varnames contains
cd crates/codegen/src && rg "struct SymbolTableEntry" symboltable.rs -A 10Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check what register_name does with parameters
cd crates/codegen/src && rg "fn register_name" symboltable.rs -A 20Repository: RustPython/RustPython
Length of output: 853
🏁 Script executed:
# Let's verify that "format" is actually in the symbol table for these TypeParams scopes
# by checking what varnames ste contains when enter_scope is called
cd crates/codegen/src && cat -n compile.rs | sed -n '2470,2505p'Repository: RustPython/RustPython
Length of output: 2232
🏁 Script executed:
# Check if there are any other parameters or locals that might be added to these TypeParams scopes before format
cd crates/codegen/src && rg "enter_scope.*TypeParams" -B 15 -A 30 | head -100Repository: RustPython/RustPython
Length of output: 5578
🏁 Script executed:
# Let's verify the complete flow by checking what symbol table entries are registered for one of these TypeParams scopes
# Look at the specific case: the inner TypeAlias scope at line 2490
cd crates/codegen/src && rg -B 50 'self.enter_scope.*"TypeAlias".*TypeParams' symboltable.rs | grep -A 50 "compile_type_alias" | head -80Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's check if there's any code that modifies varnames after enter_scope and before emit_format_validation
cd crates/codegen/src && cat -n compile.rs | sed -n '2488,2502p'Repository: RustPython/RustPython
Length of output: 1079
🏁 Script executed:
# Check IndexSet behavior - does insert return early if already present?
web_search
IndexSet insert behavior - does it append or return if already presentRepository: RustPython/RustPython
Length of output: 162
🌐 Web query:
indexmap IndexSet insert behavior documentation
💡 Result:
In indexmap::IndexSet, insert behaves like this:
IndexSet::insert(&mut self, value: T) -> boolinsertsvalueonly if an equivalent value is not already present.- Returns
trueif it inserted a new element. - Returns
falseif the set already contained an equivalent element; in that case it keeps the original stored value and does not change its insertion order. [1]
- Returns
Related ordering note:
IndexSetiteration order is determined by the history of insertions/removals, and a re-insertion attempt of an already-present element does not affect the order or stored value. [2]
If you also need the index:
IndexSet::insert_full(&mut self, value: T) -> (usize, bool)returns (index, inserted?) with the same “no replace, no reorder on duplicates” behavior. [3]
Sources:
[1] https://docs.rs/indexmap/latest/indexmap/set/struct.IndexSet.html (method insert)
[2] https://docs.rs/indexmap/latest/indexmap/set/struct.IndexSet.html (section “Order”)
[3] https://docs.rs/indexmap/latest/indexmap/set/struct.IndexSet.html (method insert_full)
🏁 Script executed:
# Let's check the documentation and implementation by looking at what methods are available on IndexSet
# and verify the behavior of insert
cd crates/codegen/src && rg "IndexSet<String>" compile.rs -B 2 -A 2 | head -30Repository: RustPython/RustPython
Length of output: 479
🏁 Script executed:
# Let's trace through what happens:
# 1. enter_scope initializes metadata.varnames from symbol table
# 2. Then we manually call insert("format")
# Let's check if "format" would already be in varnames from the symbol table
cd crates/codegen/src && cat -n compile.rs | sed -n '2440,2520p'Repository: RustPython/RustPython
Length of output: 4561
🏁 Script executed:
# Check the symbol_table_stack to understand what key is passed to enter_scope
cd crates/codegen/src && cat -n compile.rs | sed -n '2410,2495p'Repository: RustPython/RustPython
Length of output: 4487
🏁 Script executed:
# Let's verify that "format" is indeed registered in the symbol table during analysis
# by checking the flow for the inner TypeAlias scope case
cd crates/codegen/src && rg "scan_expression.*TypeAlias\|StmtTypeAlias" symboltable.rs -B 5 -A 20Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's check how StmtTypeAlias is scanned in the symbol table
cd crates/codegen/src && rg "StmtTypeAlias" symboltable.rs -B 2 -A 20Repository: RustPython/RustPython
Length of output: 937
🏁 Script executed:
# Check if the inner closure's symbol table is created during analysis with "format" already registered
cd crates/codegen/src && rg "scan_type_alias\|visit_type_alias" symboltable.rs -B 5 -A 30Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Let's get the complete StmtTypeAlias scanning logic
cd crates/codegen/src && rg "Stmt::TypeAlias" symboltable.rs -A 50Repository: RustPython/RustPython
Length of output: 2089
🏁 Script executed:
# Let's check the specific part where the value scope is created for TypeAlias
cd crates/codegen/src && cat -n symboltable.rs | sed -n '1900,2050p'Repository: RustPython/RustPython
Length of output: 7510
🏁 Script executed:
# Let's verify the complete picture by checking what register_name does with Parameters
cd crates/codegen/src && rg "register_name.*Parameter" symboltable.rs -B 2 -A 5Repository: RustPython/RustPython
Length of output: 1189
🏁 Script executed:
# Check what SymbolUsage::Parameter means and how it's processed
cd crates/codegen/src && rg "enum SymbolUsage\|SymbolUsage::Parameter" symboltable.rs -B 2 -A 15Repository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Find the SymbolUsage enum definition
cd crates/codegen/src && rg "enum SymbolUsage" -A 20Repository: RustPython/RustPython
Length of output: 829
🏁 Script executed:
# Let's also check how register_name processes parameters
cd crates/codegen/src && rg "fn register_name" symboltable.rs -A 40 | head -60Repository: RustPython/RustPython
Length of output: 1714
🏁 Script executed:
# Let's verify the key question: after enter_scope, is "format" already in varnames from the symbol table?
# Check what happens in leave_scope during symbol table analysis
cd crates/codegen/src && cat -n symboltable.rs | sed -n '785,810p'Repository: RustPython/RustPython
Length of output: 1380
🏁 Script executed:
# Let's trace through: when register_name is called with Parameter, where does it go?
cd crates/codegen/src && rg "fn register_name" symboltable.rs -A 70 | head -100Repository: RustPython/RustPython
Length of output: 3150
🏁 Script executed:
# Let's check if register_name adds parameters to current_varnames
cd crates/codegen/src && rg "current_varnames" symboltable.rs -B 3 -A 3Repository: RustPython/RustPython
Length of output: 4291
🏁 Script executed:
# Let's find where parameters are added to varnames during symbol table analysis
cd crates/codegen/src && rg "SymbolUsage::Parameter" symboltable.rs -B 5 -A 10 | head -50Repository: RustPython/RustPython
Length of output: 1977
🏁 Script executed:
# Check the complete register_name function to see what happens with Parameter role
cd crates/codegen/src && cat -n symboltable.rs | sed -n '275,400p'Repository: RustPython/RustPython
Length of output: 5569
🏁 Script executed:
# Let me verify the order: is "format" the FIRST parameter registered in these TypeParams scopes?
cd crates/codegen/src && cat -n symboltable.rs | sed -n '1865,1895p'Repository: RustPython/RustPython
Length of output: 1558
🏁 Script executed:
# Also check for the bound/default case
cd crates/codegen/src && cat -n symboltable.rs | sed -n '2034,2048p'Repository: RustPython/RustPython
Length of output: 911
Use get_local_var_index() to load format parameter instead of hardcoding index 0.
During symbol table analysis, "format" is registered as a parameter in these TypeParams scopes before any other symbols, so it correctly appears at index 0. However, relying on this implicit ordering is fragile. Use get_local_var_index("format") in emit_format_validation() to explicitly resolve the parameter's index, making the intent clear and the code more resilient to future changes.
🛠️ Suggested fix in emit_format_validation
- emit!(self, Instruction::LoadFast(0));
+ let format_idx = self.get_local_var_index("format")?;
+ emit!(self, Instruction::LoadFast(format_idx));Also applies to: 2535-2538, 2684-2687
🤖 Prompt for AI Agents
In `@crates/codegen/src/compile.rs` around lines 2494 - 2497, The code currently
inserts "format" into current_code_info().metadata.varnames and later assumes
its local slot is 0; update emit_format_validation() (and the similar spots at
the other occurrences) to resolve the local slot via
get_local_var_index("format") instead of hardcoding 0: use
get_local_var_index("format") to obtain the parameter index and emit
loads/validation against that index, leaving the varnames insertion as-is but
making the load robust to symbol ordering changes.
Summary by CodeRabbit