-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Implement more typing #7078
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement more typing #7078
Conversation
📝 WalkthroughWalkthroughAdds PEP 695 style type‑parameter and annotation evaluation support: registers a format parameter in TypeParams/annotation scopes, enforces type‑param ordering and nonlocal restrictions, introduces a ConstEvaluator for formatted annotation values, migrates TypeVar/ParamSpec/TypeVarTuple to PyMutex lazy evaluation, and adds class_getitem to some builtins. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/stdlib/typevar.rs (1)
95-107:⚠️ Potential issue | 🟠 MajorPython call while holding
constraintslock.
self.evaluate_constraints.call((1i32,), vm)?at Line 101 executes Python code whileconstraintsis locked. If the called code accesses__constraints__on the sameTypeVar, this deadlocks (non-reentrant mutex). The same issue exists in__bound__(Line 116). Consider cloning the evaluator, dropping the lock, calling, then re-acquiring to store.Proposed fix for `__constraints__`
fn __constraints__(&self, vm: &VirtualMachine) -> PyResult { - let mut constraints = self.constraints.lock(); - if !vm.is_none(&constraints) { - return Ok(constraints.clone()); + { + let constraints = self.constraints.lock(); + if !vm.is_none(&constraints) { + return Ok(constraints.clone()); + } } - let r = if !vm.is_none(&self.evaluate_constraints) { - *constraints = self.evaluate_constraints.call((1i32,), vm)?; - constraints.clone() - } else { - vm.ctx.empty_tuple.clone().into() - }; - Ok(r) + if !vm.is_none(&self.evaluate_constraints) { + let value = self.evaluate_constraints.call((1i32,), vm)?; + *self.constraints.lock() = value.clone(); + Ok(value) + } else { + Ok(vm.ctx.empty_tuple.clone().into()) + } }Apply the same pattern to
__bound__.
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 2464-2468: The evaluator scopes set argcount/posonlyargcount then
call emit_format_validation(), but they don't register the "format" local in the
current code object's varnames so emit_format_validation()'s LoadFast(0) can
read the wrong slot or panic; update the relevant evaluator scope blocks (the
ones around the calls to
self.current_code_info().metadata.argcount/posonlyargcount and
emit_format_validation(), including the similar blocks at the other locations
noted) to push "format" into self.current_code_info().metadata.varnames
(mirroring enter_annotation_scope's behavior) before calling
emit_format_validation(), ensuring varnames contains "format" and preserving
Rust error handling conventions when modifying the metadata.
In `@crates/vm/src/stdlib/typevar.rs`:
- Around line 178-189: In evaluate_default, the lock acquisition order is
reversed and holds multiple locks while potentially making Python calls, risking
deadlock; change the logic in fn evaluate_default to avoid holding both
MutexGuards at once: first lock self.evaluate_default, clone the value into a
local and drop the guard; if that clone is not None return it; otherwise lock
self.default_value, clone that into a local, drop the guard, then compare the
cloned default against vm.ctx.typing_no_default and, if not equal, call
const_evaluator_alloc using the cloned value (ensure no locks are held when
calling const_evaluator_alloc); reference symbols: evaluate_default (method),
self.evaluate_default, self.default_value, const_evaluator_alloc,
vm.ctx.typing_no_default.
- Around line 140-152: The method __default__ in typevar.rs (and the analogous
ParamSpec::__default__ and TypeVarTuple::__default__) currently acquires
default_value then evaluate_default and even calls evaluate_default.call(...)
while both mutexes are held, which risks deadlock due to inconsistent lock
ordering and calling into Python under a lock. Fix by changing the sequence to
first lock evaluate_default (to check/isolate the callable) and clone or take
its value, then drop that lock before acquiring default_value; perform the
Python call outside any mutex, then re-acquire default_value to update it if
needed. Ensure the lock acquisition order matches the evaluate_default getter
(evaluate_default before default_value) across all three __default__
implementations and never call Python while holding either mutex.
In `@crates/vm/src/stdlib/typing.rs`:
- Around line 159-169: The function typing_type_repr_value incorrectly formats
single-element tuples because parts.join(", ") yields "(item)" instead of
"(item,)" — update typing_type_repr_value: after collecting parts from the
PyTuple (using value.try_to_ref::<PyTuple> and typing_type_repr for each item),
detect when tuple.len() == 1 and format the string with a trailing comma (e.g.
"({},)" style) otherwise use "({})" with parts.join(", "); keep using
vm.ctx.new_str(...) and the existing return path.
- Around line 109-123: The code currently only treats ANNOTATE_FORMAT_STRING
(const ANNOTATE_FORMAT_STRING) specially in ConstEvaluator::call by delegating
to typing_type_repr_value, while formats VALUE, VALUE_WITH_FAKE_GLOBALS and
FORWARDREF fall through to returning the raw value; update the
ConstEvaluator::call implementation to either (a) add a clear comment above the
const and inside the call method stating that only annotationlib.Format.STRING
(4) is intentionally handled here and other formats (1,2,3) are returned
unchanged by design, referencing ANNOTATE_FORMAT_STRING and
typing_type_repr_value, or (b) implement the missing handling for formats 1/2/3
as required (e.g., call the same or different helpers) so behavior matches
callers that expect format=1 at runtime; pick one approach and apply it
consistently so future maintainers know why formats besides STRING are not
processed.
🧹 Nitpick comments (1)
crates/vm/src/stdlib/typing.rs (1)
127-155: Attribute-existence checks viaget_attr().is_ok()swallow real errors.Lines 137–138 use
get_attr(...).is_ok()to test whether__origin__and__args__exist. This conflates "attribute not found" with "error during lookup" (e.g., a__getattr__that raisesRuntimeError). This mirrors CPython's approach so it's not a blocker, but worth noting.
| // Evaluator takes a positional-only format parameter | ||
| self.current_code_info().metadata.argcount = 1; | ||
| self.current_code_info().metadata.posonlyargcount = 1; | ||
| self.emit_format_validation()?; | ||
| self.compile_expression(value)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the format local before emit_format_validation() to keep LoadFast(0) valid.
emit_format_validation() uses LoadFast(0), but these evaluator scopes only set argcount/posonlyargcount and don’t insert "format" into varnames. That can misread a different local or panic if the varnames set is empty. Add "format" to varnames in these evaluator scopes (mirroring enter_annotation_scope).
✅ Suggested fix
@@
- // Evaluator takes a positional-only format parameter
+ // Evaluator takes a positional-only format parameter
self.current_code_info().metadata.argcount = 1;
self.current_code_info().metadata.posonlyargcount = 1;
+ self.current_code_info()
+ .metadata
+ .varnames
+ .insert("format".to_owned());
self.emit_format_validation()?;
@@
- // Evaluator takes a positional-only format parameter
+ // Evaluator takes a positional-only format parameter
self.current_code_info().metadata.argcount = 1;
self.current_code_info().metadata.posonlyargcount = 1;
+ self.current_code_info()
+ .metadata
+ .varnames
+ .insert("format".to_owned());
self.emit_format_validation()?;
@@
- // Evaluator takes a positional-only format parameter
+ // Evaluator takes a positional-only format parameter
self.current_code_info().metadata.argcount = 1;
self.current_code_info().metadata.posonlyargcount = 1;
+ self.current_code_info()
+ .metadata
+ .varnames
+ .insert("format".to_owned());
self.emit_format_validation()?;As per coding guidelines, follow Rust best practices for error handling and memory management.
Also applies to: 2501-2505, 2646-2651
🤖 Prompt for AI Agents
In `@crates/codegen/src/compile.rs` around lines 2464 - 2468, The evaluator scopes
set argcount/posonlyargcount then call emit_format_validation(), but they don't
register the "format" local in the current code object's varnames so
emit_format_validation()'s LoadFast(0) can read the wrong slot or panic; update
the relevant evaluator scope blocks (the ones around the calls to
self.current_code_info().metadata.argcount/posonlyargcount and
emit_format_validation(), including the similar blocks at the other locations
noted) to push "format" into self.current_code_info().metadata.varnames
(mirroring enter_annotation_scope's behavior) before calling
emit_format_validation(), ensuring varnames contains "format" and preserving
Rust error handling conventions when modifying the metadata.
| fn __default__(&self, vm: &VirtualMachine) -> PyResult { | ||
| let mut default_value = self.default_value.lock(); | ||
| // Check if default_value is NoDefault (not just None) | ||
| if !default_value.is(&vm.ctx.typing_no_default) { | ||
| return Ok(default_value.clone()); | ||
| } | ||
| let evaluate_default = self.evaluate_default.lock(); | ||
| if !vm.is_none(&evaluate_default) { | ||
| *default_value = evaluate_default.call((), vm)?; | ||
| *default_value = evaluate_default.call((1i32,), vm)?; | ||
| Ok(default_value.clone()) | ||
| } else { | ||
| // Return NoDefault singleton | ||
| Ok(vm.ctx.typing_no_default.clone().into()) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential deadlock: inconsistent lock ordering and Python calls under lock.
Two issues with this method:
-
Lock ordering inversion:
__default__acquiresdefault_value(Line 141) thenevaluate_default(Line 145), butevaluate_defaultgetter (Line 180) acquires them in reverse order. Concurrent access to both properties on the same object would deadlock. -
Python call under lock:
evaluate_default.call((1i32,), vm)?at Line 147 executes arbitrary Python code while holding both mutex locks. If the callee accesses__default__orevaluate_defaulton the same object, it deadlocks even single-threaded (assuming non-reentrantPyMutex).
The same pattern is repeated in ParamSpec::__default__ (Line 503–513) and TypeVarTuple::__default__ (Line 698–708).
Proposed fix: release locks before calling Python
fn __default__(&self, vm: &VirtualMachine) -> PyResult {
- let mut default_value = self.default_value.lock();
- if !default_value.is(&vm.ctx.typing_no_default) {
- return Ok(default_value.clone());
+ {
+ let default_value = self.default_value.lock();
+ if !default_value.is(&vm.ctx.typing_no_default) {
+ return Ok(default_value.clone());
+ }
}
- let evaluate_default = self.evaluate_default.lock();
- if !vm.is_none(&evaluate_default) {
- *default_value = evaluate_default.call((1i32,), vm)?;
- Ok(default_value.clone())
- } else {
- Ok(vm.ctx.typing_no_default.clone().into())
+ let evaluator = {
+ let evaluate_default = self.evaluate_default.lock();
+ if vm.is_none(&evaluate_default) {
+ return Ok(vm.ctx.typing_no_default.clone().into());
+ }
+ evaluate_default.clone()
+ };
+ let value = evaluator.call((1i32,), vm)?;
+ *self.default_value.lock() = value.clone();
+ Ok(value)
- }
}Apply the same pattern to ParamSpec::__default__ and TypeVarTuple::__default__.
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/typevar.rs` around lines 140 - 152, The method
__default__ in typevar.rs (and the analogous ParamSpec::__default__ and
TypeVarTuple::__default__) currently acquires default_value then
evaluate_default and even calls evaluate_default.call(...) while both mutexes
are held, which risks deadlock due to inconsistent lock ordering and calling
into Python under a lock. Fix by changing the sequence to first lock
evaluate_default (to check/isolate the callable) and clone or take its value,
then drop that lock before acquiring default_value; perform the Python call
outside any mutex, then re-acquire default_value to update it if needed. Ensure
the lock acquisition order matches the evaluate_default getter (evaluate_default
before default_value) across all three __default__ implementations and never
call Python while holding either mutex.
| fn typing_type_repr_value(value: &PyObjectRef, vm: &VirtualMachine) -> PyResult { | ||
| if let Ok(tuple) = value.try_to_ref::<PyTuple>(vm) { | ||
| let mut parts = Vec::with_capacity(tuple.len()); | ||
| for item in tuple.iter() { | ||
| parts.push(typing_type_repr(item, vm)?); | ||
| } | ||
| Ok(vm.ctx.new_str(format!("({})", parts.join(", "))).into()) | ||
| } else { | ||
| Ok(vm.ctx.new_str(typing_type_repr(value, vm)?).into()) | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single-element tuple formatting omits the trailing comma.
parts.join(", ") for a 1-element tuple produces "(item)" rather than "(item,)". While single-element tuples are rare in annotation contexts (single constraints are disallowed), if this path is ever hit, the output would be ambiguous.
Proposed fix
if let Ok(tuple) = value.try_to_ref::<PyTuple>(vm) {
let mut parts = Vec::with_capacity(tuple.len());
for item in tuple.iter() {
parts.push(typing_type_repr(item, vm)?);
}
- Ok(vm.ctx.new_str(format!("({})", parts.join(", "))).into())
+ let inner = if parts.len() == 1 {
+ format!("{},", parts[0])
+ } else {
+ parts.join(", ")
+ };
+ Ok(vm.ctx.new_str(format!("({})", inner)).into())📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn typing_type_repr_value(value: &PyObjectRef, vm: &VirtualMachine) -> PyResult { | |
| if let Ok(tuple) = value.try_to_ref::<PyTuple>(vm) { | |
| let mut parts = Vec::with_capacity(tuple.len()); | |
| for item in tuple.iter() { | |
| parts.push(typing_type_repr(item, vm)?); | |
| } | |
| Ok(vm.ctx.new_str(format!("({})", parts.join(", "))).into()) | |
| } else { | |
| Ok(vm.ctx.new_str(typing_type_repr(value, vm)?).into()) | |
| } | |
| } | |
| fn typing_type_repr_value(value: &PyObjectRef, vm: &VirtualMachine) -> PyResult { | |
| if let Ok(tuple) = value.try_to_ref::<PyTuple>(vm) { | |
| let mut parts = Vec::with_capacity(tuple.len()); | |
| for item in tuple.iter() { | |
| parts.push(typing_type_repr(item, vm)?); | |
| } | |
| let inner = if parts.len() == 1 { | |
| format!("{},", parts[0]) | |
| } else { | |
| parts.join(", ") | |
| }; | |
| Ok(vm.ctx.new_str(format!("({})", inner)).into()) | |
| } else { | |
| Ok(vm.ctx.new_str(typing_type_repr(value, vm)?).into()) | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/typing.rs` around lines 159 - 169, The function
typing_type_repr_value incorrectly formats single-element tuples because
parts.join(", ") yields "(item)" instead of "(item,)" — update
typing_type_repr_value: after collecting parts from the PyTuple (using
value.try_to_ref::<PyTuple> and typing_type_repr for each item), detect when
tuple.len() == 1 and format the string with a trailing comma (e.g. "({},)"
style) otherwise use "({})" with parts.join(", "); keep using
vm.ctx.new_str(...) and the existing return path.
- Implement _ConstEvaluator type in _typing module with STRING format support via typing_type_repr (port of _Py_typing_type_repr) - Add evaluate_bound, evaluate_constraints, evaluate_default pygetset properties to TypeVar, ParamSpec, TypeVarTuple - Emit format validation in evaluator scopes (compile_type_param_bound_or_default, TypeAlias value scopes) so evaluators raise NotImplementedError for unsupported formats - Add non-default-after-default SyntaxError in scan_type_params - Fix ParamSpec default_value to use Mutex for proper caching - Fix TypeVar constructor: evaluate_constraints set to None instead of constraints tuple for eager-constructed TypeVars - Pass format=1 (FORMAT_VALUE) to all lazy evaluator calls - Remove 6 expectedFailure markers from test_type_params
…sre module names - Reject nonlocal binding for type parameters in symboltable - Enable __class_getitem__ on memoryview, Template, Interpolation - Fix __set_name__ error note to quote attribute name - Set re.Match and re.Pattern module to "re" instead of "_sre" - Migrate parking_lot::Mutex to PyMutex in typevar.rs - Collapse nested if-let chains in typing_type_repr (clippy) - Remove 5 expectedFailure markers from test files
Annotation scopes now inherit can_see_class_scope from parent scopes (not just direct Class parents), allowing annotations in generic methods inside classes to access class variables via __classdict__. Remove 3 expectedFailure markers from test_type_params.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
crates/vm/src/stdlib/typevar.rs (2)
94-107:⚠️ Potential issue | 🟠 MajorPython call while holding mutex lock in
__constraints__.Line 101 calls
self.evaluate_constraints.call((1i32,), vm)?while theconstraintsmutex is held (locked at Line 96). If the callable transitively accesses__constraints__on the sameTypeVar, this will deadlock on a non-reentrantPyMutex. Consider cloning the evaluator reference and releasing the lock before making the call.Proposed fix: release lock before Python call
fn __constraints__(&self, vm: &VirtualMachine) -> PyResult { - let mut constraints = self.constraints.lock(); - if !vm.is_none(&constraints) { - return Ok(constraints.clone()); + { + let constraints = self.constraints.lock(); + if !vm.is_none(&constraints) { + return Ok(constraints.clone()); + } } - let r = if !vm.is_none(&self.evaluate_constraints) { - *constraints = self.evaluate_constraints.call((1i32,), vm)?; - constraints.clone() - } else { - vm.ctx.empty_tuple.clone().into() - }; - Ok(r) + if !vm.is_none(&self.evaluate_constraints) { + let value = self.evaluate_constraints.call((1i32,), vm)?; + *self.constraints.lock() = value.clone(); + Ok(value) + } else { + Ok(vm.ctx.empty_tuple.clone().into()) + } }
109-122:⚠️ Potential issue | 🟠 MajorSame issue: Python call while holding
boundmutex lock.Line 116 calls
self.evaluate_bound.call((1i32,), vm)?whileboundis locked. Apply the same drop-before-call pattern.Proposed fix
fn __bound__(&self, vm: &VirtualMachine) -> PyResult { - let mut bound = self.bound.lock(); - if !vm.is_none(&bound) { - return Ok(bound.clone()); + { + let bound = self.bound.lock(); + if !vm.is_none(&bound) { + return Ok(bound.clone()); + } } - let r = if !vm.is_none(&self.evaluate_bound) { - *bound = self.evaluate_bound.call((1i32,), vm)?; - bound.clone() - } else { - vm.ctx.none() - }; - Ok(r) + if !vm.is_none(&self.evaluate_bound) { + let value = self.evaluate_bound.call((1i32,), vm)?; + *self.bound.lock() = value.clone(); + Ok(value) + } else { + Ok(vm.ctx.none()) + } }crates/codegen/src/symboltable.rs (1)
952-987:⚠️ Potential issue | 🟠 Major
"format"should be registered as a symbol in the annotation scope for consistency.The annotation scope adds
"format"tovarnames(line 969) but never callsregister_name()to add it to thesymbolsmap. This is inconsistent with TypeAlias value scopes (line 1508) and type-parameter bound/default scopes (line 2015), which both useregister_name("format", SymbolUsage::Parameter, ...)to register"format"as a parameter symbol.While the current code works because
emit_format_validation()usesLoadFast(0)to access format by position, the inconsistency means future code that inspects symbol metadata or scope information will not find"format"in the annotation scope's symbols map.Call
register_name("format", SymbolUsage::Parameter, TextRange::default())after pushing the annotation table to the stack (after line 975), following the same pattern used elsewhere:Proposed fix
self.tables.push(*annotation_table); // Save parent's varnames and seed with existing annotation varnames (e.g., "format") self.varnames_stack .push(core::mem::take(&mut self.current_varnames)); self.current_varnames = self.tables.last().unwrap().varnames.clone(); + // Register "format" as a parameter symbol (varnames was already seeded above) + self.register_name("format", SymbolUsage::Parameter, TextRange::default())?; if can_see_class_scope && !self.future_annotations {
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/re dependencies:
dependent tests: (58 tests)
[x] lib: cpython/Lib/typing.py dependencies:
dependent tests: (12 tests)
Legend:
|
Summary by CodeRabbit
New Features
Bug Fixes
Improvements