-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Upgrade typing from 3.14.2 and more impl #7057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughReworks compiler annotation and TypeParams scoping to emit closures for type-parameter values, updates symbol-table propagation for class scope and conditional annotations, converts TypeAlias to a lazy/eager compute model, generalizes GenericAlias.origin to PyObjectRef, and adjusts VM setters/getters and module handling across VM types. Changes
Sequence Diagram(s)sequenceDiagram
participant Parser as Parser/AST
participant Symtab as SymbolTable
participant Codegen as Codegen
participant VM as Runtime (VM types)
Parser->>Symtab: register scopes (annotations, TypeParams)
Symtab-->>Codegen: provide scope metadata (can_see_class_scope, cells)
Codegen->>Symtab: push annotation / TypeParams scopes
Codegen->>Codegen: compile type-param/value expressions → build closures
Codegen->>Symtab: pop scopes / restore context
Codegen->>VM: emit closures/references for TypeAlias/annotations
VM->>VM: create TypeAlias (compute_value + cached_value) and GenericAlias (origin as PyObjectRef)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 🧹 Recent nitpick comments
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin typing |
ee8cb19 to
18b7d5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/builtins/type.rs (1)
1003-1051:⚠️ Potential issue | 🟠 Major
Assign(None)andDeleteare conflated, causingcls.__annotations__ = Noneto raiseAttributeErroron fresh classes.Lines 1015–1018 collapse both
Assign(None)(Pythoncls.__annotations__ = None) andDelete(Pythondel cls.__annotations__) to a singlevalue = None. The subsequent deletion logic (lines 1024–1046) then raisesAttributeErrorif neither__annotations__nor__annotations_cache__exists.When Python
Noneis passed,TryFromObjectconverts it to RustNone(lines 97–105 ofcrates/vm/src/convert/try_from.rs), yieldingAssign(None). The current code incorrectly treats this the same as an explicitDelete, failing if the attribute doesn't already exist.Per Python semantics,
cls.__annotations__ = Noneshould succeed silently, whiledel cls.__annotations__should error only if the attribute is missing. Separate the match arms to handle these three cases distinctly:Proposed fix
- let value = match value { - crate::function::PySetterValue::Assign(v) => v, - crate::function::PySetterValue::Delete => None, - }; - let mut attrs = self.attributes.write(); - // conditional update based on __annotations__ presence let has_annotations = attrs.contains_key(identifier!(vm, __annotations__)); - if has_annotations { - // If __annotations__ is in dict, update it - if let Some(value) = value { - attrs.insert(identifier!(vm, __annotations__), value); - } else if attrs - .swap_remove(identifier!(vm, __annotations__)) - .is_none() - { - return Err(vm.new_attribute_error("__annotations__".to_owned())); - } - // Also clear __annotations_cache__ - attrs.swap_remove(identifier!(vm, __annotations_cache__)); - } else { - // Otherwise update only __annotations_cache__ - if let Some(value) = value { - attrs.insert(identifier!(vm, __annotations_cache__), value); - } else if attrs - .swap_remove(identifier!(vm, __annotations_cache__)) - .is_none() - { - return Err(vm.new_attribute_error("__annotations__".to_owned())); - } + match value { + crate::function::PySetterValue::Assign(v) => { + if has_annotations { + if let Some(v) = v { + attrs.insert(identifier!(vm, __annotations__), v); + } else { + attrs.swap_remove(identifier!(vm, __annotations__)); + } + attrs.swap_remove(identifier!(vm, __annotations_cache__)); + } else { + if let Some(v) = v { + attrs.insert(identifier!(vm, __annotations_cache__), v); + } else { + attrs.swap_remove(identifier!(vm, __annotations_cache__)); + } + } + } + crate::function::PySetterValue::Delete => { + if has_annotations { + if attrs.swap_remove(identifier!(vm, __annotations__)).is_none() { + return Err(vm.new_attribute_error("__annotations__".to_owned())); + } + attrs.swap_remove(identifier!(vm, __annotations_cache__)); + } else if attrs + .swap_remove(identifier!(vm, __annotations_cache__)) + .is_none() + { + return Err(vm.new_attribute_error("__annotations__".to_owned())); + } + } }
🤖 Fix all issues with AI agents
In `@crates/codegen/src/symboltable.rs`:
- Around line 1463-1472: Replace checks that use self.class_name.is_some() when
deciding to set table.can_see_class_scope with a direct parent-scope check like
self.tables.last().is_some_and(|t| t.typ == CompilerScope::Class); update the
three places named in the review (the type-alias-with-params block that sets
can_see_class_scope and registers "__classdict__" where SymbolUsage::Used and
TextRange::default() are used, the else branch for the
type-alias-without-params, and the scan_type_param_bound_or_default
implementation) to use this pattern (see enter_type_param_block for the
canonical example) so only immediate class children flip can_see_class_scope
instead of any nested function within a class.
- Around line 1456-1495: The TypeAlias handling duplicates the same inner-scope
logic; extract that into a helper (e.g., scan_type_alias_value) and replace both
branches with a call to it: the helper should call enter_scope("TypeAlias",
CompilerScope::TypeParams, self.line_index_start(value.range())), set
tables.last_mut().can_see_class_scope = true when self.class_name.is_some(),
call register_name("__classdict__", SymbolUsage::Used, TextRange::default()) as
before, invoke scan_expression(value, ExpressionContext::Load), then
leave_scope() and return Ok(()); update the original code paths to call this
helper instead of repeating
enter_scope/register_name/scan_expression/leave_scope.
In `@crates/vm/src/stdlib/typing.rs`:
- Around line 175-209: In py_new for TypeAliasType, add validation to reject
duplicate positional+keyword arguments and unexpected kwargs: check args.kwargs
keys only allow "name", "value", "type_params" and if any other key present
return vm.new_type_error with an unexpected-keyword message; for each branch
(args.args.len() == 1 or 2 and the empty-args case) detect if a parameter was
supplied both positionally and via kwargs (e.g., when args.args.len() >= 1 and
args.kwargs.contains_key("name") or when args.args.len() == 2 and
args.kwargs.contains_key("value") or "name") and return vm.new_type_error
mimicking CPython’s "TypeAliasType() got multiple values for argument
'name'"/"'value'". Ensure these checks run before you assign name/value so
duplicates are rejected and unexpected keywords are not silently ignored.
- Around line 89-98: TypeAliasType currently declares the Hashable protocol
implementation which adds a custom __hash__ differing from CPython; remove
Hashable from the pyclass attribute so TypeAliasType only implements
Constructor, Representable, AsMapping, and AsNumber to match CPython's
identity-based hashing. Edit the #[pyclass(with(...))] on TypeAliasType to drop
Hashable and ensure no other code defines a custom __hash__ for TypeAliasType
(inspect any impls or trait blocks named Hashable or methods called __hash__
associated with TypeAliasType and delete them).
🧹 Nitpick comments (3)
crates/codegen/src/compile.rs (1)
1072-1076: Please confirm fmt/clippy + clean bytecode build were run.Bytecode generation paths changed here, so please ensure
cargo fmt,cargo clippy --all-targets --all-features, and the clean build step (rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r) were executed before finalizing.As per coding guidelines: Follow the default rustfmt code style by running
cargo fmtto format Rust code; Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes; When modifying bytecode instructions, perform a clean build by running:rm -r target/debug/build/rustpython-* && find . | grep -E "\.pyc$" | xargs rm -r.crates/vm/src/stdlib/typing.rs (2)
110-118:new_eagerstores non-callable value ascompute_value— latent risk if cache is ever cleared.In
new_eager,compute_valueis set tovalue.clone(), which is the evaluated result (not a callable). Ifcached_valuewere everNoneafter construction (e.g., due to a future refactor),__value__would attempt to call a non-callable, producing a confusingTypeError. Currently safe sincecached_valueis alwaysSomeafternew_eager, but this is fragile.Consider storing a sentinel or documenting the invariant:
Suggestion
/// Create with an eagerly evaluated value (used by constructor) pub fn new_eager(name: PyStrRef, type_params: PyTupleRef, value: PyObjectRef) -> Self { Self { name, type_params, - compute_value: value.clone(), + compute_value: value.clone(), // N.B. not callable — cached_value must always be Some for eager aliases cached_value: crate::common::lock::PyMutex::new(Some(value)), } }
272-277: Module attribute fix is reasonable but fragile.If
TypeAliasTypeis ever not exported as apyattr(or renamed), theget_attr("TypeAliasType", vm)lookup silently fails and__module__stays as"_typing". Thelet _ =onset_attralso silently swallows failures. This is acceptable as a workaround, but a comment explaining why this is needed (pyattr sets__module__from the containing#[pymodule(name = "_typing")]) would help future maintainers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/ast/pyast.rs`:
- Around line 79-86: The custom NodeExprConstant::extend_class_with_fields
implementation forgot to clear the IMMUTABLETYPE flag, making ast.Constant
immutable; update NodeExprConstant::extend_class_with_fields to mirror the
macro-based extend_class logic by unsafely obtaining a mutable pointer to
class.slots.flags (as *mut crate::types::PyTypeFlags) and calling
.remove(crate::types::PyTypeFlags::IMMUTABLETYPE) on it during type
initialization so the AST node becomes mutable like the others.
In `@crates/vm/src/stdlib/typing.rs`:
- Around line 113-127: The bug is that new_eager stores the actual value in
compute_value, so evaluate_value treats callable values as evaluator functions;
fix by making compute_value always be a zero-argument callable that returns the
eager value (i.e., wrap value in a closure) and keep cached_value set to
Some(value) as before; update new_eager (and the same eager-constructor variant
at the other location) so compute_value is a function/closure that ignores
incoming args and returns the stored value, ensuring evaluate_value will call
the wrapper rather than misinterpreting a callable value as the evaluator.
- Around line 307-309: The error message hardcodes "not int" when downcast of
name fails; change the map_err closure on
name.downcast::<crate::builtins::PyStr>() to inspect the actual object's type
and include that in the vm.new_type_error message (e.g., use the Err value
passed into the closure to get its runtime type/name via the object's class or
type-name accessor and format "typealias() argument 'name' must be str, not
{actual_type}"). Keep using vm.new_type_error and the same call site so the only
change is replacing the |_| closure with one that computes the real type name.
🧹 Nitpick comments (5)
crates/codegen/src/symboltable.rs (1)
399-413: Dead!is_classbranch inneeds_class_symbolscomputation.The
!is_classdisjunct (lines 405-407) can never influenceclass_symbols_clonebecause line 409 gates cloning onis_class && needs_class_symbols. When!is_class,class_symbols_cloneis alwaysNoneregardless ofneeds_class_symbols, and child scopes receiveclass_entrydirectly (line 425). The extra condition adds noise without effect.♻️ Proposed simplification
- let needs_class_symbols = (is_class - && (sub_tables.iter().any(|st| st.can_see_class_scope) - || annotation_block - .as_ref() - .is_some_and(|b| b.can_see_class_scope))) - || (!is_class - && class_entry.is_some() - && sub_tables.iter().any(|st| st.can_see_class_scope)); - - let class_symbols_clone = if is_class && needs_class_symbols { + let needs_class_symbols = is_class + && (sub_tables.iter().any(|st| st.can_see_class_scope) + || annotation_block + .as_ref() + .is_some_and(|b| b.can_see_class_scope)); + + let class_symbols_clone = if needs_class_symbols { Some(symbols.clone()) } else { None };crates/vm/src/stdlib/typing.rs (4)
367-377:vm.import("typing", 0)on everyiter()call is potentially expensive.This performs a module import lookup each time a
TypeAliasTypeis iterated (forUnpacksemantics). The same pattern appears inunpack_typevartuples. While this follows other patterns in the codebase and the import is cached after first load, it's worth noting.
214-246:check_type_paramsreturn value is discarded by the caller.Line 316 calls
Self::check_type_params(&tp, vm)?;but ignores theOption<PyTupleRef>return value. The function is only used for its validation side-effect. Consider changing the return type toPyResult<()>to reflect this.Suggested simplification
fn check_type_params( type_params: &PyTupleRef, vm: &VirtualMachine, - ) -> PyResult<Option<PyTupleRef>> { + ) -> PyResult<()> { if type_params.is_empty() { - return Ok(None); + return Ok(()); } // ... validation logic unchanged ... - Ok(Some(type_params.clone())) + Ok(()) }
134-143: Avoid double-computation in__value__via non-atomic check-then-set.The lock is released after
clone()on line 136, then re-acquired on line 141. Two concurrent threads could both observeNone, both callcompute_value, and both store results, causing wasteful redundant computation. Sincecompute_valueis a Python callable, the double-computation is safe but inefficient.Use a double-check-lock pattern to eliminate the race:
Suggested pattern
fn __value__(&self, vm: &VirtualMachine) -> PyResult { - let cached = self.cached_value.lock().clone(); - if let Some(value) = cached { - return Ok(value); - } - let value = self.compute_value.call((), vm)?; - *self.cached_value.lock() = Some(value.clone()); - Ok(value) + { + let cached = self.cached_value.lock(); + if let Some(value) = cached.as_ref() { + return Ok(value.clone()); + } + } + let value = self.compute_value.call((), vm)?; + let mut cached = self.cached_value.lock(); + if let Some(existing) = cached.as_ref() { + Ok(existing.clone()) + } else { + *cached = Some(value.clone()); + Ok(value) + } }This pattern computes
compute_valueoutside the lock (safe for Python re-entry), then verifies another thread hasn't already cached the result before storing.
322-326: Use the establishedcaller()pattern for consistency with TypeVar.TypeAliasType retrieves the module by accessing
func_obj.get_attr("__module__", vm), but TypeVar in the same codebase uses thecaller()helper which accessesframe.globals.get_item("__name__", vm). A comment in typevar.rs explicitly notes that "RustPython's Frame architecture is different" from CPython and deliberately uses globals instead of func_obj. TypeAliasType should follow this same pattern—either by callingset_module_from_caller()after construction (like TypeVar does inslot_new) or by using thecaller()function directly to retrieve the module name, ensuring consistency with RustPython's architectural approach.
24bcfe6 to
a14141c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/typing.rs`:
- Around line 93-95: evaluate_value confuses "was created eagerly" with "was
lazily evaluated" because it tests cached_value.is_some() and then wraps
compute_value in a ConstEvaluator, causing callers to receive the evaluator
function instead of the evaluated result; to fix, add an explicit boolean flag
(e.g., is_eager or has_compute_fn) to the lazy-alias struct to record whether
the alias was constructed eagerly vs lazily, update all places that construct
aliases (including where you currently create a ConstEvaluator) to set that flag
appropriately, change evaluate_value to check that new flag instead of
cached_value.is_some() and only wrap/replace compute_value with a ConstEvaluator
when the flag indicates an eager evaluator, and update the __value__ access /
cached_value-setting path to mark the alias as lazily-evaluated (leaving the
eager flag false) so evaluate_value returns the actual evaluated object rather
than the evaluator function; also apply the same change to the other relevant
branch(s) around the existing ConstEvaluator construction.
- Around line 326-332: TypeAliasType currently derives the module from
func_obj.__module__ (via f.func_obj) which can be reassigned or missing; change
it to read the caller frame's globals['__name__'] like the TypeVar
implementation does. In the block that obtains module before calling
Self::new_eager(name, ...), replace the func_obj-based lookup with a
globals-based lookup using vm.current_frame() and
f.globals().get_item("__name__") (mirror the caller() logic in typevar.rs), and
pass that value into TypeAliasType::new_eager so module resolution matches
TypeVar/ParamSpec behavior. Ensure you handle absent frames or missing __name__
by falling back to None as before.
🧹 Nitpick comments (2)
crates/vm/src/builtins/genericalias.rs (2)
150-168: List repr:borrow_vec()per-element access may panic on concurrent mutation.The comment says "Use indexed access so list mutation during repr causes IndexError," but
list.borrow_vec()acquires a borrow guard. Ifrepr_itemtriggers a__repr__that mutates this same list, the secondborrow_vec()call will panic (borrow conflict) rather than returning anIndexError. Consider either borrowing the whole slice once upfront (clone elements) or documenting this difference from CPython's behavior.Possible safer alternative: clone elements upfront
if obj.class().is(vm.ctx.types.list_type) { let list = obj.downcast_ref::<crate::builtins::PyList>().unwrap(); - let len = list.borrow_vec().len(); - let mut parts = Vec::with_capacity(len); - // Use indexed access so list mutation during repr causes IndexError - for i in 0..len { - let item = - list.borrow_vec().get(i).cloned().ok_or_else(|| { - vm.new_index_error("list index out of range".to_owned()) - })?; - parts.push(repr_item(item, vm)?); - } + let items: Vec<PyObjectRef> = list.borrow_vec().to_vec(); + let mut parts = Vec::with_capacity(items.len()); + for item in items { + parts.push(repr_item(item, vm)?); + } Ok(format!("[{}]", parts.join(", ")))
670-721:PyGenericAliasIteratorlackstraverse— potential GC concern.The iterator holds a
PyMutex<Option<PyObjectRef>>which references aPyGenericAlias. If the alias participates in a reference cycle, the GC won't be able to trace through this iterator. Consider addingtraverse = "manual"ortraverseto the#[pyclass]attribute, similar to how other iterators in the codebase handle this.
- TypeAliasType: lazy value evaluation via closures, __module__, __parameters__, __iter__, evaluate_value, check_type_params, IMMUTABLETYPE flag, Hashable/AsMapping/Iterable traits - TypeAliasType constructor: positional-or-keyword arg validation, duplicate/unexpected kwarg rejection - type.__annotations__ setter: distinguish None assignment from deletion - Annotation scope: name as __annotate__, format as positional-only, __conditional_annotations__ uses Cell for both load and store - Compiler: proper TypeParams/TypeAlias scope with closures, find_ann covers match/try-except handlers - symboltable: deduplicate TypeAlias value scope code - GenericAlias repr: handle list args, avoid deadlock in repr_arg by cloning items before calling repr - AST types: remove IMMUTABLETYPE (heap types, mutable) - pymodule macro: preserve existing __module__ getset descriptors
- TypeAliasType: lazy value evaluation via closures, __module__, __parameters__, __iter__, evaluate_value, check_type_params, IMMUTABLETYPE flag, Hashable/AsMapping/Iterable traits - TypeAliasType constructor: positional-or-keyword arg validation, duplicate/unexpected kwarg rejection - type.__annotations__ setter: distinguish None assignment from deletion - Annotation scope: name as __annotate__, format as positional-only, __conditional_annotations__ uses Cell for both load and store - Compiler: proper TypeParams/TypeAlias scope with closures, find_ann covers match/try-except handlers - symboltable: deduplicate TypeAlias value scope code - GenericAlias: implement gaiterobject (generic_alias_iterator), starred equality comparison, starred pickle via iterator reduce, split attr_exceptions/attr_blocked for correct __dir__, make_parameters/subs_parameters handle list/tuple args recursively, repr_arg indexed access for mutation safety - AST types: remove IMMUTABLETYPE (heap types, mutable) - pymodule macro: preserve existing __module__ getset descriptors
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/dataclasses.py dependencies:
dependent tests: (6 tests)
[x] lib: cpython/Lib/typing.py dependencies:
dependent tests: (12 tests)
[x] lib: cpython/Lib/reprlib.py dependencies:
dependent tests: (163 tests)
Legend:
|
Summary by CodeRabbit