-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Fix traceback, syntax errors, and exception handling #7015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughRestore and refine source-location tracking for emitted bytecode (especially subscripts/slices), add EOF/unclosed‑bracket and indentation error mapping, introduce frame ownership and guarded clearing, adjust traceback setter/excepthook behavior, refine exception contextualization and suggestion logic, and tweak several coroutine/awaitable flows. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant VM as VirtualMachine
participant Frame as ExecutingFrame
participant TB as Traceback
participant Hook as excepthook
rect rgba(220,240,255,0.5)
Caller->>VM: run code / raise exception
end
VM->>Frame: handle_exception(exc, idx, is_reraise, is_new_raise)
rect rgba(200,255,200,0.5)
Frame->>TB: decide contextualize (is_new_raise / prior tb)
TB-->>Frame: set __context__/__cause__/traceback as needed
end
Frame-->>VM: return/unwind result
VM->>Hook: call excepthook(normalized_exception)
Hook->>Hook: try traceback.print_exception (Python)
alt python_print_succeeds
Hook-->>VM: done
else python_print_fails
Hook-->>VM: fallback to vm.write_exception (Rust)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin traceback |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@crates/compiler/src/lib.rs`:
- Around line 79-87: The end offset for indentation errors is computed using
byte length (line.len()) but SourceLocation.character_offset expects a character
count; update the computation in the block that builds end_loc (variables: loc,
line_idx, line, line_end_col, end_loc) to use the character count (e.g.
line.chars().count()) when creating the 1-indexed column value passed into
OneIndexed::new so the SourceLocation.character_offset reflects character
positions on non-ASCII lines.
In `@crates/vm/src/builtins/frame.rs`:
- Around line 150-155: The clear method currently only checks vm.frames for an
executing frame; update clear(&self, vm: &VirtualMachine) to also inspect all
threads by calling vm.get_all_current_frames() (or the equivalent function that
returns frames from all threads) and ensure none of those frames
.is(self.as_object()) before proceeding; if any match, return
Err(vm.new_runtime_error("cannot clear an executing frame".to_owned())) to
prevent clearing a frame executing in another thread.
- Around line 157-173: The clear() currently leaves Frame.temporary_refs and the
operand stack (state.stack) intact, so add a helper on Frame named
clear_value_stack(&self) that locks self.state and calls state.stack.clear(),
and in clear() explicitly clear temporary_refs (e.g.,
self.temporary_refs.lock().clear() or iterate and drop entries) and call
self.clear_value_stack(); ensure you reference the existing fields/locks
(self.state, self.temporary_refs) and call the new helper from the existing
clear method to fully break reference cycles.
In `@crates/vm/src/suggestion.rs`:
- Around line 73-82: The ImportError handling uses the wrong attributes: swap
the two get_attr calls so mod_name = exc.as_object().get_attr("name", vm).ok()?
and name = exc.as_object().get_attr("name_from", vm).ok()? (so mod_name_str
comes from the module attribute used to look up sys_modules.get_item) and then
pass &name into calculate_suggestions; update the references around
mod_name_str, sys_modules, and vm.dir(Some(module)) accordingly to match the
corrected variables.
🧹 Nitpick comments (1)
crates/vm/src/suggestion.rs (1)
50-55: Pre-existing:offer_suggestionsuses.is()forattribute_errorbut.fast_issubclass()forimport_error.Line 51 uses exact type match (
.is()) forattribute_error, while line 73 uses subclass check (.fast_issubclass()) forimport_error. CPython's suggestion mechanism usesPyErr_GivenExceptionMatches(subclass-aware) for all branches. If a customAttributeErrorsubclass is raised, suggestions won't be offered with the current code. Not introduced by this PR, but worth noting for consistency.
| fn clear(&self, vm: &VirtualMachine) -> PyResult<()> { | ||
| // Check if the frame is currently executing | ||
| let is_executing = vm.frames.borrow().iter().any(|f| f.is(self.as_object())); | ||
| if is_executing { | ||
| return Err(vm.new_runtime_error("cannot clear an executing frame".to_owned())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# Check if file exists and view the relevant lines
cat -n crates/vm/src/builtins/frame.rs | sed -n '140,180p'Repository: RustPython/RustPython
Length of output: 1485
🏁 Script executed:
# Look for threading feature definitions
rg "feature.*threading" --type tomlRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Check for get_all_current_frames function
rg "get_all_current_frames" --type rsRepository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Check for thread module and related functions
rg "mod thread" --type rsRepository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for Frame struct and clear method usage/implementation
rg "impl.*Frame" crates/vm/src/builtins/ --type rs -A 10Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
# Search for threading feature
find . -name "Cargo.toml" -o -name "*.toml" | xargs grep -l "threading"Repository: RustPython/RustPython
Length of output: 179
🏁 Script executed:
# Search for get_all_current_frames function
grep -r "get_all_current_frames" --include="*.rs"Repository: RustPython/RustPython
Length of output: 456
🏁 Script executed:
# Search for thread module and related functions
grep -r "mod thread" --include="*.rs"Repository: RustPython/RustPython
Length of output: 241
🏁 Script executed:
# Check if there's a stdlib::thread module
find . -path "*/stdlib/thread*" -type fRepository: RustPython/RustPython
Length of output: 97
🏁 Script executed:
# Look at the Frame struct definition and understand its threading model
grep -r "pub struct Frame" --include="*.rs" -A 5Repository: RustPython/RustPython
Length of output: 393
🏁 Script executed:
# Search for other frame clearing or execution checks
grep -r "is_executing\|executing.*frame" --include="*.rs" -B 2 -A 2Repository: RustPython/RustPython
Length of output: 2965
🏁 Script executed:
# Get the implementation of get_all_current_frames
grep -A 15 "pub fn get_all_current_frames" crates/vm/src/stdlib/thread.rsRepository: RustPython/RustPython
Length of output: 774
🏁 Script executed:
# Check threading feature flags in Cargo.toml
grep -A 20 "\[features\]" crates/vm/Cargo.tomlRepository: RustPython/RustPython
Length of output: 729
🏁 Script executed:
# Check how get_all_current_frames is used in sys.rs
grep -B 5 -A 10 "get_all_current_frames" crates/vm/src/stdlib/sys.rsRepository: RustPython/RustPython
Length of output: 628
🏁 Script executed:
# Check if there's a threading feature gate already in place
grep -r "#\[cfg.*threading" --include="*.rs"Repository: RustPython/RustPython
Length of output: 5515
🏁 Script executed:
# Verify if the path in the suggested fix is correct (crate::stdlib::thread)
grep -r "crate::stdlib::thread" --include="*.rs" | head -10Repository: RustPython/RustPython
Length of output: 718
🏁 Script executed:
# Check VirtualMachine structure to understand vm.frames access pattern
grep -A 20 "pub struct VirtualMachine" crates/vm/src/vm/mod.rs | head -30Repository: RustPython/RustPython
Length of output: 996
🏁 Script executed:
# Check the context around the clear method - is it public/exposed as Python API?
grep -B 10 "fn clear" crates/vm/src/builtins/frame.rs | head -20Repository: RustPython/RustPython
Length of output: 323
🏁 Script executed:
# Check if there are tests or usage of Frame.clear
grep -r "\.clear()" --include="*.rs" | grep -i "frame\|Frame"Repository: RustPython/RustPython
Length of output: 288
🏁 Script executed:
# Check how Python threads work with frames in this codebase
grep -A 20 "pub fn push_frame" crates/vm/src/frame.rsRepository: RustPython/RustPython
Length of output: 47
🏁 Script executed:
# Look at how vm.frames is managed across threads
grep -B 5 -A 10 "thread_frames\|thread_frame" crates/vm/src/vm/mod.rsRepository: RustPython/RustPython
Length of output: 2363
Check for executing frames in other threads before clearing.
The execution guard only inspects vm.frames (current thread). In a threading build, a frame can still be actively executing elsewhere; clearing it mid-execution corrupts its locals. Add a check using get_all_current_frames() for all threads.
Suggested fix
- let is_executing = vm.frames.borrow().iter().any(|f| f.is(self.as_object()));
+ let mut is_executing = vm.frames.borrow().iter().any(|f| f.is(self.as_object()));
+ #[cfg(feature = "threading")]
+ {
+ is_executing |= crate::stdlib::thread::get_all_current_frames(vm)
+ .into_iter()
+ .any(|(_, frame)| frame.is(self.as_object()));
+ }🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/frame.rs` around lines 150 - 155, The clear method
currently only checks vm.frames for an executing frame; update clear(&self, vm:
&VirtualMachine) to also inspect all threads by calling
vm.get_all_current_frames() (or the equivalent function that returns frames from
all threads) and ensure none of those frames .is(self.as_object()) before
proceeding; if any match, return Err(vm.new_runtime_error("cannot clear an
executing frame".to_owned())) to prevent clearing a frame executing in another
thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 2225-2245: The non-circular-import branch builds msg without the
module name; update the else branch in import_from to include the originating
module (use the existing module_name variable) e.g. change the else branch's
format!("cannot import name '{name}'") to include "from '{module_name}'" (and
optionally append file/path if you can obtain it) so the vm.new_import_error
call (and the subsequent setting of "name_from" on err) reports the module name
like the circular-import branch; locate this logic around
is_module_initializing(...) and vm.new_import_error(...) in frame.rs.
In `@crates/vm/src/suggestion.rs`:
- Around line 73-82: The code is using the wrong variable for suggestion lookup:
replace the search needle passed to calculate_suggestions from name to the
missing-symbol value (name_from). Specifically, in the ImportError handling
block that retrieves name, mod_name and mod_name_str, call calculate_suggestions
with the missing attribute (mod_name / mod_name_str) instead of name so
suggestions are generated for the symbol that failed to import.
In `@crates/vm/src/vm/vm_new.rs`:
- Around line 384-395: Replace fragile string matching on OtherError(s) that
checks s.ends_with("was never closed") by adding an explicit variant or flag to
the ParseError produced by the compiler and matching on that; change the
compiler side (where format!("'{}' was never closed", bracket_char) is created)
to produce a structured indicator (e.g., ParseError::UnclosedBracket or
ParseError { kind: UnclosedBracket, bracket_char }) and then update the match in
vm_new.rs (the crate::compiler::CompileError::Parse arm that currently looks at
ruff_python_parser::ParseErrorType::OtherError(s)) to detect the new
UnclosedBracket variant and return self.ctx.exceptions.incomplete_input_error
when allow_incomplete is true (otherwise syntax_error), removing any reliance on
the error message string.
🧹 Nitpick comments (6)
crates/vm/src/builtins/frame.rs (1)
149-176: Missing generator/coroutine handling inframe.clear().CPython's
frame.clear()also handles the case where the frame belongs to a suspended generator or coroutine — it closes it by throwingGeneratorExitinto it, and if the frame is currently executing, it sets a flag to clear when the frame is next entered rather than raising immediately.This implementation correctly rejects executing frames and clears locals, but doesn't address generator-owned frames. If
clear()is called on a frame backing a suspended generator, the generator may continue to reference stale (cleared) state.Consider adding a TODO or documenting this limitation if it's intentionally deferred.
Suggested improvement
#[pymethod] fn clear(&self, vm: &VirtualMachine) -> PyResult<()> { // Check if the frame is currently executing let is_executing = vm.frames.borrow().iter().any(|f| f.is(self.as_object())); if is_executing { return Err(vm.new_runtime_error("cannot clear an executing frame".to_owned())); } + // TODO: Handle generator/coroutine frames - CPython closes suspended + // generators by throwing GeneratorExit before clearing the frame. + // See PyFrame_Clear in CPython's frameobject.c. + // Clear fastlocalscrates/vm/src/stdlib/sys.rs (1)
700-727:normalize_exceptionis called twice on fallback.When the Python
traceback.print_exceptionpath fails (e.g.,print_exc.call()raises), the code falls through and normalizes the exception again on line 715. Since normalization should be idempotent this isn't a correctness issue, but you could normalize once and reuse the result for both paths to avoid redundant work.Suggested refactor to normalize once
fn excepthook( exc_type: PyObjectRef, exc_val: PyObjectRef, exc_tb: PyObjectRef, vm: &VirtualMachine, ) -> PyResult<()> { - // Use Python traceback module for exception display. - // This enables features like keyword suggestions in SyntaxError. - if let Ok(tb_mod) = vm.import("traceback", 0) - && let Ok(print_exc) = tb_mod.get_attr("print_exception", vm) - { - let exc = vm.normalize_exception(exc_type.clone(), exc_val.clone(), exc_tb.clone()); - if let Ok(exc) = exc - && print_exc.call((exc.as_object().to_owned(),), vm).is_ok() - { - return Ok(()); - } - } - - // Fallback to Rust-level exception printing - let stderr = super::get_stderr(vm)?; - match vm.normalize_exception(exc_type.clone(), exc_val.clone(), exc_tb) { + let stderr = super::get_stderr(vm)?; + match vm.normalize_exception(exc_type.clone(), exc_val.clone(), exc_tb) { + Ok(exc) => { + // Try Python traceback module first for richer output + if let Ok(tb_mod) = vm.import("traceback", 0) + && let Ok(print_exc) = tb_mod.get_attr("print_exception", vm) + && print_exc.call((exc.as_object().to_owned(),), vm).is_ok() + { + return Ok(()); + } + // Fallback to Rust-level exception printing + vm.write_exception(&mut crate::py_io::PyWriter(stderr, vm), &exc) + }crates/vm/src/vm/vm_new.rs (1)
343-364: Tab/space detection scans across the entire source, not per-block.The heuristic flags a
TabErrorif any line uses a leading tab and any other line uses a leading space, even if they belong to completely unrelated blocks. CPython's detection is tighter (per tokenizer context). This is a reasonable approximation for error classification but may misfire on files that legitimately mix indentation styles in separate top-level blocks (unusual but not invalid in Python).Not a blocker, just worth documenting in a comment.
crates/codegen/src/compile.rs (1)
496-554: Deduplicate range restoration before emitting subscript ops.
set_source_range(subscript_range)is repeated in both branches; consider hoisting it after slice compilation and branching only for the emit step to keep this DRY.Suggested refactor
- if slice.should_use_slice_optimization() && !matches!(ctx, ast::ExprContext::Del) { + let use_slice_opt = + slice.should_use_slice_optimization() && !matches!(ctx, ast::ExprContext::Del); + if use_slice_opt { match slice { ast::Expr::Slice(s) => self.compile_slice_two_parts(s)?, _ => unreachable!( "should_use_slice_optimization should only return true for ast::Expr::Slice" ), }; - // Restore full subscript expression range before emitting - self.set_source_range(subscript_range); - match ctx { - ast::ExprContext::Load => { - emit!(self, Instruction::BinarySlice); - } - ast::ExprContext::Store => { - emit!(self, Instruction::StoreSlice); - } - _ => unreachable!(), - } } else { // VISIT(c, expr, e->v.Subscript.slice) self.compile_expression(slice)?; - // Restore full subscript expression range before emitting - self.set_source_range(subscript_range); - - // Emit appropriate instruction based on context - match ctx { - ast::ExprContext::Load => emit!( - self, - Instruction::BinaryOp { - op: BinaryOperator::Subscr - } - ), - ast::ExprContext::Store => emit!(self, Instruction::StoreSubscr), - ast::ExprContext::Del => emit!(self, Instruction::DeleteSubscr), - ast::ExprContext::Invalid => { - return Err(self.error(CodegenErrorType::SyntaxError( - "Invalid expression context".to_owned(), - ))); - } - } } + + // Restore full subscript expression range before emitting + self.set_source_range(subscript_range); + + // Emit appropriate instruction based on context + match (use_slice_opt, ctx) { + (true, ast::ExprContext::Load) => emit!(self, Instruction::BinarySlice), + (true, ast::ExprContext::Store) => emit!(self, Instruction::StoreSlice), + (true, _) => unreachable!(), + (false, ast::ExprContext::Load) => emit!( + self, + Instruction::BinaryOp { + op: BinaryOperator::Subscr + } + ), + (false, ast::ExprContext::Store) => emit!(self, Instruction::StoreSubscr), + (false, ast::ExprContext::Del) => emit!(self, Instruction::DeleteSubscr), + (false, ast::ExprContext::Invalid) => { + return Err(self.error(CodegenErrorType::SyntaxError( + "Invalid expression context".to_owned(), + ))); + } + }As per coding guidelines, "Avoid duplicate code in branches: extract differing values first, then call common logic once instead of duplicating logic in if/else branches."
crates/compiler/src/lib.rs (2)
147-233:find_unclosed_bracketdoes not account for raw strings.In raw strings (
r"...",r'...'), backslash is not an escape character. The function treats\as an escape inside all strings (lines 172–173), sor"path\to\"would be misinterpreted — the function would think the string continues past the closing quote. In practice, for EOF errors caused by unclosed brackets, this edge case is unlikely to trigger (the unclosed raw string itself would be the error), and the fallback path handles theNonereturn gracefully. Still worth noting as a known limitation.Also, closing brackets (line 223) pop without verifying the bracket type matches. This is acceptable here since mismatched brackets would produce a different parser error before EOF, but a comment documenting this assumption would help future maintainers.
156-156: Minor: avoid collecting all chars into aVec.
source.chars().collect::<Vec<_>>()allocates the entire character sequence upfront. Since this is only called during error handling (cold path), it's not performance-critical, but you could usesource.char_indices()to iterate lazily and get byte offsets for free, which would also eliminate the manualbyte_offsettracking.Sketch using char_indices
- let chars: Vec<char> = source.chars().collect(); - let mut i = 0; - let mut byte_offset = 0; - - while i < chars.len() { - let ch = chars[i]; - let ch_len = ch.len_utf8(); + let chars: Vec<(usize, char)> = source.char_indices().collect(); + let mut i = 0; + + while i < chars.len() { + let (byte_offset, ch) = chars[i];This still needs the
Vecfor lookahead (chars[i+1],chars[i+2]), but byte offsets come from the iterator rather than manual tracking, reducing a class of bugs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/vm/src/vm/vm_new.rs`:
- Around line 343-364: The TabError detection in the is_tab_error computation is
too broad because it uses global seen_tab/seen_space across all lines; change it
to inspect each line's leading indentation only by computing the prefix of bytes
up to the first non-space-or-tab and checking if that prefix contains both b'\t'
and b' ' (if so return true for that line), otherwise continue to the next line;
update the closure that sets is_tab_error (the code that iterates source.lines()
and for b in line.bytes().take_while(...)) to reset per-line flags (or directly
test the prefix for both characters) so only lines with mixed tabs and spaces in
their own indentation trigger self.ctx.exceptions.tab_error, otherwise fall back
to indentation_error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/compile.rs (1)
496-536:⚠️ Potential issue | 🟠 MajorStore/Delete subscript ops can inherit the wrong source range.
subscript_rangeis captured fromself.current_source_range, but callers likecompile_storetypically leave that set to the RHS expression (e.g., ina[b] = c, the range is forc). After this change,StoreSubscr/DeleteSubscrwill be attributed to the RHS or whole statement instead of the subscript, which is a regression in traceback precision.Prefer deriving the range from the subscript expression itself (or pass the subscript’s range into
compile_subscript) so emit locations stay correct across load/store/delete contexts.🔧 Possible fix (local derivation as a fallback)
- let subscript_range = self.current_source_range; + let subscript_range = TextRange::new(value.range().start(), slice.range().end());
🤖 Fix all issues with AI agents
In `@crates/vm/src/vm/vm_new.rs`:
- Around line 569-582: The `_metadata` tuple for syntax_error currently
hardcodes line/column to zero; instead use the extracted lineno and offset from
error.python_location() (the variables named lineno and offset) when building
the tuple passed to self.ctx.new_tuple so `_metadata` reflects the actual
position and enables proper keyword-typo suggestions; update the construction
that calls self.ctx.new_tuple (and the subsequent
syntax_error.as_object().set_attr("_metadata", ...)) to insert lineno and offset
(cloned or converted as needed) rather than self.ctx.new_int(0).
🧹 Nitpick comments (4)
crates/compiler/src/lib.rs (1)
220-224: Closing brackets pop without verifying they match the opening bracket.For code like
([), the)pops the[from the stack, potentially hiding the real unclosed bracket. Since this is a heuristic for error messages on already-broken code, this may be acceptable, but matching brackets would improve accuracy.Optional improvement
'(' | '[' | '{' => stack.push((ch, byte_offset)), ')' | ']' | '}' => { - stack.pop(); + let expected = match ch { + ')' => '(', + ']' => '[', + '}' => '{', + _ => unreachable!(), + }; + if stack.last().is_some_and(|(open, _)| *open == expected) { + stack.pop(); + } }crates/vm/src/builtins/traceback.rs (1)
65-74:__dir__hardcodes attributes and omits inherited ones.CPython's
traceback.__dir__()delegates toobject.__dir__()which also returns inherited attributes (e.g.,__class__,__doc__,__repr__, etc.). This hardcoded list returns only the four traceback-specific attributes. If that's intentional to match a specific CPython behavior, it's fine, but it diverges from the defaultobject.__dir__pattern shown inbuiltins/object.rs(which callsobj.dir(vm)to collect all attributes).Also a minor nit:
vec![...].into_iter().map(...)can be simplified.♻️ Optional simplification
- let attrs = vec!["tb_frame", "tb_next", "tb_lasti", "tb_lineno"]; - PyList::from( - attrs - .into_iter() - .map(|s| vm.ctx.new_str(s).into()) - .collect::<Vec<_>>(), - ) + PyList::from( + ["tb_frame", "tb_next", "tb_lasti", "tb_lineno"] + .iter() + .map(|&s| vm.ctx.new_str(s).into()) + .collect::<Vec<_>>(), + )crates/vm/src/vm/vm_new.rs (1)
377-443: Consider extracting the repeatedallow_incomplete→ error type pattern.The pattern
if allow_incomplete { incomplete_input_error } else { syntax_error }appears at least 4 times in this match block (lines 378–382, 390–394, 406–410, 435–442). Extracting a small helper or a local closure would reduce duplication. As per coding guidelines, "Avoid duplicate code in branches: extract differing values first, then call common logic once instead of duplicating logic in if/else branches."♻️ Example helper
let incomplete_or_syntax = |allow: bool| -> &'static Py<PyType> { if allow { self.ctx.exceptions.incomplete_input_error } else { self.ctx.exceptions.syntax_error } };Then each arm becomes simply
incomplete_or_syntax(allow_incomplete).crates/vm/src/frame.rs (1)
2243-2243: Silently discardingset_attrerror forname_from.
let _ = err.as_object().set_attr(...)suppresses any error. If this is intentional (best-effort metadata), a brief comment would clarify intent. Ifset_attron a freshly-created exception object can realistically fail, the error should be logged or propagated.
|
|
||
| // Set _metadata for keyword typo suggestions in traceback module. | ||
| // Format: (line_offset, col_offset, source_code) | ||
| if let Some(source) = source { | ||
| let metadata = self.ctx.new_tuple(vec![ | ||
| self.ctx.new_int(0).into(), | ||
| self.ctx.new_int(0).into(), | ||
| self.ctx.new_str(source).into(), | ||
| ]); | ||
| syntax_error | ||
| .as_object() | ||
| .set_attr("_metadata", metadata, self) | ||
| .unwrap(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for _metadata usage in Rust files
rg -n "_metadata" --type rs -C 3Repository: RustPython/RustPython
Length of output: 91
🏁 Script executed:
#!/bin/bash
# Search for _metadata usage without type filter
rg -n "_metadata" -C 3Repository: RustPython/RustPython
Length of output: 43526
🏁 Script executed:
#!/bin/bash
# Check the context around lines 569-582 in vm_new.rs to see what error data is available
head -n 600 crates/vm/src/vm/vm_new.rs | tail -n 100Repository: RustPython/RustPython
Length of output: 3991
Use actual position offsets from error in _metadata tuple instead of hardcoding to zero.
The code extracts (lineno, offset) from error.python_location() at lines 556–561 and uses these values for exception attributes, but then discards them when constructing the _metadata tuple at lines 573–577 by hardcoding both offsets to zero. This breaks the keyword typo suggestion feature in the traceback module for syntax errors not at the beginning of the source. Use the extracted lineno and offset variables instead:
let metadata = self.ctx.new_tuple(vec![
lineno.clone().into(),
offset.clone().into(),
self.ctx.new_str(source).into(),
]);🤖 Prompt for AI Agents
In `@crates/vm/src/vm/vm_new.rs` around lines 569 - 582, The `_metadata` tuple for
syntax_error currently hardcodes line/column to zero; instead use the extracted
lineno and offset from error.python_location() (the variables named lineno and
offset) when building the tuple passed to self.ctx.new_tuple so `_metadata`
reflects the actual position and enables proper keyword-typo suggestions; update
the construction that calls self.ctx.new_tuple (and the subsequent
syntax_error.as_object().set_attr("_metadata", ...)) to insert lineno and offset
(cloned or converted as needed) rather than self.ctx.new_int(0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/compiler/src/lib.rs`:
- Around line 77-88: The code computes the start location in loc but mistakenly
returns (error.error, end_loc, end_loc), discarding the start; update the return
in the branch that builds end_loc so it returns the original start loc and the
computed end_loc (i.e., return (error.error, loc, end_loc)) so the error range
spans from the computed start to end; look for the variables loc and end_loc in
the function in crates/compiler/src/lib.rs to make this change.
🧹 Nitpick comments (2)
crates/compiler/src/lib.rs (2)
220-224: Closing brackets pop without matching — can misattribute the unclosed bracket.
stack.pop()on a closing bracket doesn't verify it matches the top of the stack. For mismatched input like([), the)pops[instead of(, and the result points to(— which is an acceptable "best effort" in that case. However, for([)followed by more code, the stack is left in an inconsistent state that may returnNoneor the wrong bracket.Since this is only called for EOF errors (where true mismatches are rare), this is a minor accuracy gap. If you want to harden it later, skip the pop when brackets don't match.
Optional: match bracket types before popping
')' | ']' | '}' => { - stack.pop(); + let expected = match ch { + ')' => '(', + ']' => '[', + '}' => '{', + _ => unreachable!(), + }; + if stack.last().is_some_and(|&(top, _)| top == expected) { + stack.pop(); + } }
149-158: Consider usingchar_indices()to avoid the upfrontVec<char>allocation.
source.chars().collect::<Vec<char>>()duplicates the entire source ascharvalues. Usingsource.char_indices()provides(byte_offset, char)pairs without the allocation, and you can peek ahead withsource[byte_offset..]slicing when checking triple quotes.This is error-path-only code so not performance-critical, but it would be a cleaner idiomatic Rust pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/compiler/src/lib.rs`:
- Around line 152-237: find_unclosed_bracket fails to recognize Python string
prefixes (r, b, f, rb, br, fr, rf, etc.), so backslashes in raw strings are
incorrectly treated as escapes; update the string-start detection in
find_unclosed_bracket to scan immediately before the quote (using the chars
vector and byte offsets) for 1–3 ASCII prefix letters, compute a lowercase
prefix string, set a boolean like is_raw = prefix.contains('r'), and use is_raw
to disable escape handling (i.e., only set escape_next on '\' when not is_raw);
ensure this prefix check is applied for both single and triple-quote branches
(string_quote, triple_quote, in_string, escape_next) so raw strings are parsed
correctly.
In `@crates/vm/src/builtins/frame.rs`:
- Around line 157-161: The current check in frame clearing uses
self.generator.load().is_some() and rejects any generator-owned frame; change it
to only reject when the generator is still running (use Coro::running() / not
closed()) so finished generators can be cleared, and update the error string to
reflect the actual condition (e.g., "cannot clear a running generator frame" or
similar). Locate the code that returns Err(vm.new_runtime_error(...)) in
crates/vm/src/builtins/frame.rs (the block referencing self.generator.load())
and replace the existence check with a runtime-state check on the loaded Coro
(call running() / closed() appropriately), returning the corrected error message
only when the generator is executing/suspended, allowing clearing when
closed/finished.
In `@crates/vm/src/builtins/traceback.rs`:
- Around line 65-73: The __dir__ method currently returns only
traceback-specific names; change it to start from the standard/object dir and
append the four traceback attributes instead of replacing the full list: in the
traceback impl's __dir__ (the function named __dir__ with signature fn
__dir__(&self, vm: &VirtualMachine) -> PyList) call the base/object __dir__
(e.g. invoke object.__dir__(self) via the VM or call the default __dir__ through
the VM API to obtain a PyList), then extend that PyList with "tb_frame",
"tb_next", "tb_lasti", "tb_lineno" and return the combined list so inherited
attributes are preserved.
🧹 Nitpick comments (3)
crates/vm/src/suggestion.rs (2)
51-58: Consider using.ok()?instead of.unwrap()for resilience.Lines 55–56 use
.unwrap()to retrievenameandobjfrom the exception. Now that the guard usesfast_issubclass(matching subclasses too), a misbehaving subclass could theoretically lack these attributes, causing a panic. The new ImportError branch on lines 77–78 already uses the safer.ok()?pattern. Aligning these branches would make the function uniformly panic-free.Suggested diff
- let name = exc.as_object().get_attr("name", vm).unwrap(); - let obj = exc.as_object().get_attr("obj", vm).unwrap(); + let name = exc.as_object().get_attr("name", vm).ok()?; + let obj = exc.as_object().get_attr("obj", vm).ok()?;
59-60: Same.unwrap()concern applies here.Line 60 has the same
.unwrap()risk as the AttributeError branch above — a NameError subclass could theoretically lack thenameattribute.Suggested diff
- let name = exc.as_object().get_attr("name", vm).unwrap(); + let name = exc.as_object().get_attr("name", vm).ok()?;crates/vm/src/vm/vm_new.rs (1)
351-368: TabError detection now correctly uses per-line checks — addresses the over-broad detection concern.The logic properly resets per line via the
.any()closure and only examines leading whitespace bytes. One minor allocation nit: theVec<u8>collection could be avoided with a direct scan:Optional: avoid allocation
let is_tab_error = source.is_some_and(|source| { source.lines().any(|line| { - let indent: Vec<u8> = line - .bytes() - .take_while(|&b| b == b' ' || b == b'\t') - .collect(); - indent.contains(&b' ') && indent.contains(&b'\t') + let mut seen_tab = false; + let mut seen_space = false; + for b in line.bytes().take_while(|&b| b == b' ' || b == b'\t') { + seen_tab |= b == b'\t'; + seen_space |= b == b' '; + } + seen_tab && seen_space }) });
| #[pymethod] | ||
| fn __dir__(&self, vm: &VirtualMachine) -> PyList { | ||
| PyList::from( | ||
| ["tb_frame", "tb_next", "tb_lasti", "tb_lineno"] | ||
| .iter() | ||
| .map(|&s| vm.ctx.new_str(s).into()) | ||
| .collect::<Vec<_>>(), | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__dir__ returns only traceback-specific attributes, omitting inherited ones.
CPython's __dir__ for traceback objects returns both the traceback-specific attributes and those inherited from object (e.g., __class__, __doc__, __init__, __repr__, etc.). This override replaces the full listing with only 4 entries, so dir(tb) will appear incomplete versus CPython.
Consider calling the base __dir__ and extending the result, or using object.__dir__(self) as a starting point:
Sketch
#[pymethod]
- fn __dir__(&self, vm: &VirtualMachine) -> PyList {
- PyList::from(
- ["tb_frame", "tb_next", "tb_lasti", "tb_lineno"]
- .iter()
- .map(|&s| vm.ctx.new_str(s).into())
- .collect::<Vec<_>>(),
- )
+ fn __dir__(&self, vm: &VirtualMachine) -> PyResult<PyList> {
+ // Start with object.__dir__ and add traceback-specific attrs
+ let base = vm.call_method(self.as_object(), "__dir__", ())?;
+ // ... extend with tb-specific names if not present
}🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/traceback.rs` around lines 65 - 73, The __dir__ method
currently returns only traceback-specific names; change it to start from the
standard/object dir and append the four traceback attributes instead of
replacing the full list: in the traceback impl's __dir__ (the function named
__dir__ with signature fn __dir__(&self, vm: &VirtualMachine) -> PyList) call
the base/object __dir__ (e.g. invoke object.__dir__(self) via the VM or call the
default __dir__ through the VM API to obtain a PyList), then extend that PyList
with "tb_frame", "tb_next", "tb_lasti", "tb_lineno" and return the combined list
so inherited attributes are preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/suggestion.rs (1)
51-85:⚠️ Potential issue | 🟠 MajorChange
unwrap()took()?for missing exception attributes to avoid panics.When
fast_issubclassmatches exception subclasses, the expected attributes (name,obj) may not exist, causing.unwrap()to panic. Sinceoffer_suggestionsis best-effort, gracefully returnNonefor missing attributes instead.Lines 55–56 (AttributeError):
- Change
get_attr("name", vm).unwrap()toget_attr("name", vm).ok()?- Change
get_attr("obj", vm).unwrap()toget_attr("obj", vm).ok()?Line 60 (NameError):
- Change
get_attr("name", vm).unwrap()toget_attr("name", vm).ok()?This aligns with the ImportError section (lines 77–78), which already uses the
.ok()?pattern consistently.
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 503-535: The slice optimization currently sets use_slice_opt
without considering ast::ExprContext::Invalid, which can lead to the
unreachable!() panic; fix it by restricting the optimization to only Load/Store
contexts or by handling Invalid explicitly: when computing use_slice_opt (the
let use_slice_opt = slice.should_use_slice_optimization() && !matches!(ctx,
ast::ExprContext::Del);) add a check to ensure ctx is Load or Store (e.g., &&
matches!(ctx, ast::ExprContext::Load | ast::ExprContext::Store)), and/or add a
(true, ast::ExprContext::Invalid) arm in the subsequent match to return
Err(self.error(CodegenErrorType::SyntaxError(...))); this touches the
computation of use_slice_opt and the match that emits BinarySlice/StoreSlice or
returns errors, and references compile_slice_two_parts and compile_expression
where the optimized vs fallback paths are taken.
🧹 Nitpick comments (4)
crates/vm/src/frame.rs (1)
423-497: Run a clean build after instruction-semantics changes.These edits adjust instruction handling in the eval loop. Please run the clean-build cleanup command to avoid stale bytecode artifacts.
Clean build commands
rm -r target/debug/build/rustpython-* find . | grep -E "\.pyc$" | xargs rm -rAs per coding guidelines, "When modifying bytecode instructions, perform a clean build by running: rm -r target/debug/build/rustpython-* && find . | grep -E ".pyc$" | xargs rm -r".
crates/compiler/src/lib.rs (2)
204-228: String-prefix lookback is a nice improvement — one readability nit.The prefix detection correctly handles
r,rb,br,fr, etc. by looking back up to 2 characters.
1..=2.min(i)relies on method-call precedence binding tighter than..=. It works correctly, but1..=i.min(2)reads more naturally as "up tomin(i, 2)" without requiring the reader to parse precedence rules.Readability tweak
- for lookback in 1..=2.min(i) { + for lookback in 1..=i.min(2) {
337-476: Consider adding unit tests for the new error-handling paths.The existing tests only cover successful compilation. The new logic — EOF→unclosed-bracket rewriting, IndentationError end-location adjustment, and the
find_unclosed_bracketheuristic — would benefit from targeted tests, e.g.:
find_unclosed_bracket("foo(bar[baz")→Some(('[', …))find_unclosed_bracket("r\"[\\\"")→None(bracket inside raw string)find_unclosed_bracket("complete()")→None- A
compile(…)call with an unclosed bracket verifying the error message containswas never closed.crates/vm/src/vm/vm_new.rs (1)
354-376: Avoid per-lineVec<u8>allocation in the tab-detection closure.The
collect::<Vec<u8>>()on each line's indent prefix is unnecessary — you can track the two booleans inline without materialising the bytes.♻️ Suggested diff
let is_tab_error = source.is_some_and(|source| { let mut has_space_indent = false; let mut has_tab_indent = false; for line in source.lines() { - let indent: Vec<u8> = line - .bytes() - .take_while(|&b| b == b' ' || b == b'\t') - .collect(); - if indent.is_empty() { - continue; - } - if indent.contains(&b' ') && indent.contains(&b'\t') { + let mut line_has_space = false; + let mut line_has_tab = false; + for b in line.bytes() { + match b { + b' ' => line_has_space = true, + b'\t' => line_has_tab = true, + _ => break, + } + } + if line_has_space && line_has_tab { return true; } - if indent.contains(&b' ') { + if line_has_space { has_space_indent = true; } - if indent.contains(&b'\t') { + if line_has_tab { has_tab_indent = true; } } has_space_indent && has_tab_indent });
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/codegen/src/compile.rs`:
- Around line 503-515: The optimized slice path (when use_slice_opt is true) can
emit implicit LoadConst None from compile_slice_two_parts while
current_source_range still points at the preceding expression, so update the
slice source range before emitting any implicit None values: in the branch that
handles ast::Expr::Slice (used when use_slice_opt is true) ensure you call the
same range-setting logic used by the non-optimized path (the logic that
compile_expression(slice) triggers) for the slice's overall source range prior
to emitting LoadConst None; modify compile_slice_two_parts (or call a helper
from compile_slice_two_parts) to set current_source_range for the missing
lower/upper bound positions so debug ranges match the non-optimized path.
🧹 Nitpick comments (1)
crates/vm/src/vm/vm_new.rs (1)
354-376: Avoid per-lineVecallocation in theis_tab_errorcheck.The
Vec<u8>collection on each line is only used to call.contains(). You can achieve the same with direct iteration, avoiding the heap allocation.♻️ Suggested refactor
let is_tab_error = source.is_some_and(|source| { let mut has_space_indent = false; let mut has_tab_indent = false; for line in source.lines() { - let indent: Vec<u8> = line - .bytes() - .take_while(|&b| b == b' ' || b == b'\t') - .collect(); - if indent.is_empty() { - continue; - } - if indent.contains(&b' ') && indent.contains(&b'\t') { + let mut line_has_space = false; + let mut line_has_tab = false; + for b in line.bytes().take_while(|&b| b == b' ' || b == b'\t') { + line_has_space |= b == b' '; + line_has_tab |= b == b'\t'; + } + if !line_has_space && !line_has_tab { + continue; + } + if line_has_space && line_has_tab { return true; } - if indent.contains(&b' ') { + if line_has_space { has_space_indent = true; } - if indent.contains(&b'\t') { + if line_has_tab { has_tab_indent = true; } } has_space_indent && has_tab_indent });
|
|
||
| /// Find the last unclosed opening bracket in source code. | ||
| /// Returns the bracket character and its byte offset, or None if all brackets are balanced. | ||
| fn find_unclosed_bracket(source: &str) -> Option<(char, usize)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe ruff has this functionality somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_parser/src/parser/expression.rs seems to have a few related functions , but not sure about getting position
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/vm/src/builtins/frame.rs`:
- Around line 149-165: The clear method currently rejects any frame with a
generator and non-zero lasti, which incorrectly blocks finished (closed)
generators; update the check in clear (around self.generator.load() and
self.lasti()) to detect suspended generators only — i.e., if the generator
payload is a PyGenerator or PyCoroutine, treat it as suspended when lasti() != 0
AND the underlying coroutine's closed() is false (use
payload::<PyGenerator>().map(|g| !g.as_coro().closed() && self.lasti() != 0) and
similarly for PyCoroutine), and only return the "cannot clear a suspended frame"
error when that combined suspended condition is true so finished generators are
allowed to clear.
In `@crates/vm/src/suggestion.rs`:
- Around line 51-60: The attribute access for exception subclasses can panic
because exc.class().fast_issubclass(...) may match subclasses that lack
"name"/"obj"; replace the direct .unwrap() calls in the attribute_error branch
(the lines assigning name and obj via exc.as_object().get_attr("name", vm) and
get_attr("obj", vm)) with fallible handling (e.g., use .ok()? or similar to
return None on missing attributes) mirroring the ImportError branch's pattern so
the function returns early instead of panicking when attributes are absent.
🧹 Nitpick comments (3)
crates/compiler/src/lib.rs (3)
203-228: String prefix lookback can match trailing letters of identifiers.The lookback loop (lines 207-216) scans up to 2 characters before the quote for
r/b/f/uprefixes, but it doesn't verify that those characters aren't part of a larger identifier. For example, inmyvar"text", the lookback seesrfrommyvarand incorrectly setsis_raw_string = true, causing backslash escapes inside the "string" to be ignored.This is a best-effort heuristic on an error path, so impact is limited to occasionally mis-reporting bracket positions when identifiers ending in
rimmediately precede a syntax-error string. Worth noting for future improvement.Sketch: check that the prefix chars are not preceded by an alphanumeric/underscore
// Check up to 2 characters before the quote for string prefix is_raw_string = false; - for look_back in 1..=2.min(i) { + let prefix_start = i.saturating_sub(2.min(i)); + // Only treat as prefix if not part of a larger identifier + let is_prefix_boundary = prefix_start == 0 + || !chars[prefix_start - 1].1.is_alphanumeric() + && chars[prefix_start - 1].1 != '_'; + if is_prefix_boundary { + for look_back in 1..=2.min(i) { let prev = chars[i - look_back].1; if matches!(prev, 'r' | 'R') { is_raw_string = true; break; } if !matches!(prev, 'b' | 'B' | 'f' | 'F' | 'u' | 'U') { break; } + } }
230-247: Mismatched closing brackets are silently ignored — consider popping anyway.When a closing bracket doesn't match the top of the stack (line 239), it's silently skipped. In source like
[(]), the)doesn't match[and is ignored, leaving both[and(on the stack. The function then returns((the last entry), but[is the actual unclosed bracket from the user's perspective.This is a known trade-off in simple bracket-matching heuristics and the current behavior is reasonable for error reporting. Just noting it for awareness.
154-162: Collecting allchar_indicesinto aVecallocates on every call.
source.char_indices().collect()materializes the entire source into a vector. This runs only on the error path so it's not performance-critical, but for very large files it's unnecessary memory. The lookback for prefix detection (up to 2 chars) and lookahead for triple quotes (up to 2 chars) could be handled with a small sliding window or by storing just the last few indices, avoiding the full allocation.Low priority given it's error-path only.
| #[pymethod] | ||
| fn clear(&self, vm: &VirtualMachine) -> PyResult<()> { | ||
| // Check if the frame is currently executing in any thread. | ||
| // The is_executing flag is set/cleared atomically by with_frame(). | ||
| if self | ||
| .is_executing | ||
| .load(core::sync::atomic::Ordering::Acquire) | ||
| { | ||
| return Err(vm.new_runtime_error("cannot clear an executing frame".to_owned())); | ||
| } | ||
|
|
||
| // If this frame is owned by a generator/coroutine and has been | ||
| // started (lasti > 0), it is suspended and cannot be cleared. | ||
| // Frames not yet started (lasti == 0, FRAME_CREATED state) can be cleared. | ||
| if self.generator.load().is_some() && self.lasti() != 0 { | ||
| return Err(vm.new_runtime_error("cannot clear a suspended frame".to_owned())); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check how Coro exposes closed/running state and how generator types are structured
echo "=== Coro struct and closed() method ==="
rg -n "fn closed|fn running" crates/vm/src/coroutine.rs -B2 -A5
echo -e "\n=== PyGenerator struct ==="
rg -n "pub struct PyGenerator|inner.*Coro" crates/vm/src/builtins/generator.rs -B2 -A5
echo -e "\n=== PyCoroutine struct ==="
rg -n "pub struct PyCoroutine|inner.*Coro" crates/vm/src/builtins/coroutine.rs -B2 -A5
echo -e "\n=== How generator.load() is used elsewhere ==="
rg -n "generator\.load\(\)" crates/vm/src/ -B2 -A3Repository: RustPython/RustPython
Length of output: 2263
🏁 Script executed:
#!/bin/bash
# Check how to access the inner Coro from PyGenerator/PyCoroutine
echo "=== PyGenerator inner access ==="
rg -n "self\.inner\." crates/vm/src/builtins/generator.rs | head -20
echo -e "\n=== PyCoroutine inner access ==="
rg -n "self\.inner\." crates/vm/src/builtins/coroutine.rs | head -20
echo -e "\n=== Getters for Coro/frame ==="
rg -n "fn frame|fn closed|fn running" crates/vm/src/builtins/generator.rs -A2
echo -e "\n=== Type of generator.load() result ==="
rg -n "generator.*load\(\)" crates/vm/src/builtins/frame.rs -B10 -A2 | head -30
echo -e "\n=== Check PyRef accessors and payload method ==="
rg -n "impl.*PyGenerator|impl.*PyCoroutine" crates/vm/src/builtins/generator.rs crates/vm/src/builtins/coroutine.rs -A5Repository: RustPython/RustPython
Length of output: 9531
🏁 Script executed:
#!/bin/bash
# Find the generator field definition in Frame struct
echo "=== Frame struct definition ==="
rg -n "pub struct Frame|generator:" crates/vm/src/builtins/frame.rs -A20 | head -40
echo -e "\n=== Check if generator can be PyGenerator or PyCoroutine ==="
rg -n "PyGenerator|PyCoroutine" crates/vm/src/builtins/frame.rs | head -20
echo -e "\n=== Look for how generator is set/used in code ==="
rg -n "self\.generator.*=" crates/vm/src/builtins/frame.rs -B2 -A2
echo -e "\n=== Check generator field type more explicitly ==="
sed -n '1,100p' crates/vm/src/builtins/frame.rs | grep -A50 "pub struct Frame"Repository: RustPython/RustPython
Length of output: 604
🏁 Script executed:
#!/bin/bash
# Read the frame.rs file to find the struct definition
echo "=== Reading frame.rs struct definition ==="
head -50 crates/vm/src/builtins/frame.rs
echo -e "\n=== Search for 'generator' field ==="
rg -n "generator" crates/vm/src/builtins/frame.rs -B2 -A2 | head -50Repository: RustPython/RustPython
Length of output: 1919
🏁 Script executed:
#!/bin/bash
# Find the Frame struct definition in the frame module
echo "=== Find Frame struct in frame.rs or similar ==="
fd frame.rs crates/vm/src/ --type f
echo -e "\n=== Look for Frame struct definition ==="
rg -n "pub struct Frame" crates/vm/src/
echo -e "\n=== Check the generator field definition ==="
rg -n "generator.*:" crates/vm/src/frame.rs -B2 -A2
echo -e "\n=== Show full Frame struct ==="
rg -n "pub struct Frame\|pub.*generator" crates/vm/src/frame.rs -A100 | head -80Repository: RustPython/RustPython
Length of output: 1733
🏁 Script executed:
#!/bin/bash
# Find PyAtomicBorrow definition and understand what load() returns
echo "=== PyAtomicBorrow definition ==="
rg -n "pub struct PyAtomicBorrow|impl PyAtomicBorrow" crates/vm/src/ -A15
echo -e "\n=== PyAtomicBorrow::load() method ==="
rg -n "fn load" crates/vm/src/ -A5 | grep -A5 "PyAtomicBorrow" | head -30
echo -e "\n=== Search for PyAtomicBorrow definition ==="
rg -n "PyAtomicBorrow" crates/vm/src/ -B2 -A10 | head -60Repository: RustPython/RustPython
Length of output: 4979
🏁 Script executed:
#!/bin/bash
# Find the complete load() method and understand how to work with &PyObject
echo "=== Complete PyAtomicBorrow::load() method ==="
sed -n '490,510p' crates/vm/src/object/ext.rs
echo -e "\n=== Search for payload() method or similar downcasting ==="
rg -n "\.payload<|fn payload" crates/vm/src/ | head -20
echo -e "\n=== Look for how to downcast PyObject to PyGenerator ==="
rg -n "PyGenerator|PyCoroutine" crates/vm/src/builtins/frame.rs -B3 -A3
echo -e "\n=== Check if there's a try_into_ref or similar pattern ==="
rg -n "try_into_ref|payload" crates/vm/src/object/ -A3 | head -40Repository: RustPython/RustPython
Length of output: 3507
🏁 Script executed:
#!/bin/bash
# Let's verify the payload() method signature and understand the correct syntax
echo "=== PyObject::payload() method signature ==="
sed -n '892,920p' crates/vm/src/object/core.rs
echo -e "\n=== Check how payload() is used elsewhere in the codebase ==="
rg -n "\.payload::<" crates/vm/src/ -B2 -A2 | head -50
echo -e "\n=== Verify CPython behavior with finished generators ==="
echo "Looking for any tests or comments about generator cleanup..."
rg -n "finished|exhausted|generator.*clear" crates/vm/src/ -i | head -20Repository: RustPython/RustPython
Length of output: 2903
🏁 Script executed:
#!/bin/bash
# Search for actual usage patterns of payload() method
echo "=== How payload() is actually called ==="
rg -n "payload\(" crates/vm/src/ -A1 | head -60
echo -e "\n=== Check if payload is used with turbofish in the codebase ==="
rg -n "payload<|payload::" crates/vm/src/ | head -30
echo -e "\n=== Now check the actual clear() method implementation context ==="
sed -n '149,175p' crates/vm/src/builtins/frame.rs
echo -e "\n=== Check for any existing generator type checks in the codebase ==="
rg -n "PyGenerator|PyCoroutine" crates/vm/src/builtins/frame.rs -B3 -A3Repository: RustPython/RustPython
Length of output: 2169
🏁 Script executed:
#!/bin/bash
# Check if PyGenerator and PyCoroutine are imported or accessible in frame.rs
echo "=== Check imports at top of frame.rs ==="
head -30 crates/vm/src/builtins/frame.rs
echo -e "\n=== Check if there's a super module or crate structure for builtins ==="
rg -n "use.*PyGenerator|use.*PyCoroutine" crates/vm/src/builtins/frame.rs
echo -e "\n=== Look at the module structure ==="
ls -la crates/vm/src/builtins/ | grep -E "\.rs$"
echo -e "\n=== Check if super::PyGenerator is valid in this context ==="
rg -n "mod generator|mod coroutine" crates/vm/src/builtins/
echo -e "\n=== Verify actual behavior: check if lasti is indeed non-zero for finished generators ==="
rg -n "lasti.*=" crates/vm/src/ -B2 -A2 | grep -A5 "finish\|close\|exhaust" | head -30Repository: RustPython/RustPython
Length of output: 4588
🏁 Script executed:
#!/bin/bash
# Verify the actual behavior of generators and how lasti changes
echo "=== How lasti changes during generator execution ==="
rg -n "lasti\(\)" crates/vm/src/coroutine.rs -B3 -A3
echo -e "\n=== Check when a generator finishes and what happens to lasti ==="
rg -n "closed\(\)|running\(\)" crates/vm/src/coroutine.rs -B5 -A5
echo -e "\n=== Now let's understand: when generator closes, what state is lasti in? ==="
sed -n '175,225p' crates/vm/src/coroutine.rs
echo -e "\n=== Check if payload syntax is actually valid in frame.rs context ==="
rg -n "use crate::" crates/vm/src/builtins/frame.rsRepository: RustPython/RustPython
Length of output: 2356
🏁 Script executed:
#!/bin/bash
# Let's verify the exact syntax and understand the proposed fix better
echo "=== Verify that payload::<PyGenerator>() returns reference to PyGenerator ==="
rg -n "pub fn payload" crates/vm/src/object/core.rs -A3
echo -e "\n=== Verify as_coro() is the correct accessor ==="
rg -n "pub.*fn as_coro" crates/vm/src/builtins/generator.rs crates/vm/src/builtins/coroutine.rs
echo -e "\n=== Check if we can actually distinguish finished from suspended ==="
echo "The key question: does Coro::closed() reflect finished state correctly?"
sed -n '175,215p' crates/vm/src/coroutine.rs | grep -A30 "pub fn close"
echo -e "\n=== Final verification: can a finished generator have closed() = true? ==="
sed -n '195,210p' crates/vm/src/coroutine.rsRepository: RustPython/RustPython
Length of output: 2903
🏁 Script executed:
#!/bin/bash
# Let's trace through how a generator actually finishes
echo "=== How send() works and when StopIteration happens ==="
sed -n '100,140p' crates/vm/src/coroutine.rs
echo -e "\n=== Check if PyGenerator and PyCoroutine are in payload imports ==="
rg -n "PyPayload.*for.*PyGenerator|impl PyPayload for" crates/vm/src/builtins/generator.rs crates/vm/src/builtins/coroutine.rs
echo -e "\n=== Test the actual proposed syntax chain ==="
echo "Let's verify: generator.load() returns Option<&PyObject>"
echo "Can we call .payload::<PyGenerator>() on it?"
echo "Then call .as_coro() on the result?"
rg -n "\.payload.*\.as_coro\|and_then.*payload" crates/vm/src/ | head -10Repository: RustPython/RustPython
Length of output: 2124
🏁 Script executed:
#!/bin/bash
# Check if frame.rs can import PyGenerator and PyCoroutine at all
echo "=== Current imports in frame.rs ==="
cat crates/vm/src/builtins/frame.rs | head -20
echo -e "\n=== Check the parent module structure ==="
rg -n "pub use\|pub mod\|pub crate" crates/vm/src/builtins/mod.rs | head -40
echo -e "\n=== Verify the proposed fix would compile - test if payload syntax works ==="
echo "Checking if generator.load().and_then(|obj| obj.payload::<PyGenerator>()).map(...) pattern is used"
rg -n "load().*payload\|and_then.*payload" crates/vm/src/ -B2 -A2
echo -e "\n=== Final check: what imports would be needed ==="
grep -n "use.*generator\|use.*coroutine" crates/vm/src/builtins/frame.rsRepository: RustPython/RustPython
Length of output: 819
🏁 Script executed:
#!/bin/bash
# Let's verify the exact issue and solution one more time
echo "=== Clarify: when a generator finishes naturally (exhausted), what are its states? ==="
rg -n "maybe_close|StopIteration" crates/vm/src/coroutine.rs -B3 -A3
echo -e "\n=== Verify: does a finished (exhausted) generator have closed=true? ==="
echo "From the code we saw: when generator finishes via StopIteration, closed is NOT automatically set"
echo "Let's confirm when closed gets set..."
rg -n "closed\.store\(true" crates/vm/src/coroutine.rs
echo -e "\n=== Now the critical insight: Test case scenario ==="
echo "1. Start a generator (lasti becomes > 0)"
echo "2. Run it to exhaustion (returns StopIteration)"
echo "3. Generator object still exists"
echo "4. What is the state of: closed, lasti, and generator.load()?"
echo "Let's check if closed() is set when generator exhausts..."
sed -n '100,170p' crates/vm/src/coroutine.rs | grep -B10 -A10 "maybe_close\|StopIteration"Repository: RustPython/RustPython
Length of output: 3961
🏁 Script executed:
#!/bin/bash
# Final verification: confirm that closed() gets set when generator exhausts
echo "=== Trace: when a generator exhausts naturally, closed is set ==="
echo "From code: maybe_close() sets closed=true on ExecutionResult::Return()"
echo "When generator finishes with StopIteration, it returns Return, triggering maybe_close()"
echo ""
echo "States:"
echo "1. Not started: lasti==0, generator exists, closed=false"
echo "2. Suspended: lasti>0, generator exists, closed=false, running=false"
echo "3. Finished: lasti>0, generator exists, closed=true, running=false"
echo "4. Not owned by generator: generator.load()==None"
echo ""
echo "Current check only looks at: generator.is_some() && lasti != 0"
echo "This rejects BOTH suspended (correct) AND finished (incorrect)"
echo ""
echo "Fix needs: generator.is_some() && lasti != 0 && !closed()"
echo ""
echo "=== Verify payload syntax is valid and as_coro accessor exists ==="
rg -n "payload<" crates/vm/src/ -A2 | grep -A1 "as_coro\|inner" | head -20Repository: RustPython/RustPython
Length of output: 795
Generator frames that have finished (exhausted) are incorrectly rejected — they should be allowed to clear.
When a generator finishes by exhaustion, its closed() state is set to true while lasti remains non-zero. The current check only verifies generator.is_some() && self.lasti() != 0, which rejects both suspended and finished generators identically. You need to add a check for closed() to allow finished generators through.
The fix should check if the generator is suspended (lasti > 0 and not closed), not just if it has a non-zero lasti. This matches CPython's behavior: executing frames are rejected, suspended frames are rejected, but finished generator frames are allowed.
if let Some(gen_ref) = self.generator.load() {
let is_suspended = gen_ref.payload::<PyGenerator>()
.map(|g| !g.as_coro().closed() && self.lasti() != 0)
.or_else(|| gen_ref.payload::<PyCoroutine>()
.map(|c| !c.as_coro().closed() && self.lasti() != 0))
.unwrap_or(false);
if is_suspended {
return Err(vm.new_runtime_error("cannot clear a suspended frame".to_owned()));
}
}🤖 Prompt for AI Agents
In `@crates/vm/src/builtins/frame.rs` around lines 149 - 165, The clear method
currently rejects any frame with a generator and non-zero lasti, which
incorrectly blocks finished (closed) generators; update the check in clear
(around self.generator.load() and self.lasti()) to detect suspended generators
only — i.e., if the generator payload is a PyGenerator or PyCoroutine, treat it
as suspended when lasti() != 0 AND the underlying coroutine's closed() is false
(use payload::<PyGenerator>().map(|g| !g.as_coro().closed() && self.lasti() !=
0) and similarly for PyCoroutine), and only return the "cannot clear a suspended
frame" error when that combined suspended condition is true so finished
generators are allowed to clear.
- Improve unclosed bracket detection with "'(' was never closed" message
- Fix IndentationError location to point to end of line
- Implement frame.clear() with proper checks for executing/suspended frames
- Fix exception context chaining for propagated exceptions
- Add traceback.__dir__() and prevent tb_next deletion
- Fix subscript operation source range restoration in compiler
- Change "duplicate parameter" to "duplicate argument" error message
- Refactor duplicate code in asyncgenerator.rs and frame.rs
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/codeop.py dependencies:
dependent tests: (33 tests)
[ ] lib: cpython/Lib/asyncio dependencies:
dependent tests: (7 tests)
[ ] test: cpython/Lib/test/test_cmd_line_script.py (TODO: 15) dependencies: dependent tests: (no tests depend on cmd_line_script) [x] lib: cpython/Lib/code.py dependencies:
dependent tests: (1 tests) [x] lib: cpython/Lib/contextlib.py dependencies:
dependent tests: (63 tests)
[x] test: cpython/Lib/test/test_exceptions.py (TODO: 25) dependencies: dependent tests: (no tests depend on exception) [x] lib: cpython/Lib/future.py dependencies:
dependent tests: (24 tests)
[ ] test: cpython/Lib/test/test_grammar.py (TODO: 18) dependencies: dependent tests: (no tests depend on grammar) [ ] lib: cpython/Lib/inspect.py dependencies:
dependent tests: (41 tests)
[ ] test: cpython/Lib/test/test_named_expressions.py (TODO: 12) dependencies: dependent tests: (no tests depend on named_expressions) [ ] test: cpython/Lib/test/test_syntax.py (TODO: 26) dependencies: dependent tests: (no tests depend on syntax) [ ] test: cpython/Lib/test/test_sys.py (TODO: 15) dependencies: dependent tests: (208 tests)
[x] lib: cpython/Lib/traceback.py dependencies:
dependent tests: (32 tests)
[ ] test: cpython/Lib/test/test_unpack.py dependencies: dependent tests: (no tests depend on unpack) Legend:
|
Summary by CodeRabbit
New Features
Bug Fixes / Improvements