Skip to content

Conversation

@ShaharNaveh
Copy link
Collaborator

@ShaharNaveh ShaharNaveh commented Nov 29, 2025

We are currently only checking if the argument count is 3, meaning that even if we pass it -1, 0 or 999 it will act as if 2 was passed into it.

This change ensures that the oparg for BuildSlice can be either 2 or 3

Summary by CodeRabbit

  • Refactor
    • Slice construction now encodes whether a slice has two or three arguments (instead of a boolean "step" flag), unifying compile-time and runtime handling.
    • Bytecode and execution paths updated to use this explicit argument-count form; behavior is preserved while improving consistency and clarity.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 29, 2025

Walkthrough

Replace the boolean step BuildSlice argument with a BuildSliceArgCount enum (Two/Three) across bytecode, codegen, and VM, and update compile_slice/subscript emission and VM execution to use argc-based slice construction.

Changes

Cohort / File(s) Change Summary
Bytecode instruction definition
crates/compiler-core/src/bytecode.rs
Added public BuildSliceArgCount enum (Two = 2, Three = 3), implemented OpArgType and argc() helper; replaced Instruction::BuildSlice field step: Arg<bool> with argc: Arg<BuildSliceArgCount>; updated stack-effect, display/marshal, and op-arg handling to use argc.
Compiler codegen
crates/codegen/src/compile.rs
Imported BuildSliceArgCount; changed fn compile_slice(...) return type to CompileResult<BuildSliceArgCount>; updated slice and subscript emission to compute and emit BuildSlice { argc: BuildSliceArgCount::Two/Three } instead of using a boolean step.
VM execution / Frame
crates/vm/src/frame.rs
execute_build_slice now accepts argc: BuildSliceArgCount; call sites pass argc.get(arg); VM derives whether to pop a step value from argc (Two → no step popped, Three → pop step) and adjusts pops accordingly when constructing slices.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Compiler as Compiler (codegen)
participant Bytecode as Bytecode (instruction stream)
participant VM as VM (frame executor)

Compiler->>Bytecode: emit BuildSlice { argc: Two/Three }
Note over Bytecode: argc encoded as BuildSliceArgCount
VM->>Bytecode: fetch Instruction::BuildSlice(argc)
VM->>VM: argc.get(arg) → determine pop count
alt argc == Three
VM->>VM: pop stop, pop step, pop start
else argc == Two
VM->>VM: pop stop, pop start
end
VM->>VM: construct slice object and push result

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Verify stack-effect calculation in crates/compiler-core/src/bytecode.rs uses argc.get(arg).argc().get() correctly for both variants.
  • Check all callers of compile_slice in crates/codegen/src/compile.rs to ensure BuildSliceArgCount values propagate and emission matches CPython ordering.
  • Confirm execute_build_slice in crates/vm/src/frame.rs pops values in the correct order and handles optional step only when Three.

Possibly related PRs

  • compile_subscript #6008 — modifies compile-side slice/subscript emission; strongly related to changing compile logic and BuildSlice emission.

Poem

🐰 Two or three I softly count,
I hop through code and bytes amount.
No boolean step to slow my pace,
I tuck argc in the slice's place.
A tiny leap — a cleaner trace.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and concisely describes the main change: restricting BuildSlice oparg values to 2 or 3, which aligns with the primary objective of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

451-477: Two-element slice fast path uses BuildSliceArgCount correctly, but return type could be tightened

The is_two_element_slice guard plus compile_slice guarantees n is always 2, so BuildSliceArgCount::from_op_arg(n).unwrap() is safe. If you want to make this more self-documenting and future-proof, you could instead:

  • Change compile_slice to return BuildSliceArgCount (and maybe not expose raw u32 here), or
  • Inline BuildSliceArgCount::Two at this callsite and treat compile_slice as “emit two bounds, no step” for this fast path.

Not blocking, but would reduce reliance on the implicit 2/3 contract of compile_slice.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6782fa2 and 29ee656.

📒 Files selected for processing (3)
  • crates/codegen/src/compile.rs (3 hunks)
  • crates/compiler-core/src/bytecode.rs (4 hunks)
  • crates/vm/src/frame.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/frame.rs
  • crates/compiler-core/src/bytecode.rs
  • crates/codegen/src/compile.rs
🧠 Learnings (1)
📚 Learning: 2025-11-29T12:17:28.583Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.583Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/codegen/src/compile.rs
🧬 Code graph analysis (1)
crates/codegen/src/compile.rs (1)
crates/compiler-core/src/bytecode.rs (7)
  • from_op_arg (341-341)
  • from_op_arg (348-350)
  • from_op_arg (360-362)
  • from_op_arg (463-465)
  • from_op_arg (481-489)
  • from_op_arg (914-916)
  • from_op_arg (1177-1180)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run tests under miri
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Check Rust code with clippy
🔇 Additional comments (5)
crates/vm/src/frame.rs (2)

837-839: LGTM! Type-safe instruction dispatch.

The change from step: bool to argc: BuildSliceArgCount strengthens type safety by ensuring only valid argument counts (2 or 3) can be dispatched to the slice builder. The call site correctly passes argc.get(arg) to the execution function.


1780-1800: LGTM! Excellent type safety improvement.

The refactoring from step: bool to argc: BuildSliceArgCount makes the function signature more explicit and type-safe:

  • The exhaustive match on the enum clearly documents that only 2 or 3 arguments are valid
  • The enum prevents invalid values (e.g., -1, 0, 999) from being misinterpreted
  • Logic is semantically equivalent to the previous implementation
  • Stack operations remain correct

This change successfully enforces the PR objective to restrict BuildSlice oparg to exactly 2 or 3 arguments.

crates/codegen/src/compile.rs (1)

4585-4604: Expr::Slice emission now encodes argc explicitly and matches stack effect

The new argc computation based on step presence and emit!(..., Instruction::BuildSlice { argc }) keeps the stack behavior correct (2 or 3 inputs → 1 slice object) and aligns with BuildSliceArgCount’s restricted domain. This is exactly what the new bytecode representation expects.

crates/compiler-core/src/bytecode.rs (2)

1153-1167: BuildSliceArgCount enum correctly constrains allowed BUILD_SLICE arg counts

Defining BuildSliceArgCount with repr(u8) and values { Two = 2, Three = 3 } via op_arg_enum! ensures that only opargs 2 or 3 round-trip through OpArgType and Arg<T>. This is a clean way to encode the CPython-style 2/3 argument semantics for BUILD_SLICE and directly supports the PR’s goal of rejecting other arg counts.


762-764: Instruction::BuildSlice wiring (encoding, stack effect, display) is consistent; consider compatibility expectations

Switching BuildSlice to argc: Arg<BuildSliceArgCount> and computing the stack effect as 1 - argc keeps the net stack change correct for both 2- and 3-argument cases, and the disassembly now prints the enum (Two/Three), which remains informative.

One behavioral change is that invalid encoded arg values (e.g., legacy 0/1 encodings or corrupt bytecode) will now cause Arg::getBuildSliceArgCount::from_op_arg to return None and panic on unwrap, whereas previously any non-zero oparg was treated as “has step”. This matches how other op_arg_enum!-based instructions already behave, but it does tighten the contract for BUILD_SLICE.

If you rely on loading older or third-party bytecode streams, it’s worth double-checking that they already encode BUILD_SLICE with argument counts 2 or 3, or that such bytecode is rejected before execution.

Also applies to: 1565-1569, 1756-1756

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

396-427: compile_slice’s BuildSliceArgCount return type aligns with BuildSlice constraints

The function now compiles lower/upper bounds, conditionally compiles step, and returns BuildSliceArgCount::{Two,Three}, so emitted argc can only be 2 or 3 and always matches the number of pushed values. Currently this helper is only used from the two‑element slice fast path (so the Three arm is effectively unused but harmless). If you want to document that invariant, a debug_assert!(s.step.is_none()) here would make it explicit, but it’s optional.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29ee656 and 759195a.

📒 Files selected for processing (1)
  • crates/codegen/src/compile.rs (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/codegen/src/compile.rs
🧠 Learnings (1)
📚 Learning: 2025-11-29T12:17:28.583Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.583Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/codegen/src/compile.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (3)
crates/codegen/src/compile.rs (3)

35-40: Importing BuildSliceArgCount is consistent and necessary

The added BuildSliceArgCount import matches its later usage in Instruction::BuildSlice and keeps the bytecode-related imports grouped coherently.


451-466: Two‑element subscript slice fast path correctly wires argc into BuildSlice

Deriving argc via compile_slice and feeding it into Instruction::BuildSlice { argc } keeps the stack shape as [value, slice] and guarantees the oparg is constrained to the enum (effectively Two here, since is_two_element_slice enforces step.is_none()). The separation between Load/Store (using the fast path) and Del (falling back to generic Subscript/DeleteSubscript) remains intact.


4576-4596: Expr::Slice emission now uses BuildSliceArgCount with correct stack arity

The compile_bound helper ensures lower/upper each contribute exactly one value (or None), the optional step adds one more when present, and argc is then chosen as Two vs Three to match. This guarantees Instruction::BuildSlice { argc } never sees an invalid argument count and satisfies the “only 2 or 3” oparg requirement.

@ShaharNaveh ShaharNaveh changed the title Force BuildSlice oparg to be either 2 or 3 Ensure BuildSlice oparg to be either 2 or 3 Nov 29, 2025
Comment on lines 1158 to 1165
/// ```py
/// x[5:10]
/// ```
Two = 2,
/// ```py
/// x[5:10:2]
/// ```
Three = 3,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// ```py
/// x[5:10]
/// ```
Two = 2,
/// ```py
/// x[5:10:2]
/// ```
Three = 3,
/// ```py
/// x[5:10]
/// ```
Two,
/// ```py
/// x[5:10:2]
/// ```
Three,

Removing value not to block optimization. Adding a method will be enough

impl BuildSliceArgCount {
    const fn argc(&self) -> usize {
        ....
    }
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, but then how this be validated at stack_effect?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not by using as cast, but something like step.get(arg).argc() which return 2 or 3.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)

1153-1193: BuildSliceArgCount + OpArgType correctly restricts op_arg to {2, 3}

The enum plus OpArgType implementation ensures only 2 and 3 round‑trip: from_op_arg rejects other values, and to_op_arg always emits 2 or 3 via argc(), so invalid bytecode can’t be represented as a valid Arg<BuildSliceArgCount>. The separate argc() helper also avoids relying on enum discriminant values in stack analysis / VM code.

If you don’t specifically need NonZeroU8 or const‑fn semantics here, you could simplify and drop the unsafe by returning a plain u8 from argc() and adjusting to_op_arg accordingly, e.g.:

-    pub const fn argc(self) -> NonZeroU8 {
-        let inner = match self {
-            Self::Two => 2,
-            Self::Three => 3,
-        };
-        // Safety: `inner` can be either 2 or 3.
-        unsafe { NonZeroU8::new_unchecked(inner) }
-    }
+    pub const fn argc(self) -> u8 {
+        match self {
+            Self::Two => 2,
+            Self::Three => 3,
+        }
+    }
-    fn to_op_arg(self) -> u32 {
-        u32::from(self.argc().get())
-    }
+    fn to_op_arg(self) -> u32 {
+        self.argc().into()
+    }

This is purely optional and doesn’t affect correctness. As per coding guidelines, please also ensure cargo fmt and cargo clippy are clean after these changes.

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 759195a and 5cfd018.

📒 Files selected for processing (1)
  • crates/compiler-core/src/bytecode.rs (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/compiler-core/src/bytecode.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run tests under miri
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
🔇 Additional comments (4)
crates/compiler-core/src/bytecode.rs (4)

13-13: NonZeroU8 import matches new argc helper usage

Adding num::NonZeroU8 here is appropriate for BuildSliceArgCount::argc() and keeps the dependency localized to this module.


762-764: Using Arg<BuildSliceArgCount> for BuildSlice tightens the type contract

Switching BuildSlice to carry argc: Arg<BuildSliceArgCount> makes the opcode’s argument domain explicit and enforces 2 or 3 at the type level via OpArgType, aligning with the PR’s goal of rejecting other values.


1591-1595: stack_effect for BuildSlice now derives from validated argc

1 - (argc.get(arg).argc().get() as i32) yields -1 for Two (2‑arg slice) and -2 for Three (3‑arg slice), preserving previous stack behavior while now depending on the enum‑validated argument count instead of a loosely interpreted value.


1782-1783: Disassembly shows BuildSliceArgCount explicitly

Printing BuildSlice { argc } with ?argc will render BuildSlice(Two)/BuildSlice(Three), which is clearer in disassembly than a raw integer or boolean and matches the new enum.

Copy link
Member

@youknowone youknowone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@youknowone youknowone merged commit 4051bec into RustPython:main Nov 30, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants