-
Notifications
You must be signed in to change notification settings - Fork 1.4k
various fix to support subprocess/multiprocessing #6567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds Unix statvfs support, refactors HMAC initialization to raise a dedicated UnsupportedDigestmodError, makes mmap buffer readonly when opened read-only, and adds an atomic recursion counter plus accessor to RLock. Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
1bcc263 to
be5631e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
crates/vm/src/stdlib/thread.rs (1)
231-234: Potential underflow ifrelease()is called more times thanacquire().The
countis decremented before unlocking, but there's no guard against underflow. Ifrelease()is somehow called more times thanacquire()succeeded (e.g., due to a bug elsewhere),fetch_sub(1)on 0 would wrap around.While the
is_locked()check at line 228 should prevent this in normal usage, consider adding a debug assertion or checking the count value for defensive coding:🔎 Proposed defensive check
if !self.mu.is_locked() { return Err(vm.new_runtime_error("release unlocked lock")); } + debug_assert!( + self.count.load(std::sync::atomic::Ordering::Relaxed) > 0, + "RLock count underflow" + ); self.count .fetch_sub(1, std::sync::atomic::Ordering::Relaxed); unsafe { self.mu.unlock() };crates/vm/src/stdlib/os.rs (2)
1750-1769: Thef_fsidextraction approach is fragile and platform-dependent.The pointer-casting approach to extract
f_fsidworks but is somewhat brittle:
- The
std::slice::from_raw_partscall requires the pointer to be valid and properly aligned for reading bytes, which should be fine foru8.- However, the byte-order interpretation via
from_ne_bytesproduces different values on big-endian vs little-endian systems.Consider documenting this behavior or using a simpler approach if the exact value isn't critical:
🔎 Alternative: Document the platform-specific behavior
let f_fsid = { - // Use pointer cast to get raw bytes from fsid struct + // f_fsid is a struct on some platforms (e.g., Linux fsid_t) and a scalar on others. + // We extract raw bytes and interpret as a native-endian integer. + // Note: The value may differ across architectures due to endianness. let ptr = std::ptr::addr_of!(st.f_fsid) as *const u8; let size = std::mem::size_of_val(&st.f_fsid);
1787-1804: Error handling could include the filename for better diagnostics.When
statvfsfails with a path argument, the error doesn't include the filename, making debugging harder. Consider usingOSErrorBuilder::with_filenamelike other path-based functions in this file:🔎 Proposed fix to include filename in error
#[pyfunction] #[pyfunction(name = "fstatvfs")] fn statvfs(path: OsPathOrFd<'_>, vm: &VirtualMachine) -> PyResult { let mut st: libc::statvfs = unsafe { std::mem::zeroed() }; let ret = match path { OsPathOrFd::Path(path) => { - let cpath = path.into_cstring(vm)?; - unsafe { libc::statvfs(cpath.as_ptr(), &mut st) } + let cpath = path.clone().into_cstring(vm)?; + let ret = unsafe { libc::statvfs(cpath.as_ptr(), &mut st) }; + if ret != 0 { + return Err(OSErrorBuilder::with_filename(&io::Error::last_os_error(), path, vm)); + } + ret + } + OsPathOrFd::Fd(fd) => unsafe { libc::fstatvfs(fd.as_raw(), &mut st) }, + }; + if ret != 0 { + return Err(io::Error::last_os_error().into_pyexception(vm)); } - OsPathOrFd::Fd(fd) => unsafe { libc::fstatvfs(fd.as_raw(), &mut st) }, - }; - if ret != 0 { - return Err(io::Error::last_os_error().into_pyexception(vm)); - } Ok(StatvfsResultData::from_statvfs(st).to_pyobject(vm)) }Alternatively, a cleaner refactor matching the pattern used in
stat:fn statvfs(path: OsPathOrFd<'_>, vm: &VirtualMachine) -> PyResult { let mut st: libc::statvfs = unsafe { std::mem::zeroed() }; - let ret = match path { - OsPathOrFd::Path(path) => { - let cpath = path.into_cstring(vm)?; + let ret = match &path { + OsPathOrFd::Path(p) => { + let cpath = p.clone().into_cstring(vm)?; unsafe { libc::statvfs(cpath.as_ptr(), &mut st) } } OsPathOrFd::Fd(fd) => unsafe { libc::fstatvfs(fd.as_raw(), &mut st) }, }; if ret != 0 { - return Err(io::Error::last_os_error().into_pyexception(vm)); + return Err(OSErrorBuilder::with_filename(&io::Error::last_os_error(), path, vm)); } Ok(StatvfsResultData::from_statvfs(st).to_pyobject(vm)) }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_hmac.pyis excluded by!Lib/**
📒 Files selected for processing (4)
crates/stdlib/src/hashlib.rscrates/stdlib/src/mmap.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/thread.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/mmap.rscrates/vm/src/stdlib/os.rscrates/stdlib/src/hashlib.rscrates/vm/src/stdlib/thread.rs
🧠 Learnings (4)
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/stdlib/src/mmap.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-06-27T14:47:28.810Z
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
📚 Learning: 2025-12-27T14:03:49.011Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.011Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/stdlib/src/hashlib.rs
🧬 Code graph analysis (2)
crates/stdlib/src/mmap.rs (1)
crates/vm/src/protocol/buffer.rs (1)
simple(176-184)
crates/stdlib/src/hashlib.rs (1)
crates/vm/src/class.rs (1)
static_type(71-79)
🔇 Additional comments (9)
crates/stdlib/src/mmap.rs (1)
654-657: Consider whetherAccessMode::Copyshould also produce a readonly buffer.The current logic only sets
readonly = trueforAccessMode::Read. However,AccessMode::Copycreates a copy-on-write mapping where writes don't affect the underlying file. In CPython,ACCESS_COPYmmaps are still considered writable buffers since the user can modify the in-memory copy.The implementation looks correct for matching CPython behavior, where only
ACCESS_READproduces a truly readonly buffer.crates/vm/src/stdlib/thread.rs (3)
258-265: LGTM!The
_recursion_countmethod correctly returns the acquisition depth when owned by the current thread, and 0 otherwise. This matches CPython'sRLock._recursion_count()behavior and is useful for debugging and introspection.
218-223: LGTM!The count is correctly incremented only when acquisition succeeds. The
Relaxedordering is appropriate here since the mutex operations provide the necessary synchronization.
244-244: LGTM!Resetting the count to 0 during
_at_fork_reinitis correct—after a fork, the child process should see a clean lock state.crates/stdlib/src/hashlib.rs (2)
25-29: LGTM!The
UnsupportedDigestmodErrorexception follows the RustPython macro pattern for custom exceptions. Deriving fromPyValueErroris appropriate, and the#[repr(transparent)]ensures correct memory layout. This enables Python'shmacmodule to catch this exception and fall back to a pure-Python HMAC implementation.Based on learnings, the macro system (
pyexception) is correctly used for implementing Python functionality in Rust.
343-361: No changes needed—exception is properly initialized.The
#[pyattr]decorator onUnsupportedDigestmodErrorensures it is automatically registered as a module attribute when the_hashlibmodule loads. This is the standard RustPython pattern, consistent with other exception definitions throughout the codebase (e.g., incrates/stdlib/src/ssl/error.rs). Thestatic_type()call inhmac_new()will not panic because the exception is guaranteed to be initialized before any module functions can be called.crates/vm/src/stdlib/os.rs (3)
205-212: LGTM!The
ST_RDONLYandST_NOSUIDconstants are correctly guarded with#[cfg(all(unix, not(target_os = "redox")))], following the platform-specific constant pattern used throughout the codebase. Based on retrieved learnings, this matches CPython's approach of optional declaration for platform-specific constants.
1713-1731: LGTM!The
StatvfsResultDatastruct correctly mirrors Python'sos.statvfs_resultnamed tuple. The#[pystruct_sequence(skip)]annotation onf_fsidensures it's accessible as an attribute but not included in the tuple indices, matching CPython's behavior.
1740-1746: LGTM!The
slot_newimplementation follows the standard pattern for struct sequences, enabling Python-side construction ofstatvfs_resultobjects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
crates/vm/src/stdlib/thread.rs (1)
193-193: Public field allows external modification.The
countfield is public, which means external code could directly access and modify it, potentially breaking the recursion depth invariant. Consider making this field private or add documentation warning against direct modification.🔎 Suggested refactor to make the field private
struct RLock { mu: RawRMutex, - count: std::sync::atomic::AtomicUsize, + pub(crate) count: std::sync::atomic::AtomicUsize, }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_hmac.pyis excluded by!Lib/**
📒 Files selected for processing (4)
crates/stdlib/src/hashlib.rscrates/stdlib/src/mmap.rscrates/vm/src/stdlib/os.rscrates/vm/src/stdlib/thread.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/stdlib/src/mmap.rs
- crates/vm/src/stdlib/os.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/hashlib.rscrates/vm/src/stdlib/thread.rs
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.011Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.011Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/stdlib/src/hashlib.rs
🧬 Code graph analysis (1)
crates/stdlib/src/hashlib.rs (1)
crates/vm/src/class.rs (1)
static_type(71-79)
🔇 Additional comments (6)
crates/vm/src/stdlib/thread.rs (3)
218-223: Verify memory ordering is sufficient.The use of
Ordering::Relaxedfor the atomic counter operations should be safe here because the underlyingRawRMutexprovides the necessary synchronization barriers. However, it's worth confirming that this is the intended behavior and that no stronger ordering guarantees are needed.The mutex's lock/unlock operations should provide sufficient memory barriers, making Relaxed ordering appropriate for this counter. Can you confirm this is the intended design?
231-236: Verify assertion placement and release order.The
debug_assert!checks thatcount > 0before decrementing, which is correct. However, note that the decrement happens before the actualunlock()call. This ordering appears correct since we want to update our bookkeeping before releasing the lock, but ensure this is intentional.
262-269: LGTM: Correctly checks ownership before returning count.The
_recursion_countmethod correctly returns 0 when the current thread doesn't own the lock, and returns the actual count only when owned. This prevents exposing recursion depth of other threads.crates/stdlib/src/hashlib.rs (3)
25-29: LGTM: Exception type properly defined.The
UnsupportedDigestmodErrorexception is correctly defined using thepyexceptionmacro, inheriting fromPyValueError. This follows Rust best practices for the macro system.Based on coding guidelines, this follows the pattern of using the macro system when implementing Python functionality in Rust.
345-352: LGTM: HMAC parameter structure updated appropriately.The refactored
NewHMACHashArgscorrectly reflects standard HMAC parameters:
key: the cryptographic key (required)msg: optional message to hashdigestmod: optional digest algorithmField renames improve clarity over the previous
name/datanaming.
355-362: Address why native HMAC support is intentionally disabled.The
hmac_newfunction unconditionally raisesUnsupportedDigestmodErrorto delegate to Python's pure-Python HMAC implementation added in this commit. This design is intentional—the exception pattern is valid and allowshmac.pyto fall back to its native implementation.However, clarify:
- Why is native Rust HMAC support being intentionally disabled in favor of Python's pure-Python implementation? This appears unrelated to the PR's subprocess/multiprocessing goals.
- Is this permanent or a placeholder for future native HMAC implementation?
The concerns about incomplete implementation and
StaticTypeinitialization safety are unfounded—thelet _ = argspattern is intentional (no argument processing needed when unconditionally raising), and the#[pyexception]macro ensures proper exception type initialization.
Summary by CodeRabbit
New Features
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.