Skip to content

Conversation

@youknowone
Copy link
Member

@youknowone youknowone commented Dec 20, 2025

Summary by CodeRabbit

  • New Features

    • SSL message callbacks now invoke user Python callbacks during handshakes.
    • Thread-local handshake propagation improves callback and SNI behavior.
  • Bug Fixes

    • Improved SSL read/shutdown handling with correct EOF, WANT_READ/WANT_WRITE semantics.
    • More accurate classification and construction of SSL and certificate verification errors.
    • Pre-checks for certificate/key file paths now yield clear file-not-found errors.
  • Chores

    • Reorganized and exposed SSL error constants and helper error constructors.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 20, 2025

Walkthrough

Refactors SSL error constants and error construction, moves SSL_ERROR_* constants to ssl/error.rs with Python exposure, and implements thread-local handshake state, SSL ex_data storage, and a full Python-invoking message callback; also adjusts read/shutdown/BIO behavior and file-path validation for SSL inputs.

Changes

Cohort / File(s) Summary
OpenSSL integration & callbacks
crates/stdlib/src/openssl.rs
Replaced alias-based SSL error exports with create_ssl_* helpers; migrated error construction to canonical PySSLError/PySSLCertVerificationError; added full Python invocation in _msg_callback, thread-local HANDSHAKE_VM/HANDSHAKE_SSL_PTR, HandshakeVmGuard, and SSL ex_data usage for PySslSocket and SNI/ALPN callbacks.
SSL module constants relocation
crates/stdlib/src/ssl.rs
Removed previously exposed SSL_ERROR_* constant declarations from this file (moved to ssl/error.rs).
SSL error constants & helpers
crates/stdlib/src/ssl/error.rs
Added SSL_ERROR_* constants with #[pyattr] exposure (SSL_ERROR_NONE, SSL_ERROR_SSL, SSL_ERROR_WANT_READ, SSL_ERROR_WANT_WRITE, SSL_ERROR_WANT_X509_LOOKUP, SSL_ERROR_SYSCALL, SSL_ERROR_ZERO_RETURN, SSL_ERROR_WANT_CONNECT, SSL_ERROR_EOF, SSL_ERROR_INVALID_ERROR_CODE); updated create_ssl_eof_error/create_ssl_zero_return_error to pass errno values.
Read/compat behavior adjustments
crates/stdlib/src/ssl/compat.rs
Enhanced ssl_read control flow: when rustls wants to write, flush pending TLS records before retry; BIO mode now explicitly checks for EOF and maps to Eof vs WantRead accordingly.
Multiple callsite updates & wrapping
crates/stdlib/src/openssl.rs, crates/stdlib/src/ssl.rs (wrap/socket paths)
Updated various call sites (version, cipher, shared_ciphers, ALPN, compression, shutdown, read path, wrap_socket/_wrap_bio) to use thread-local SSL pointer when available; added SniCallbackData storage and lifecycle management on SSL ex_data; improved EOF and ZERO_RETURN mapping and BIO vs socket semantics.

Sequence Diagram

sequenceDiagram
    participant OS as OpenSSL
    participant CB as _msg_callback (native)
    participant TLS as SSL* (ex_data)
    participant TL as Thread-Local (HANDSHAKE_VM / SSL_PTR)
    participant VM as Python VM
    participant PS as PySslSocket
    participant UCB as User Callback

    OS->>CB: call _msg_callback(ssl, content_type, msg_type, buf, len, arg)
    CB->>TL: load HANDSHAKE_VM / HANDSHAKE_SSL_PTR
    alt handshake VM present
        TL-->>CB: vm_ref, ssl_ptr
    else no VM
        CB-->>OS: return (no-op)
    end
    CB->>TLS: read ex_data -> PySslSocket pointer
    TLS-->>CB: pyssl socket (PS)
    CB->>VM: create Python bytes from buf, build args (conn, direction, version, content_type, msg_type, data)
    CB->>VM: invoke user callback via PS
    VM->>UCB: execute user callback
    alt callback returns
        UCB-->>VM: result
        VM-->>CB: success
        CB-->>OS: return success
    else callback raises
        UCB-->>VM: exception
        VM-->>CB: exception info
        CB->>CB: run_unraisable(exception)
        CB-->>OS: return (exception handled)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • Areas needing careful attention:
    • Thread-local HANDSHAKE_VM / HANDSHAKE_SSL_PTR correctness and concurrency implications across call sites.
    • _msg_callback path: ex_data retrieval, VM lifecycle, argument construction, and safe exception handling.
    • SSL ex_data & SniCallbackData lifecycle to avoid dangling pointers or reference cycles.
    • Error-constant relocation and create_ssl_* errno propagation to ensure public API compatibility.
    • Read/shutdown/BIO behavior changes to match CPython semantics.

Possibly related PRs

Suggested reviewers

  • coolreader18
  • ShaharNaveh
  • arihant2math

Poem

🐰 I hopped into callbacks late at night,

Thread‑locals set, ex_data snug and tight.
EOFs found homes; errors renamed with care,
ALPN and SNI now dance in the air.
Hooray — a rabbit fixes SSL flair! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title partially captures the changeset but is overly broad. It mentions 'Share more ssl consts' which accurately reflects moving SSL error constants from ssl.rs to ssl/error.rs, but 'fix openssl' is vague and doesn't convey the substantial refactoring of OpenSSL error handling, callback implementation, and thread-local pointer management that constitute the bulk of changes. Consider a more specific title like 'Refactor OpenSSL error handling and callbacks' or 'Move SSL error constants and enhance OpenSSL callback support' to better represent the full scope of changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 89.66% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e413afe and 4949d39.

📒 Files selected for processing (2)
  • crates/stdlib/src/openssl.rs (30 hunks)
  • crates/stdlib/src/ssl/compat.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/ssl/compat.rs
  • crates/stdlib/src/openssl.rs
🧬 Code graph analysis (1)
crates/stdlib/src/openssl.rs (1)
crates/stdlib/src/ssl/error.rs (3)
  • create_ssl_eof_error (120-126)
  • create_ssl_want_read_error (104-110)
  • create_ssl_want_write_error (112-118)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests on wasm-wasi
  • GitHub Check: Check Rust code with clippy
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run tests under miri
🔇 Additional comments (4)
crates/stdlib/src/ssl/compat.rs (1)

400-424: LGTM: Well-designed bidirectional TLS handling in read path.

The new logic correctly handles the case where rustls needs to send data (e.g., TLS key updates) before it can receive more. The write-flush step mirrors handshake logic and the BIO EOF check aligns with the error signaling improvements described in the AI summary.

crates/stdlib/src/openssl.rs (3)

2523-2570: LGTM: Improved shutdown handling with proper retry logic.

The implementation correctly handles bidirectional SSL shutdown by retrying once when ret == 0, and properly distinguishes WANT_READ/WANT_WRITE errors. The BIO vs socket mode return semantics align with CPython behavior.


2853-2868: LGTM: Proper BIO mode EOF handling in read path.

The BIO-specific error handling correctly distinguishes between:

  • ZERO_RETURN: Clean EOF with TLS close_notify → SSLEOFError
  • WANT_READ with BIO EOF: Unexpected EOF without close_notify → SSLEOFError
  • WANT_READ without BIO EOF: Normal blocking → SSLWantReadError

This aligns with the EOF signaling improvements described in the AI summary.


1368-1389: LGTM: File existence checks improve error reporting.

The pre-OpenSSL path validation at lines 1368-1389 (load_verify_locations) and 1688-1708 (load_cert_chain) correctly returns FileNotFoundError with ENOENT when files/directories don't exist. This provides clearer error messages compared to OpenSSL's generic errors.

Also applies to: 1688-1708


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review December 20, 2025 11:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7bfa5d9 and e413afe.

📒 Files selected for processing (3)
  • crates/stdlib/src/openssl.rs (20 hunks)
  • crates/stdlib/src/ssl.rs (0 hunks)
  • crates/stdlib/src/ssl/error.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • crates/stdlib/src/ssl.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/stdlib/src/openssl.rs
  • crates/stdlib/src/ssl/error.rs
🧬 Code graph analysis (1)
crates/stdlib/src/openssl.rs (2)
crates/stdlib/src/ssl/error.rs (3)
  • create_ssl_eof_error (120-126)
  • create_ssl_want_read_error (104-110)
  • create_ssl_want_write_error (112-118)
crates/stdlib/src/ssl.rs (4)
  • _msg_callback (1669-1671)
  • read (3183-3316)
  • read (4050-4070)
  • e (1109-1109)
🔇 Additional comments (11)
crates/stdlib/src/ssl/error.rs (2)

13-33: LGTM! SSL error constants properly defined.

The error constants are correctly defined with appropriate values matching OpenSSL/CPython conventions, properly annotated with #[pyattr] for Python exposure, and use suitable pub(crate) visibility for internal use while being accessible to Python.


120-134: LGTM! Error helper functions correctly updated.

The helper functions now properly pass specific SSL error codes (SSL_ERROR_EOF and SSL_ERROR_ZERO_RETURN) as errno values, providing richer error information consistent with the PR's error handling improvements.

crates/stdlib/src/openssl.rs (9)

56-60: LGTM! Import changes align with centralized error handling.

The imports correctly reflect the migration from direct error type usage to helper functions (create_ssl_* helpers), and the use of canonical error type names (PySSLCertVerificationError, PySSLError) improves consistency.


1807-1828: Review ex_data pointer management for safety.

The ex_data setup stores raw pointers to Py<PySslSocket> (index 0) and SniCallbackData (SNI index):

  1. Index 0 pointer (lines 1814-1815, 1882-1883): Stores &*py_ref as *const Py<PySslSocket>. This is safe because:

    • py_ref (the PyRef<PySslSocket>) is returned to Python, so the object remains alive.
    • The SSL connection is owned by PySslSocket.connection, so the SSL object cannot outlive the PySslSocket.
    • Callbacks dereference this pointer, which remains valid as long as the PySslSocket exists.
  2. SNI callback data (lines 1821-1827, 1889-1895): Uses Box::into_raw to leak the SniCallbackData, with cleanup via cleanup_sni_ex_data after handshake. The weak reference to PySslSocket prevents reference cycles.

The implementation appears correct. However, the safety relies on the invariant that the SSL object (stored in PySslSocket.connection) does not outlive the PySslSocket itself. This is enforced by Rust ownership, but worth documenting explicitly.

Also applies to: 1875-1896


1983-2000: LGTM! Socket select logic correctly avoids busy-looping.

The updated logic properly handles blocking sockets by passing None as the timeout to sock_select, allowing the socket to wait indefinitely instead of busy-looping. This is a correctness improvement that aligns with the comment on lines 1983-1984.


2290-2302: LGTM! Thread-local SSL pointer pattern correctly avoids deadlocks.

These methods now use get_ssl_ptr_for_context_change instead of directly acquiring connection.read(). This pattern avoids deadlocks when these methods are called during callbacks (SNI, msg_callback) where the connection lock is already held:

  • During handshake/callbacks: HANDSHAKE_SSL_PTR thread-local is set by HandshakeVmGuard, so the SSL pointer is retrieved without locking.
  • Outside callbacks: Thread-local is None, so the lock is safely acquired.

This is a well-designed solution for the callback context problem.

Also applies to: 2311-2372, 2375-2398, 2509-2522, 2724-2728


2538-2541: LGTM! Handshake error handling correctly uses canonical error types.

The handshake error paths now use PySSLCertVerificationError::class() (canonical name) and properly set verification info on certificate verification errors, consistent with the PR's error handling improvements.

Also applies to: 2593-2597


2731-2838: LGTM! Read method signature and logic correctly updated.

The parameter type change from i32 to isize aligns with CPython's use of Py_ssize_t. The complex logic for handling negative n and buffer presence (lines 2737-2756) correctly implements CPython's behavior:

  • With buffer: If n <= 0 or n > buffer_len, use buffer_len; otherwise use n.
  • Without buffer: Raise ValueError if n < 0.

The buffer slicing at lines 2777-2780 includes a safety fallback, though it should not normally execute given the earlier validation. The zero-length read optimization (lines 2759-2765) is appropriate.


3318-3427: LGTM! Error handling consistently uses canonical error types.

The convert_openssl_error and new_ssl_error functions now consistently use canonical error type names (PySSLError, PySSLCertVerificationError) instead of aliases, improving code consistency and aligning with the PR's error handling refactoring.


3454-3501: LGTM! SSL error conversion uses centralized helper functions.

The convert_ssl_error function now uses helper functions (create_ssl_want_read_error, create_ssl_want_write_error, create_ssl_eof_error) for specific error cases, aligning with the PR's centralized error handling approach. The error detection logic remains correct, with proper handling of WANT_READ/WANT_WRITE, SYSCALL, and unexpected EOF cases.


1345-1347: LGTM! ASCII validation for cadata string is appropriate.

The added ASCII check for cadata when provided as a string (line 1345) is correct, as PEM format is ASCII-based. This improves input validation and provides a clear error message for invalid input.

Comment on lines 721 to 827
unsafe extern "C" fn _msg_callback(
_write_p: libc::c_int,
_version: libc::c_int,
_content_type: libc::c_int,
_buf: *const libc::c_void,
_len: usize,
_ssl_ptr: *mut sys::SSL,
write_p: libc::c_int,
mut version: libc::c_int,
content_type: libc::c_int,
buf: *const libc::c_void,
len: usize,
ssl_ptr: *mut sys::SSL,
_arg: *mut libc::c_void,
) {
// Intentionally empty to avoid deadlocks
if ssl_ptr.is_null() {
return;
}

unsafe {
// Get SSL socket from SSL_get_ex_data (index 0)
let ssl_socket_ptr = sys::SSL_get_ex_data(ssl_ptr, 0);
if ssl_socket_ptr.is_null() {
return;
}

// ssl_socket_ptr is a pointer to Py<PySslSocket>, set in _wrap_socket/_wrap_bio
let ssl_socket: &Py<PySslSocket> = &*(ssl_socket_ptr as *const Py<PySslSocket>);

// Get the callback from the context
let callback_opt = ssl_socket.ctx.read().msg_callback.lock().clone();
let Some(callback) = callback_opt else {
return;
};

// Get VM from thread-local storage (set by HandshakeVmGuard in do_handshake)
let Some(vm_ptr) = HANDSHAKE_VM.with(|cell| cell.get()) else {
// VM not available - this shouldn't happen during handshake
return;
};
let vm = &*vm_ptr;

// Get SSL socket owner object
let ssl_socket_obj = ssl_socket
.owner
.read()
.as_ref()
.and_then(|weak| weak.upgrade())
.unwrap_or_else(|| vm.ctx.none());

// Create the message bytes
let buf_slice = std::slice::from_raw_parts(buf as *const u8, len);
let msg_bytes = vm.ctx.new_bytes(buf_slice.to_vec());

// Determine direction string
let direction_str = if write_p != 0 { "write" } else { "read" };

// Calculate msg_type based on content_type (debughelpers.c behavior)
let msg_type = match content_type {
SSL3_RT_CHANGE_CIPHER_SPEC => SSL3_MT_CHANGE_CIPHER_SPEC,
SSL3_RT_ALERT => {
// byte 1 is alert type
if len >= 2 { buf_slice[1] as i32 } else { -1 }
}
SSL3_RT_HANDSHAKE => {
// byte 0 is handshake type
if !buf_slice.is_empty() {
buf_slice[0] as i32
} else {
-1
}
}
SSL3_RT_HEADER => {
// Frame header: version in bytes 1..2, type in byte 0
if len >= 3 {
version = ((buf_slice[1] as i32) << 8) | (buf_slice[2] as i32);
buf_slice[0] as i32
} else {
-1
}
}
SSL3_RT_INNER_CONTENT_TYPE => {
// Inner content type in byte 0
if !buf_slice.is_empty() {
buf_slice[0] as i32
} else {
-1
}
}
_ => -1,
};

// Call the Python callback
// Signature: callback(conn, direction, version, content_type, msg_type, data)
match callback.call(
(
ssl_socket_obj,
vm.ctx.new_str(direction_str),
vm.ctx.new_int(version),
vm.ctx.new_int(content_type),
vm.ctx.new_int(msg_type),
msg_bytes,
),
vm,
) {
Ok(_) => {}
Err(exc) => {
// Log the exception but don't propagate it
vm.run_unraisable(exc, None, vm.ctx.none());
}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find SSL_set_ex_data calls with index 0 in openssl.rs
grep -n "SSL_set_ex_data" crates/stdlib/src/openssl.rs | head -20

Repository: RustPython/RustPython

Length of output: 490


🏁 Script executed:

#!/bin/bash
# Examine SSL_set_ex_data setup and any cleanup patterns
echo "=== Finding SSL_set_ex_data calls ==="
grep -n "SSL_set_ex_data" crates/stdlib/src/openssl.rs

echo -e "\n=== Examining _wrap_socket around lines 1814-1815 ==="
sed -n '1800,1830p' crates/stdlib/src/openssl.rs

echo -e "\n=== Examining _wrap_bio around lines 1882-1883 ==="
sed -n '1870,1900p' crates/stdlib/src/openssl.rs

echo -e "\n=== Looking for PySslSocket drop implementation ==="
grep -n "impl.*Drop.*PySslSocket" crates/stdlib/src/openssl.rs

echo -e "\n=== Looking for SSL_free or cleanup of SSL object ==="
grep -n "SSL_free" crates/stdlib/src/openssl.rs

Repository: RustPython/RustPython

Length of output: 3865


Critical: Use-after-free vulnerability in _msg_callback ex_data access.

Lines 1815 and 1883 store a pointer to a local reference in SSL ex_data at index 0:

let py_ptr: *const Py<PySslSocket> = &*py_ref;
sys::SSL_set_ex_data(ssl_ptr, 0, py_ptr as *mut _);

The problem: py_ref is a temporary reference created by into_ref_with_type() that goes out of scope when the function returns. The pointer stored in ex_data becomes a dangling pointer. When _msg_callback later dereferences it (line 742), it accesses freed memory—a use-after-free vulnerability.

The code comment claims "ssl_socket owns the SSL object and outlives it," but the implementation stores a pointer to a stack reference, not to the Py<PySslSocket> object itself.

Fix: Store a clone of Py<PySslSocket> or use a Box allocation that persists for the SSL object's lifetime, and ensure it's cleaned up via an ex_data free callback registered with SSL_get_ex_new_index.

@youknowone youknowone merged commit 6660170 into RustPython:main Dec 20, 2025
13 checks passed
@youknowone youknowone deleted the ssl branch December 20, 2025 13:06
@coderabbitai coderabbitai bot mentioned this pull request Dec 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant