Fix ssl shutdown #6871
Conversation
📝 Walkthrough
Reworks TLS shutdown: the BIO path uses a single SSL_shutdown call; the socket path performs a deadline-driven loop that flushes close_notify, waits for the peer's close_notify using select/timeout handling, maps SSL_WANT_* to SSLWant errors or timeouts, and returns the underlying socket on socket-mode shutdown.
Sequence Diagram(s)
sequenceDiagram
participant App as Application
participant TLS as TLS Layer
participant Socket as Socket
participant Selector as Select/Deadline
App->>TLS: shutdown(timeout)
TLS->>TLS: flush pending TLS data\nsend close_notify
alt BIO mode
TLS->>TLS: SSL_shutdown (single call)
TLS-->>App: return Ok(None) or error
else Socket mode (non-blocking)
TLS->>TLS: mark Completed\nclear connection
TLS-->>App: return Ok(Some(socket))
else Socket mode (blocking/timeout)
loop until peer close_notify or deadline
TLS->>Selector: wait socket readiness (deadline)
Selector-->>TLS: ready / timeout
alt timeout
TLS->>TLS: mark Completed\nclear connection
TLS-->>App: return timeout_error or Ok(Some(socket))
else ready
TLS->>Socket: read/process TLS data
Socket-->>TLS: data / EOF
TLS->>TLS: detect peer close_notify?
alt peer close_notify seen
TLS->>TLS: mark Completed\nclear connection
TLS-->>App: return Ok(Some(socket))
else continue loop
end
end
end
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
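The walkthrough and diagram above describe three distinct shutdown paths. A condensed, self-contained sketch of that dispatch, using placeholder types rather than the crate's actual API (Mode, Socket, and the helper functions here are illustrative only):

```rust
use std::time::{Duration, Instant};

// Placeholder types standing in for the crate's socket/TLS handles.
struct Socket;

enum Mode {
    Bio,
    SocketNonBlocking,
    SocketBlocking { timeout: Duration },
}

enum ShutdownOutcome {
    Done,             // BIO mode: nothing to hand back
    Detached(Socket), // socket mode: underlying socket returned to the caller
}

fn flush_and_send_close_notify() -> Result<(), String> {
    Ok(()) // stand-in: flush pending TLS output and queue our close_notify
}

fn peer_close_notify_seen() -> bool {
    true // stand-in: would inspect SSL state for the peer's close_notify
}

fn shutdown(mode: Mode, sock: Socket) -> Result<ShutdownOutcome, String> {
    // Common first step in every mode: send our own close_notify.
    flush_and_send_close_notify()?;

    match mode {
        // BIO mode: a single SSL_shutdown-style call, no socket to return.
        Mode::Bio => Ok(ShutdownOutcome::Done),
        // Non-blocking socket: return right away, don't wait for the peer.
        Mode::SocketNonBlocking => Ok(ShutdownOutcome::Detached(sock)),
        // Blocking socket: wait for the peer's close_notify until a deadline.
        Mode::SocketBlocking { timeout } => {
            let deadline = Instant::now() + timeout;
            while !peer_close_notify_seen() {
                if Instant::now() >= deadline {
                    return Err("timed out waiting for close_notify".to_string());
                }
                // real code would select() on the socket and feed SSL here
            }
            Ok(ShutdownOutcome::Detached(sock))
        }
    }
}

fn main() {
    let _ = shutdown(Mode::SocketNonBlocking, Socket);
}
```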
Force-push: d34fa9e to 7436f38
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/openssl.rs`:
- Around line 2845-2863: The shutdown path incorrectly treats
SelectRet::TimedOut and SelectRet::Closed as success when SSL_shutdown returns 0
(zeros counter path) and breaks out, ultimately returning Ok(Some(socket));
change the logic inside the branch handling socket_stream.select(SslNeeds::Read,
&deadline) so that on SelectRet::TimedOut and SelectRet::Closed you return an
error (or propagate a timeout/closed error) instead of breaking and continuing
to return Ok(Some(socket)), ensuring SSL_shutdown's timeout/closed cases are
surfaced as failures; adjust any callers of the SSL_shutdown loop (references:
SSL_shutdown, zeros counter, socket_stream.select, SelectRet::TimedOut,
SelectRet::Closed, and the code path that currently returns Ok(Some(socket))) to
handle the propagated error.
- Around lines 2845-2862: The ret == 0 path currently continues on SelectRet::Nonblocking, which can busy-loop on non-blocking sockets. Modify the SelectRet::Nonblocking arm inside the ret == 0 handling (where socket_stream.select(SslNeeds::Read, &deadline) is matched) to return an SSLWantReadError (or the function's equivalent SSL_WANT_READ error) instead of just continuing, so callers get a proper WouldBlock/WantRead signal; keep the zeros counter and the other SelectRet arms (TimedOut/Closed/Ok) unchanged. A sketch of this control flow follows the list.
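Taken together, the two comments above amount to a different match on the select result. A minimal sketch of that control flow, with simplified stand-ins for the crate's SelectRet and error types (the real signatures are not shown in this review):

```rust
// SelectRet and ShutdownError are simplified stand-ins; only the
// control flow requested by the review matters here.
#[derive(Debug)]
enum SelectRet {
    Ok,
    TimedOut,
    Closed,
    Nonblocking,
}

#[derive(Debug)]
enum ShutdownError {
    TimedOut,
    Closed,
    WantRead, // would map to SSLWantReadError at the Python layer
}

/// One iteration of the `ret == 0` branch: decide whether to keep looping
/// or surface an error, instead of silently breaking out of the loop.
fn handle_select(ret: SelectRet) -> Result<bool, ShutdownError> {
    match ret {
        // Socket readable: loop again so SSL_shutdown can read the peer's
        // close_notify.
        SelectRet::Ok => Ok(true),
        // Deadline expired or peer closed: propagate as errors rather than
        // breaking and returning Ok(Some(socket)).
        SelectRet::TimedOut => Err(ShutdownError::TimedOut),
        SelectRet::Closed => Err(ShutdownError::Closed),
        // Non-blocking socket: report WantRead instead of busy-looping.
        SelectRet::Nonblocking => Err(ShutdownError::WantRead),
    }
}

fn main() {
    assert!(matches!(handle_select(SelectRet::Ok), Ok(true)));
    assert!(matches!(
        handle_select(SelectRet::Nonblocking),
        Err(ShutdownError::WantRead)
    ));
}
```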
In `@crates/stdlib/src/ssl.rs`:
- Around lines 4124-4130: The code currently swallows errors from flush_pending_tls_output in the non-blocking shutdown branch (where conn_guard is dropped) using `let _ =`, which can hide failures to deliver close_notify. Capture the Result from self.flush_pending_tls_output(vm, None) and, if it is Err, either return the error to the caller or at minimum log it via the existing logger before proceeding; only set *self.shutdown_state.lock() = ShutdownState::Completed and clear *self.connection.lock() = None when the flush succeeds. The flush_pending_tls_output call and the shutdown_state/connection locks locate this logic.
// Non-blocking: return immediately after sending close_notify.
// Don't wait for peer's close_notify to avoid blocking.
drop(conn_guard);
let _ = self.flush_pending_tls_output(vm, None);
*self.shutdown_state.lock() = ShutdownState::Completed;
*self.connection.lock() = None;
return Ok(self.sock.clone());
Silent error suppression may hide close_notify delivery failures.
Line 4127 uses let _ = to discard any error from flush_pending_tls_output. In non-blocking mode, if the flush fails (e.g., socket buffer full, connection reset), the close_notify may not reach the peer, and the caller has no indication of this failure.
Consider propagating the error or at minimum logging it, especially since the code immediately marks shutdown as Completed and clears the connection regardless of flush success.
Proposed fix
Some(0.0) => {
// Non-blocking: return immediately after sending close_notify.
// Don't wait for peer's close_notify to avoid blocking.
drop(conn_guard);
- let _ = self.flush_pending_tls_output(vm, None);
+ // Best-effort flush; ignore WouldBlock but propagate other errors
+ if let Err(e) = self.flush_pending_tls_output(vm, None) {
+ if !is_blocking_io_error(&e, vm) {
+ // Non-blocking error is expected, but other errors may indicate
+ // close_notify was not sent. Still complete shutdown to avoid
+ // leaving socket in inconsistent state.
+ }
+ }
*self.shutdown_state.lock() = ShutdownState::Completed;
*self.connection.lock() = None;
return Ok(self.sock.clone());
}
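The proposed diff distinguishes blocking from non-blocking I/O errors but does not show what to do with the latter. A self-contained illustration of the distinction it draws, written against plain std::io types rather than the crate's vm error types (which are not reproduced here):

```rust
use std::io::{self, ErrorKind, Write};

/// Best-effort flush of the pending close_notify bytes.
/// WouldBlock is tolerated (the caller is in non-blocking mode);
/// any other error is surfaced so the caller knows the close_notify
/// may not have reached the peer.
fn flush_close_notify<W: Write>(out: &mut W) -> io::Result<()> {
    match out.flush() {
        Ok(()) => Ok(()),
        // Expected on a non-blocking socket whose send buffer is full.
        Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(()),
        // Anything else (reset, broken pipe, ...) is a real failure.
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    let mut sink = Vec::new(); // stand-in for the socket writer
    flush_close_notify(&mut sink)
}
```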
Force-push: 92ebf63 to 81685b2
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/ssl.rs`:
- Around lines 4169-4191: The shutdown loop currently treats deadline expiry and a timeout from self.sock_wait_for_io_with_timeout as a successful shutdown by breaking out of the loop. Instead, when the deadline is reached (the now >= dl branch) or when timed_out is true, return a timeout error from the enclosing function so callers see the timeout and can retry the close handshake. Update the code paths around deadline/remaining_timeout and the timed_out branch of self.sock_wait_for_io_with_timeout to return an appropriate timeout Err variant (consistent with this crate's error type) rather than breaking and proceeding to clear the connection; apply the same change to the analogous block at the second location (around lines 4218-4221) so both timeout cases propagate errors instead of being treated as success. A sketch of the desired loop shape follows.
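A compressed sketch of the loop shape this asks for, using std::time for the deadline bookkeeping and a plain error enum in place of the vm's exception types (both are assumptions; helpers like sock_wait_for_io_with_timeout are not reproduced here):

```rust
use std::time::{Duration, Instant};

#[derive(Debug)]
enum UnwrapError {
    TimedOut, // should surface to the caller as a socket timeout
}

/// Wait for the peer's close_notify until `deadline`, returning an error on
/// expiry instead of pretending the shutdown completed.
fn wait_for_peer_close_notify(
    deadline: Instant,
    mut peer_closed: impl FnMut() -> bool,
    mut wait_readable: impl FnMut(Duration) -> bool, // true = readable, false = timed out
) -> Result<(), UnwrapError> {
    loop {
        if peer_closed() {
            return Ok(()); // clean bidirectional shutdown
        }
        let now = Instant::now();
        if now >= deadline {
            // Deadline reached: propagate, so the caller can retry the close handshake.
            return Err(UnwrapError::TimedOut);
        }
        let remaining = deadline - now;
        if !wait_readable(remaining) {
            // select/poll timed out: same treatment as deadline expiry.
            return Err(UnwrapError::TimedOut);
        }
        // readable: next iteration re-checks peer_closed after processing data
    }
}

fn main() {
    let deadline = Instant::now() + Duration::from_millis(10);
    let res = wait_for_peer_close_notify(deadline, || false, |_| false);
    assert!(matches!(res, Err(UnwrapError::TimedOut)));
}
```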
♻️ Duplicate comments (1)
crates/stdlib/src/ssl.rs (1)
4124-4133: Non-blocking shutdown completes without peer close_notify. This path sets ShutdownState::Completed and drops the TLS connection even though peer_closed is still false, so callers can't retry to observe the peer's close_notify. That risks reporting a clean shutdown when the peer never replied. Consider keeping the state as SentCloseNotify and returning SSLWantReadError instead of completing.
🛠️ Suggested change
- *self.shutdown_state.lock() = ShutdownState::Completed;
- *self.connection.lock() = None;
- return Ok(self.sock.clone());
+ // Keep TLS state so caller can retry and observe peer close_notify.
+ return Err(create_ssl_want_read_error(vm).upcast());
Force-push: 937fbec to 994e31d