Transport-agnostic resumable streams #899
Implements modelcontextprotocol#543

Some comments on my implementation:

- `stream/begin` -> `stream/create` to match existing verbiage in the spec.
- I think stream resumption should be a Request versus a Notification, such that stateless servers are able to signal a stream no longer exists (otherwise we would require them to send a StreamEndNotification, which they may not be able to.)
Use `stream/poll` and `stream/poll/all` for stream status
|
Build is red pending #898 |
|
|
||
| When using the Streamable HTTP transport, streams are implemented using the Server-Sent Events (SSE) protocol. When served over HTTP, a server cannot definitively know when any given message is delivered. Clients use the `Last-Event-ID` HTTP header to resume their SSE connection to Streamable HTTP servers. | ||
|
|
||
| In order to deliver durable message guarantees, servers may maintain a buffer of recently sent messages (a two minute window aligns with common TCP idle timeout configurations), and use a scheme to correlate message IDs across streams, such as by using a strictly monotonic counter. When a client reconnects after a disconnection, the `Last-Event-ID` header can be used to derive the point from which its view of streams will resume and poll from. |
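The buffering scheme described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `ReplayBuffer`, `BufferedEvent`, and the explicit `nowMs` clock argument are hypothetical names chosen for this example, and the two-minute default simply mirrors the window mentioned in the text.

```typescript
// Hypothetical sketch: these names are illustrative and not part of the spec.
interface BufferedEvent {
  id: number; // strictly increasing across all streams
  streamId: string;
  sentAtMs: number;
  payload: string;
}

class ReplayBuffer {
  private events: BufferedEvent[] = [];
  private nextId = 1;
  private readonly windowMs: number;

  // Default window of two minutes, as suggested in the text above.
  constructor(windowMs: number = 2 * 60 * 1000) {
    this.windowMs = windowMs;
  }

  // Record a message as sent and return the event ID to emit alongside it.
  record(streamId: string, payload: string, nowMs: number): number {
    this.evict(nowMs);
    const id = this.nextId++;
    this.events.push({ id, streamId, sentAtMs: nowMs, payload });
    return id;
  }

  // Return every buffered event sent after the client's Last-Event-ID.
  replayAfter(lastEventId: number, nowMs: number): BufferedEvent[] {
    this.evict(nowMs);
    return this.events.filter((e) => e.id > lastEventId);
  }

  // Drop events that have fallen outside the retention window.
  private evict(nowMs: number): void {
    const cutoff = nowMs - this.windowMs;
    this.events = this.events.filter((e) => e.sentAtMs >= cutoff);
  }
}
```

Because the counter is shared across streams, a single `Last-Event-ID` value is enough to decide what to replay for every stream the client was following.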
Note for reviewers: see also #543 (comment).
|
|
||
| ### Abandonment | ||
|
|
||
| If a client fails to resume or poll a stream within the `resumeInterval.max` period, the server may consider the stream abandoned and reclaim resources. The next time a client attempts to resume or poll the stream, the server will respond with an error. |
| If a client fails to resume or poll a stream within the `resumeInterval.max` period, the server may consider the stream abandoned and reclaim resources. The next time a client attempts to resume or poll the stream, the server will respond with an error. | |
| If a client fails to resume or poll a stream within the `resumeInterval.max` period, the server may consider the stream abandoned and reclaim resources. The next time a client attempts to resume, the server will respond with an error. |
Depending on how aggressive the server is about reclaiming resources, it could respond to stream/poll with StreamStatus.status == "abandoned" instead of an error.
Alternatively, we could remove the "abandoned" option from the enum.
I would be generally good removing abandoned as a state. As a client I would treat them identically.
| export interface StreamCreateNotification extends Notification { | ||
| method: "notifications/stream/create"; | ||
| params: { | ||
| /** | ||
| * The stream that was created. | ||
| */ | ||
| stream: Stream; | ||
| } | ||
| } |
Since Stream no longer extends BaseMetadata, what are your thoughts about inlining streamId and resumeInterval here?
Also, we may need a requestId property here so that a client can associate the stream with an originating request on multiplexed connections (e.g. stdio).
I agree that inlining makes sense now.
> Also, we may need a requestId property here
As a client I was thinking I would just subscribe to all streams I can essentially. Are there cases you anticipate where a client might choose not to subscribe to a stream if it's for a request it cares less about?
Hmm, in the original design there was no stream/poll/all, and it was assumed that clients would track stream IDs internally. If a client cancelled a tool call, it would need to know which stream to not resume or poll.
Perhaps with the addition of stream/poll/all that knowledge is no longer necessary.
I would track stream IDs internally, but on reconnection poll/all to find what streams I need to resubscribe to. From a client POV streams are super lightweight and there's no reason I wouldn't just want to get all the data all the time. There's nothing in MCP (yet) that's prohibitively chatty to the point that I would want to avoid the overhead.
Another scenario in which the client must be certain of the request ID / stream ID mapping is webhooks. In the original concept, stream IDs were not considered sensitive information, so a stream ID could be sent to a webhook in order to ping the client (see #543 (comment)). However, we are now considering stream IDs as sensitive information (which I think makes sense), so the server would need to send something else to the webhook instead. Namely, the request ID.
I just pushed a commit that adds a requestId property to notifications/stream/create (and addresses some of the comments in this thread), but I'm open to removing it in favor of something better.
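To make the request ID / stream ID mapping discussed in this thread concrete, here is a rough client-side sketch. It assumes the inlined notification shape proposed above (`streamId`, `requestId`, and an optional `resumeInterval` directly in `params`). `StreamCreateParams` and `StreamTracker` are hypothetical names, not types from the PR.

```typescript
// Hypothetical shape: assumes streamId, requestId, and resumeInterval are
// inlined into the notification params, as suggested in the thread.
interface StreamCreateParams {
  streamId: string;
  requestId: string | number;
  resumeInterval?: { min?: number; max?: number };
}

class StreamTracker {
  private byRequest = new Map<string | number, string>();

  // Remember which stream belongs to which originating request.
  onStreamCreate(params: StreamCreateParams): void {
    this.byRequest.set(params.requestId, params.streamId);
  }

  // After cancelling a request, forget its stream so it is not resumed or polled.
  cancel(requestId: string | number): string | undefined {
    const streamId = this.byRequest.get(requestId);
    this.byRequest.delete(requestId);
    return streamId;
  }

  // Streams the client still cares about after a reconnection.
  streamsToResume(): string[] {
    return Array.from(this.byRequest.values());
  }
}
```

With a mapping like this, a cancelled tool call can be excluded from resumption even on a multiplexed connection such as stdio.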
|
I fundamentally think this proposal doesn't have enough real-world use cases attached to it. The PR itself says "When a client sends a request to a server", but that's not really a thing: there are tool calls, resource calls, and connection calls, and not all of these would potentially trigger a stream creation. To me, the fundamental issue here is that this is a protocol addition designed without well-defined use cases to support it, it's overly complex, AND it doesn't handle the very real-world issue of message delivery for some well-defined messages we already have (resource notification updates, tool change updates, etc.). So, if the proposal here is that a change subscription on a resource always triggers a stream, that needs to be very clear.

Reliable message delivery in an intermittent-connection world is actually a very well understood problem at this point, and this doesn't really use best practices from APNS. So, if we go back to first principles, and the goal here is reliable server-to-client message delivery, that's yet a different problem to solve.
| The `resumeInterval` specifies: | ||
|
|
||
| - `min`: Minimum seconds a client should wait before resuming the stream (prevents excessive reconnections) | ||
| - `max`: Maximum seconds a client may wait before the server considers the stream abandoned |
Are the Min and Max arguments mandatory here? How can I represent a stream that should never be considered abandoned?
The arguments are not mandatory: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/899/files#diff-273099cbfa5ca22c6af97f73eca738047952f7a8e7dffc4c1572fec3bdbbc211R1408-R1422. So you can omit max to indicate that a stream will never be considered abandoned. We can make a note of that in the docs.
Yes, a note would be useful here.
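As an illustration of the semantics discussed in this thread, the following sketch treats both fields as optional and reads an omitted `max` as "never abandoned". The helper names `mayResume` and `isAbandoned` are hypothetical, and treating an omitted `min` as zero is an assumption of this example rather than something the PR states.

```typescript
interface ResumeInterval {
  min?: number; // seconds; assumed to default to 0 when omitted
  max?: number; // seconds; omitted means the stream is never considered abandoned
}

// Client side: has enough time passed to resume without reconnecting excessively?
function mayResume(interval: ResumeInterval, secondsSinceDisconnect: number): boolean {
  return secondsSinceDisconnect >= (interval.min ?? 0);
}

// Server side: may the stream be treated as abandoned and its resources reclaimed?
function isAbandoned(interval: ResumeInterval, secondsSinceLastContact: number): boolean {
  if (interval.max === undefined) return false;
  return secondsSinceLastContact > interval.max;
}
```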
|
|
||
| ### Disconnection and Resumption | ||
|
|
||
| Either the client or server may disconnect at any time. After a disconnection, the client can resume the stream by sending a `stream/resume` request with the stream ID: |
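A resume request might be built along these lines. This is only a sketch under assumptions: the `streamId` parameter name follows the usage elsewhere in this PR, while `JsonRpcRequest` and `buildResumeRequest` are hypothetical helpers, and the final schema may carry additional fields.

```typescript
// Hypothetical helper types; only the method name comes from the proposal.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: P;
}

let nextRequestId = 1;

// Build a stream/resume request for a previously created stream.
function buildResumeRequest(streamId: string): JsonRpcRequest<{ streamId: string }> {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "stream/resume",
    params: { streamId },
  };
}
```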
Is the initialization skipped then? I guess it shouldn't be, since the client might have changed.
That is an interesting question. I don't think it's unique to stream/resume though. For example, if a client hasn't contacted the server in a while, should it perform initialization before sending a resources/read request? I think it is equivalent to asking "when should an MCP session expire?" It would be good to have more guidance about that in the spec, but I think it is an orthogonal concern.
I started this PR a few weeks ago to explore session management. I’m sure there are several inaccuracies, but it could serve as a useful starting point. That said, as you mentioned, it’s not the focus of this PR.
I still believe that initialization should always occur upon reconnection, since session management remains unclear at this stage. However, I think the client and server can choose to skip this step. Still, we should recommend—or even enforce—it in the standard.
| When a client sends a `stream/resume` request, the server should: | ||
|
|
||
| 1. Send all unsent messages for the stream to the client | ||
| 2. If the stream has ended, send a `notifications/stream/end` notification after sending all other unsent messages |
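The two steps above could be sketched on the server side as follows. `StreamMessage`, `StreamState`, and `handleResume` are hypothetical names for this example. How messages are actually written to the transport is left out.

```typescript
// Hypothetical in-memory representation of a stream's pending output.
type StreamMessage = { method: string; params?: Record<string, unknown> };

interface StreamState {
  streamId: string;
  unsent: StreamMessage[];
  ended: boolean;
}

// Returns the messages to send, in order, in response to stream/resume.
function handleResume(stream: StreamState): StreamMessage[] {
  // Step 1: take all unsent messages, emptying the stream's queue.
  const outgoing = stream.unsent.splice(0, stream.unsent.length);
  // Step 2: if the stream has ended, the end notification goes last.
  if (stream.ended) {
    outgoing.push({
      method: "notifications/stream/end",
      params: { streamId: stream.streamId },
    });
  }
  return outgoing;
}
```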
Does the client need to reconnect after this in order to continue using the MCP?
Are you asking whether the server closes the connection after sending notifications/stream/end? I think that depends on the transport, just like tool calls. For HTTP, I would expect the server to close the connection; for stdio I would expect the server to not close the connection.
In general though, with resumable streams, both the client and the server should be able to close the connection at any time.
Got it — so this would ultimately be up to the transport’s author? I think this should be clearly specified in the standard. Saying it depends on the underlying implementation breaks transport agnosticism from the client’s perspective. If the client has to handle each notifications/stream/end differently depending on the configured transport, it introduces ambiguity and weakens interoperability.
| ### Abandonment | ||
|
|
||
| If a client fails to resume or poll a stream within the `resumeInterval.max` period, the server may consider the stream abandoned and reclaim resources. The next time a client attempts to resume or poll the stream, the server may either respond with an error or a | ||
|
|
The end is missing here
|
|
||
| - Stream IDs should be treated as sensitive information as they can be used to retrieve message history | ||
| - Servers should validate that clients have appropriate permissions to resume streams | ||
| - Servers should limit the number of unsent messages a disconnected stream may retain in order to prevent denial of service attacks |
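One way to enforce the retention limit from the last bullet is a bounded queue. In the sketch below, `BoundedUnsentQueue` is a hypothetical name. Evicting the oldest message on overflow and reporting a `dropped` count are assumptions of this example, since the proposal does not say what a server should do once the limit is reached.

```typescript
// Hypothetical bounded queue for messages a disconnected stream has not delivered.
class BoundedUnsentQueue<T> {
  private items: T[] = [];
  private dropped = 0;
  private readonly limit: number;

  constructor(limit: number) {
    if (limit < 1) throw new RangeError("limit must be at least 1");
    this.limit = limit;
  }

  // Assumption: on overflow the oldest message is evicted and counted.
  push(item: T): void {
    if (this.items.length >= this.limit) {
      this.items.shift();
      this.dropped++;
    }
    this.items.push(item);
  }

  // Hand over everything retained, plus how many messages were lost.
  drain(): { items: T[]; dropped: number } {
    const result = { items: this.items, dropped: this.dropped };
    this.items = [];
    this.dropped = 0;
    return result;
  }
}
```

Reporting the `dropped` count on resume is one possible way to let a client know that messages were lost.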
Should the client be aware of this limit?
This is two way -- stream IDs can be set on any message via the
Notifications sent on a stream should be buffered and then replayed if/when the client reconnects and calls
It's for 'everything.' I'm not super clear on what you mean by having something at a base protocol level versus another primitive. By base protocol level, do you mean a transport-level thing? |
I think what @patwhite is saying is that if this is for 'everything' then the proposal needs to clearly document how every supported scenario will use this feature. It isn't sufficient to describe the generic stream mechanics and then say "jazz hands" all the transports and features will use it. The consequence would likely be variations in how implementations leverage the streams and that will create interop issues. |
…back-1 Fix typos and address feedback
|
Resumable streams are extremely useful! But I am not sure the proposal here enables anything beyond the resumability/redelivery aspect of the streams in the spec we have today. For every tool call, we can already listen for progress, then disconnect and reconnect using the `Last-Event-ID`.

Side-thought: As a client, I would love to be able to initiate a stream that receives all notifications and messages. This would basically allow "multiplexing" of the different tool calls, notifications, logs, etc. over a single connection, instead of spreading them out over numerous (potentially concurrent) long-running streams, whereas today I need to:
With clients that support some concurrent tool calls; especially for extremely long running tools, we end up listening and polling numerous Last-Event-Ids. wdyt? |
That's a fair question. I actually started out by thinking about how we could fix the shortcomings of the Streamable HTTP transport (which I list in the "Motivation" section of #543). My conclusion was that we'd at least need some sort of message to communicate the resume policy (e.g. However, based on some of the comments in this thread, I've also opened #925, which is a less general but more minimal approach.
That sounds a bit like using streams as a kind of universal subscription mechanism, which I think is a nice idea. If you're interested, I wrote about it in #543 (comment).
I think that's a really good point. Off the top of my head, I can think of three different approaches:
|
Yes, 100% what Darrel said. There are already mechanisms for delivering tool change notifications, and there are already mechanisms for delivering elicitation requests. Does this replace all of that and require a complete rewrite?

I'll add that, after my comment, we still haven't gotten a single concrete use case to actually consider. This continues to be everything for everyone, which just means it's not a fully baked solution for a given problem. I genuinely think a conversation about better intermittent message delivery MUST start with the use cases, then flow to solutions, not start with a solution.
…back-2 Update transport-agnostic resumable streams
Well, one use case is long-running tool calls. There have been numerous threads about it, and it is explicitly mentioned in the documentation. Another use case is subscriptions, which I wrote about in #543 (comment) and mentioned in the comment right above yours. If you have use cases that this proposal would not work for, please share so that we can discuss! |
Those two use cases can be handled with current tooling (or very minor updates to current tooling). A new out-of-band notification method could be created for subscriptions that mirrors the current notifications, and long-running tool calls can work with resources. This is why I'm saying we should really come up with a large list of what we're trying to solve for; if it's just those two, a change this major doesn't make sense.
…back-3 Tweak documentation
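I have written about why resource-based approaches are unsound in #491 (comment), #549 (comment), #549 (comment), and #549 (comment), as well as in

I have just added a summary of my thoughts to this PR. For ease of reference, I will copy it here:

Resource-based approaches propose assigning a resource URL to a tool call result so that the client may read it at a later time. This requires modifying the definition of resources to accommodate the `CallToolResult` type, which does not have a 1-to-1 mapping with the `TextResourceContents` / `BlobResourceContents` types. It also requires modifying the definition of resources such that resources may be "not ready", which in turn impacts all existing clients and servers that use resources.

More critically, though, resource-based approaches require distinct handling mechanisms for each message type other than `CallToolResult`. Fundamentally, the output of a request, such as a tool call, is a sequence of messages, even if the cardinality is 1 in many cases. If we try to represent the output as a resource, then we must define ways to handle messages that do not fit in a resource, such as progress notifications and sampling requests. Each message type that we introduce would need consideration about how it would work with "resource-ended" requests versus "normal" requests.

A resource-based approach would increase the number of provisions the spec must make, increase the number of code paths required for implementation, and increase the potential for incompatibilities when extending the spec.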
|
Closing in favor of #975 |
Preamble
Transport-agnostic Resumable Streams
Authors: Connor Peet (connor@peet.io), Jonathan Hefner (jonathan@hefner.pro)
Abstract
This proposal describes transport-agnostic resumable streams, as discussed in #543 by @jonathanhefner. Using transport-agnostic resumable streams:
Motivation
`Last-Event-ID`. If a connection is lost before an event is sent, there is no way for the client to resume the SSE stream. This is especially problematic because the spec currently says that disconnection should not be interpreted as the client cancelling its request.

Specification
- `notifications/stream/create` notification with a unique `streamId`, a `resumeToken`, and a `resumeInterval`.
- `streamId` and `resumeToken` are used to resume or poll the stream in the event of disconnection.
- `resumeInterval.min` specifies the minimum seconds a client should wait before resuming or polling the stream in order to prevent excessive reconnections.
- `resumeInterval.max` specifies the maximum seconds a client may wait before the server considers the stream abandoned.
- `stream/poll` request with the `streamId` and `resumeToken`.
- `stream/resume` request with the `streamId` and `resumeToken`.
- `notifications/stream/end` notification.

    sequenceDiagram
        participant Client
        participant Server
        Client->>+Server: Request (e.g., tools/call)
        Note over Server: Server creates stream
        Server-->>Client: notifications/stream/create<br>{ streamId: "123", resumeToken: "abc" }
        loop
            Server-->>Client: Messages (e.g., notifications/progress)
        end
        Server--x-Client: Disconnection occurs
        Note over Client: Client polls stream status (optional)
        Client->>+Server: stream/poll<br>{ streamId: "123", resumeToken: "abc" }
        Server-->>-Client: StreamPollResult
        Note over Client: Client decides to resume
        Client->>+Server: stream/resume<br>{ streamId: "123", resumeToken: "abc" }
        Server-->>Client: Undelivered messages
        loop
            Server-->>Client: Messages (e.g., notifications/progress)
        end
        Server-->>Client: CallToolResult
        Note over Server: Server terminates stream
        Server-->>-Client: notifications/stream/end<br>{ streamId: "123" }

Rationale
The above specification addresses the issues outlined in the Motivation in the following ways:
- `notifications/stream/create` notification as soon as possible after determining a request should be resumable. This causes the Streamable HTTP transport to send a usable `Last-Event-ID` to the client.
- `stream/resume` is sent via POST, not GET, allowing servers to delete delivered messages as part of the resume request.
- `notifications/stream/create` notification includes an optional `resumeInterval.max` parameter, informing the client of the maximum number of seconds it may wait after a disconnection before resuming the request or checking its status. After this time has elapsed, the server MAY cancel the request and free all associated resources.
- `notifications/stream/create` and `stream/resume`, it works the same for all transports.
- `notifications/stream/create` notification, the server is allowed to disconnect at will. Thus the server is not required to maintain a long-running connection.
- `stream/poll` to poll the status of a stream after disconnection without having to fetch undelivered messages.

Future Work
- `resources/subscribe/stream` method. See Proposal: Transport-agnostic resumable streams #543 (comment) for further discussion.
- `stream/resume/all` and `stream/poll/all`, or maybe something more closely integrated with sessions (e.g. a `sessions/resume` method).

Alternatives
#925: Transport-agnostic resumable requests
This proposal is a generalized version of #925. Whereas #925 focuses solely on JSON-RPC requests, this proposal suggests a more general mechanism: streams.
In terms of functionality, the two are mostly equivalent, but for #925, resumability is bounded by the JSON-RPC request message and response message. Meaning resumability cannot begin with a JSON-RPC notification, nor can it extend beyond a JSON-RPC response. With this proposal, both of those things are possible.
Resource-based approaches
Resource-based approaches propose assigning a resource URL to a tool call result so that the client may read it at a later time. This requires modifying the definition of resources to accommodate the `CallToolResult` type, which does not have a 1-to-1 mapping with the `TextResourceContents` / `BlobResourceContents` types. It also requires modifying the definition of resources such that resources may be "not ready", which in turn impacts all existing clients and servers that use resources.

More critically, though, resource-based approaches require distinct handling mechanisms for each message type other than `CallToolResult`. Fundamentally, the output of a request, such as a tool call, is a sequence of messages, even if the cardinality is 1 in many cases. If we try to represent the output as a resource, then we must define ways to handle messages that do not fit in a resource, such as progress notifications and sampling requests. Each message type that we introduce would need consideration about how it would work with "resource-ended" requests versus "normal" requests.

A resource-based approach would increase the number of provisions the spec must make, increase the number of code paths required for implementation, and increase the potential for incompatibilities when extending the spec.
Backwards Compatibility
This feature is backward compatible because clients must opt in by advertising the `streams` capability, and servers have no obligation to create a stream.

Reference Implementation
@almaleksia has done an initial implementation of this for the GitHub MCP server. I will let her weigh in here with any learnings from the process 🙂
Security Implications
The `resumeToken` that the server issues as part of the `notifications/stream/create` notification should be treated as sensitive information because it can be used to access messages related to the request.
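As a rough illustration of treating the token as a secret, a server could check it before honoring `stream/resume` or `stream/poll`. The helpers `tokensMatch` and `authorizeResume`, and the `Map` of issued tokens, are hypothetical. The comparison avoids an early exit on the first mismatching character, though a production server would more likely rely on a vetted timing-safe primitive from its platform.

```typescript
// Compare two tokens without returning early on the first mismatch.
function tokensMatch(expected: string, provided: string): boolean {
  let diff = expected.length ^ provided.length;
  const len = Math.max(expected.length, provided.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns NaN into 0.
    diff |= (expected.charCodeAt(i) | 0) ^ (provided.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Hypothetical check: `issued` maps each streamId to the resumeToken sent in
// notifications/stream/create.
function authorizeResume(
  issued: Map<string, string>,
  streamId: string,
  resumeToken: string,
): boolean {
  const expected = issued.get(streamId);
  if (expected === undefined) return false;
  return tokensMatch(expected, resumeToken);
}
```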