Preamble
Transport-agnostic Resumable Requests
Authors: Jonathan Hefner (jonathan@hefner.pro), Connor Peet (connor@peet.io)
Abstract
This proposal describes a transport-agnostic mechanism for resuming requests after disconnections. Using this mechanism:
- Clients and servers can disconnect and reconnect without losing progress.
- Servers can communicate expire-after-disconnect timeouts and reclaim resources thereafter.
- Clients can check request status after disconnect without having to fetch undelivered messages.
- All of the above works regardless of transport (HTTP, WebSocket, stdio, etc.).
Motivation
- Addressing limitations of resumability when using the Streamable HTTP transport.
- SSE-based resume requires the server to send at least one event in order for the client to obtain a `Last-Event-ID`. If a connection is lost before an event is sent, there is no way for the client to resume the SSE stream. This is especially problematic because the spec currently says that disconnection should not be interpreted as the client cancelling its request.
- The spec does not indicate whether a server can delete previously missed SSE events once they have been confirmed delivered by a resume. The spec could explicitly allow this, but resuming is done via HTTP GET, and HTTP GET requests should be read-only.
- There is no mechanism for a server to communicate that it will expire a request after a certain duration of client inactivity.
- Extending resumability to other transports.
- Because resumability is defined by the transport layer, the burden of creating new or custom transports is higher.
- If each transport defines its own version of resumability, it is more difficult to develop MCP features without accounting for (or relying on) the nuances of a particular transport.
- Enabling robust handling of long-running requests such as tool calls.
- The spec does not allow servers to close a connection while computing a result. In other words, servers must maintain potentially long-running connections.
- There is no mechanism for a client to check the status of a request after disconnection without having to fetch undelivered messages.
Specification
See #925 for full documentation.
- If a client has advertised the `resumableRequests` capability, a server MAY send a `notifications/requests/resumePolicy` notification when responding to a request. The notification will specify the resume policy for the request in the event of disconnection, and will include a token that the client can use to resume the request.
- After the resume policy is sent, both the client and the server MAY disconnect at will. This allows servers to handle long-running requests without maintaining a constant connection.
- After a disconnection, a client can optionally send a `requests/getStatus` request to get the status of the original request without fetching pending messages. If the parameters of the `requests/getStatus` request are valid per the resume policy, the server SHOULD reset policy-related timers and then return the status of the original request.
- After a disconnection, clients can resume the request by sending a `requests/resume` request with the same message ID as the original request, plus the server-issued token as a parameter. If the ID and token are valid per the resume policy, the server SHOULD reset policy-related timers, send any pending messages (e.g., progress notifications), and then continue as if it were handling the original request.
sequenceDiagram
participant Client
participant Server
Client->>+Server: Request (e.g., tools/call)<br>{ id: 123, params: { ... } }
Server-->>Client: notifications/requests/resumePolicy<br>{ params: { requestId: 123, resumeToken: "abc" } }
loop
Server-->>Client: Messages (e.g., notifications/progress)
end
Server--x-Client: Disconnection occurs
Note over Client: Client checks request status (optional)
opt
Client->>+Server: requests/getStatus<br>{ params: { requestId: 123, resumeToken: "abc" } }
Server-->>-Client: GetRequestStatusResult
end
Note over Client: Client decides to resume
Client->>+Server: requests/resume<br>{ id: 123, params: { resumeToken: "abc" } }<br>[Same `id` as original request]
Server-->>Client: Undelivered messages
loop
Server-->>Client: Messages (e.g., notifications/progress)
end
Server-->>-Client: CallToolResult<br>{ id: 123, result: { ... } }
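The message shapes in the diagram can be sketched as plain JSON-RPC 2.0 payloads. The following Python is a minimal illustration, not part of the proposal: the helper names are hypothetical, the field names (`requestId`, `resumeToken`) are taken from the diagram, and the separate `id` used for `requests/getStatus` is an assumption, since the diagram does not show one.

```python
def resume_policy_notification(request_id, resume_token):
    """Server -> client: marks `request_id` as resumable and hands out the token."""
    return {
        "jsonrpc": "2.0",
        "method": "notifications/requests/resumePolicy",
        "params": {"requestId": request_id, "resumeToken": resume_token},
    }


def get_status_request(message_id, request_id, resume_token):
    """Client -> server: optional status check that does not fetch pending messages.

    `message_id` is assumed to be a fresh JSON-RPC id for this request.
    """
    return {
        "jsonrpc": "2.0",
        "id": message_id,
        "method": "requests/getStatus",
        "params": {"requestId": request_id, "resumeToken": resume_token},
    }


def resume_request(request_id, resume_token):
    """Client -> server: resumes the original request.

    The `id` is deliberately the same as the original request's id, so the
    eventual result (e.g. a CallToolResult) correlates with that request.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "requests/resume",
        "params": {"resumeToken": resume_token},
    }


# Walk through the diagram for request 123 with token "abc".
policy = resume_policy_notification(123, "abc")
token = policy["params"]["resumeToken"]
status = get_status_request(124, 123, token)
resume = resume_request(123, token)
```

Only `requests/resume` reuses the original `id`; the status check is an ordinary request whose response is independent of the original one.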
Rationale
The above specification addresses the issues outlined in the Motivation in the following ways:
- The server sends a `notifications/requests/resumePolicy` notification as soon as possible after determining a request should be resumable. This causes the Streamable HTTP transport to send a usable `Last-Event-ID` to the client.
- Because a client resumes using a request ID instead of solely an event ID, there is no expectation for servers to retain messages that have been confirmed delivered. Furthermore, for the Streamable HTTP transport, `requests/resume` is sent via POST, not GET, allowing servers to delete delivered messages as part of the resume request.
- The `notifications/requests/resumePolicy` notification includes an optional `maxWait` parameter, informing the client of the maximum number of seconds it may wait after a disconnection before resuming the request or checking its status. After this time has elapsed, the server MAY cancel the request and free all associated resources.
- Because resumability is handled at the application layer via `notifications/requests/resumePolicy` and `requests/resume`, it works the same for all transports.
- After sending a `notifications/requests/resumePolicy` notification, the server is allowed to disconnect at will. Thus the server is not required to maintain a long-running connection.
- The client can use `requests/getStatus` to check the status of a request after disconnection without having to fetch undelivered messages.
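One way a server might track the `maxWait` policy and undelivered messages is sketched below. This is an illustrative assumption rather than the proposal's reference behavior: `RequestStore`, its method names, the injected clock, and returning a pending-message count as the status are all hypothetical, and the actual `GetRequestStatusResult` shape is defined in #925.

```python
import hmac
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ResumableRequest:
    resume_token: str
    max_wait: Optional[float]  # seconds; None means no expiry was advertised
    timer_start: float         # start of the current wait window
    pending: List[dict] = field(default_factory=list)  # undelivered messages


class RequestStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._requests: Dict[int, ResumableRequest] = {}

    def register(self, request_id, resume_token, max_wait=None):
        self._requests[request_id] = ResumableRequest(
            resume_token, max_wait, self._clock()
        )

    def queue(self, request_id, message):
        """Buffer a message that could not be delivered."""
        self._requests[request_id].pending.append(message)

    def on_disconnect(self, request_id):
        """Start the maxWait window when the connection drops."""
        self._requests[request_id].timer_start = self._clock()

    def _validate(self, request_id, resume_token):
        req = self._requests.get(request_id)
        if req is None or not hmac.compare_digest(
            req.resume_token.encode(), resume_token.encode()
        ):
            raise LookupError("unknown request or invalid resume token")
        now = self._clock()
        if req.max_wait is not None and now - req.timer_start > req.max_wait:
            del self._requests[request_id]  # server MAY free resources
            raise LookupError("request expired")
        req.timer_start = now  # reset the policy-related timer
        return req

    def get_status(self, request_id, resume_token):
        """Hypothetical status: pending-message count, without draining."""
        return len(self._validate(request_id, resume_token).pending)

    def resume(self, request_id, resume_token):
        """Return undelivered messages and delete them from the queue."""
        req = self._validate(request_id, resume_token)
        delivered, req.pending = req.pending, []
        return delivered
```

Both `get_status` and `resume` reset the timer, but only `resume` drains the queue, which mirrors the POST-based deletion described above.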
Future Work
- Support a callback mechanism such as webhooks.
- A client could inform the server about a webhook via either a client capability or a `_meta` parameter for the request. Upon completion of the request, if the client is disconnected, the server could send the request ID to the webhook. The webhook host could then send a notification (e.g. push notification) to the client, and the client could resume the request to receive the result.
- Use resumable requests for subscriptions.
- For example, by adding a `resources/subscribe/resumable` method. See Proposal: Transport-agnostic resumable streams #543 (comment) for a proximal discussion.
- Support client roaming.
- Perhaps in the form of methods like `requests/resume/all` and `requests/getStatus/all`, or maybe something more closely integrated with sessions (e.g. a `sessions/resume` method).
Alternatives
- #899: Transport-agnostic resumable streams
This proposal is a simplified version of #899. This proposal focuses on making JSON-RPC requests resumable in a transport-agnostic way, whereas #899 proposes a more general transport-agnostic mechanism (streams).
In terms of functionality, the two are mostly equivalent, but for this proposal, resumability is bounded by the JSON-RPC request message and response message. Thus, with this proposal, resumability cannot begin with a JSON-RPC notification, nor can it extend beyond a JSON-RPC response (whereas both of those things are possible with #899).
- Resource-based approaches
Resource-based approaches propose assigning a resource URL to a tool call result so that the client may read it at a later time. This requires modifying the definition of resources to accommodate the `CallToolResult` type, which does not have a 1-to-1 mapping with the `TextResourceContents` / `BlobResourceContents` types. It also requires modifying the definition of resources such that resources may be "not ready", which in turn impacts all existing clients and servers that use resources.
More critically, though, resource-based approaches require distinct handling mechanisms for each message type other than `CallToolResult`. Fundamentally, the output of a request, such as a tool call, is a sequence of messages, even if the cardinality is 1 in many cases. If we try to represent the output as a resource, then we must define ways to handle messages that do not fit in a resource, such as progress notifications and sampling requests. Each message type that we introduce would need consideration about how it would work with "resource-ended" requests versus "normal" requests.
A resource-based approach would increase the number of provisions the spec must make, increase the number of code paths required for implementation, and increase the potential for incompatibilities when extending the spec.
- #650: `tools/async/call` vs `tools/call`
#650 proposes adding a new type of tool call, `tools/async/call`. When a client calls a tool via `tools/async/call`, the server returns a `CallToolAsyncResult` response which includes a token. The client can then use the token to check the status of the tool call via `tools/async/status`, and to fetch the tool call result via `tools/async/result`.
There is some overlap between #650 and this proposal, such as using tokens and having a dedicated polling method, but there are some important differences:
- With #650, the client drives the decision of whether the tool call is async. This means the server cannot make the decision based on input arguments or session state.
- #650 requires the server to implement an additional form of persistence for tool call results, separate from the message queue it must already implement for resumability.
- Because `tools/async/result` only captures the tool call result, #650 effectively requires the client to stream from the GET `/mcp` endpoint. Otherwise, the client may miss server-sent requests (e.g. sampling requests) that would block tool call progress.
Thus, #650 is still affected by the same problems listed in this proposal's "Motivation" section. For example, if a disconnection occurs before the client receives an event ID on the GET `/mcp` endpoint, and the server sends a sampling request, then the tool call would be blocked until it expires because the client would have no way to get the sampling request.
Furthermore, it raises the question: if the client must stream from that (or any other) endpoint, why not also send the tool call result on that stream? (If the answer is to make the result fetchable separately from the stream, that can be achieved with resource links instead.)
- #1003: Resume tokens for long-running operations
Essentially, #1003 is cursor-based pagination of results. In order to benefit from the proposal, a method must divide its result into chunks. Calls to retrieve each chunk are affected by the same problems listed in this proposal's "Motivation" section. If a result is divided finely enough, the problems could be mitigated; however, each chunk requires an additional round trip. Also, #1003 does not apply when a result is indivisible, such as for a long-running computation that produces a single value.
Other differences:
- #1003 assumes client support; it neither defines nor considers additional client capabilities. If a client does not support the proposal, it will only receive the first chunk of the result. If the proposal were to define an additional client capability, it is not clear how result chunks could be automatically combined to support clients without the capability.
Note: if we decide we want to assume client support, this proposal (#925) can drop the `resumableRequests` client capability. Everything else will work as expected.
- With #1003, the only way for a client to check the status of a request is to resume the request. If the server does not return an error, then the request is still ongoing.
Note: if we decide we don't want to support a dedicated polling mechanism, this proposal (#925) can drop the `requests/getStatus` method. Everything else will work as expected.
Backwards Compatibility
This feature is backward compatible because clients must opt in by advertising the `resumableRequests` capability, and servers have no obligation to send a `notifications/requests/resumePolicy` notification.
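A server-side opt-in check might look like the following sketch. It assumes the capability is advertised as a `resumableRequests` key in the `capabilities` object of the client's `initialize` parameters; the exact capability shape is defined in #925, and the function name is hypothetical.

```python
def client_supports_resume(initialize_params: dict) -> bool:
    """Return True only if the client opted in to resumable requests.

    Assumes the capability appears as a `resumableRequests` key under
    `capabilities`; its value (e.g. an empty object) is not inspected.
    """
    capabilities = initialize_params.get("capabilities") or {}
    return "resumableRequests" in capabilities


# A client that never advertises the capability is never sent a resume policy,
# so existing clients see no change in behavior.
legacy_client = {"capabilities": {"sampling": {}}}
new_client = {"capabilities": {"resumableRequests": {}}}
```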
Security Implications
The `resumeToken` that the server issues as part of the `notifications/requests/resumePolicy` notification should be treated as sensitive information because it can be used to access messages related to the request.
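A minimal sketch of issuing and checking such a token, assuming the server generates tokens itself and stores one per request. The function names are hypothetical, and the proposal does not mandate a token format; `secrets.token_urlsafe` and `hmac.compare_digest` are Python standard-library calls.

```python
import hmac
import secrets


def issue_resume_token() -> str:
    """Generate an unguessable token from 32 bytes of OS-provided randomness."""
    return secrets.token_urlsafe(32)


def token_matches(expected: str, presented: str) -> bool:
    """Compare tokens in constant time to avoid leaking a prefix via timing."""
    return hmac.compare_digest(expected.encode(), presented.encode())


token = issue_resume_token()
```

Because the token grants access to a request's messages, it should also be kept out of logs and sent only over a connection the client already trusts.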