
SEP-975: Transport-agnostic resumable requests #975

@jonathanhefner

Preamble

Transport-agnostic Resumable Requests

Authors: Jonathan Hefner (jonathan@hefner.pro), Connor Peet (connor@peet.io)

Abstract

This proposal describes a transport-agnostic mechanism for resuming requests after disconnections. Using this mechanism:

  • Clients and servers can disconnect and reconnect without losing progress.
  • Servers can communicate expire-after-disconnect timeouts and reclaim resources thereafter.
  • Clients can check request status after disconnect without having to fetch undelivered messages.
  • All of the above works regardless of transport (HTTP, WebSocket, stdio, etc.).

Motivation

  • Addressing limitations of resumability when using the Streamable HTTP transport.
    • SSE-based resume requires the server to send at least one event in order for the client to obtain a Last-Event-ID. If a connection is lost before an event is sent, there is no way for the client to resume the SSE stream. This is especially problematic because the spec currently says that disconnection should not be interpreted as the client cancelling its request.
    • The spec does not indicate whether a server can delete previously missed SSE events once they have been confirmed delivered by a resume. The spec could explicitly allow this, but resuming is done via HTTP GET, and HTTP GET requests should be read-only.
    • There is no mechanism for a server to communicate that it will expire a request after a certain duration of client inactivity.
  • Extending resumability to other transports.
    • Because resumability is defined by the transport layer, the burden of creating new or custom transports is higher.
    • If each transport defines its own version of resumability, it is more difficult to develop MCP features without accounting for (or relying on) the nuances of a particular transport.
  • Enabling robust handling of long-running requests such as tool calls.
    • The spec does not allow servers to close a connection while computing a result. In other words, servers must maintain potentially long-running connections.
    • There is no mechanism for a client to check the status of a request after disconnection without having to fetch undelivered messages.

Specification

See #925 for full documentation.

  1. If a client has advertised the resumableRequests capability, a server MAY send a notifications/requests/resumePolicy notification when responding to a request. The notification will specify the resume policy for the request in the event of disconnection, and will include a token that the client can use to resume the request.
  2. After the resume policy is sent, both the client and the server MAY disconnect at will. This allows servers to handle long-running requests without maintaining a constant connection.
  3. After a disconnection, a client can optionally send a requests/getStatus request to get the status of the original request without fetching pending messages. If the parameters of the requests/getStatus request are valid per the resume policy, the server SHOULD reset policy-related timers and then return the status of the original request.
  4. After a disconnection, clients can resume the request by sending a requests/resume request with the same message ID as the original request, plus the server-issued token as a parameter. If the ID and token are valid per the resume policy, the server SHOULD reset policy-related timers, send any pending messages (e.g., progress notifications), and then continue as if it were handling the original request.
sequenceDiagram
    participant Client
    participant Server

    Client->>+Server: Request (e.g., tools/call)<br>{ id: 123, params: { ... } }

    Server-->>Client: notifications/requests/resumePolicy<br>{ params: { requestId: 123, resumeToken: "abc" } }
    loop
        Server-->>Client: Messages (e.g., notifications/progress)
    end
    Server--x-Client: Disconnection occurs

    Note over Client: Client checks request status (optional)
    opt 
        Client->>+Server: requests/getStatus<br>{ params: { requestId: 123, resumeToken: "abc" } }
        Server-->>-Client: GetRequestStatusResult
    end

    Note over Client: Client decides to resume
    Client->>+Server: requests/resume<br>{ id: 123, params: { resumeToken: "abc" } }<br>[Same `id` as original request]
    Server-->>Client: Undelivered messages
    loop
        Server-->>Client: Messages (e.g., notifications/progress)
    end
    Server-->>-Client: CallToolResult<br>{ id: 123, result: { ... } }
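
As a rough illustration of the flow above, the messages could be modeled as follows. This is a sketch, not the schema from #925: the method names and the requestId, resumeToken, and maxWait parameters come from this proposal, while the helper names (makeGetStatus, makeResume), the fresh ID used for requests/getStatus, and the example values are assumptions made for the example.

```typescript
// Sketch only: the authoritative schema lives in #925.
type RequestId = string | number;

// Sent by the server once it decides a request should be resumable.
interface ResumePolicyNotification {
  jsonrpc: "2.0";
  method: "notifications/requests/resumePolicy";
  params: {
    requestId: RequestId;
    resumeToken: string;
    maxWait?: number; // seconds the client may wait after a disconnection
  };
}

interface GetStatusRequest {
  jsonrpc: "2.0";
  id: RequestId;
  method: "requests/getStatus";
  params: { requestId: RequestId; resumeToken: string };
}

interface ResumeRequest {
  jsonrpc: "2.0";
  id: RequestId; // same ID as the original request
  method: "requests/resume";
  params: { resumeToken: string };
}

// Hypothetical helper: builds a status check for the request named in the policy.
// `newId` is a fresh JSON-RPC ID chosen by the client (an assumption of this sketch).
function makeGetStatus(policy: ResumePolicyNotification, newId: RequestId): GetStatusRequest {
  const { requestId, resumeToken } = policy.params;
  return {
    jsonrpc: "2.0",
    id: newId,
    method: "requests/getStatus",
    params: { requestId, resumeToken },
  };
}

// Hypothetical helper: builds the resume request, reusing the original request ID.
function makeResume(policy: ResumePolicyNotification): ResumeRequest {
  const { requestId, resumeToken } = policy.params;
  return {
    jsonrpc: "2.0",
    id: requestId,
    method: "requests/resume",
    params: { resumeToken },
  };
}

// Example values mirror the diagram; maxWait is illustrative.
const policy: ResumePolicyNotification = {
  jsonrpc: "2.0",
  method: "notifications/requests/resumePolicy",
  params: { requestId: 123, resumeToken: "abc", maxWait: 60 },
};
const statusCheck = makeGetStatus(policy, 124);
const resume = makeResume(policy);
```

Note that only requests/resume reuses the original ID; the status check is an ordinary request that refers to the original one through its parameters.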

Rationale

The above specification addresses the issues outlined in the Motivation in the following ways:

  • The server sends a notifications/requests/resumePolicy notification as soon as possible after determining a request should be resumable. This causes the Streamable HTTP transport to send a usable Last-Event-ID to the client.
  • Because a client resumes using a request ID instead of solely an event ID, there is no expectation for servers to retain messages that have been confirmed delivered. Furthermore, for the Streamable HTTP transport, requests/resume is sent via POST, not GET, allowing servers to delete delivered messages as part of the resume request.
  • The notifications/requests/resumePolicy notification includes an optional maxWait parameter, informing the client of the maximum number of seconds it may wait after a disconnection before resuming the request or checking its status. After this time has elapsed, the server MAY cancel the request and free all associated resources.
  • Because resumability is handled at the application layer via notifications/requests/resumePolicy and requests/resume, it works the same for all transports.
  • After sending a notifications/requests/resumePolicy notification, the server is allowed to disconnect at will. Thus the server is not required to maintain a long-running connection.
  • The client can use requests/getStatus to check the status of a request after disconnection without having to fetch undelivered messages.
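
One way a server might track the state these points describe is sketched below. Everything here is hypothetical (the ResumableRequestStore class, its method names, the `active` status field, and the injected clock); the proposal only specifies that valid requests/getStatus and requests/resume calls reset policy-related timers, that delivered messages need not be retained, and that the server MAY free resources once maxWait has elapsed.

```typescript
// Sketch only: all names in this block are hypothetical.
type RequestId = string | number;

interface Entry {
  resumeToken: string;
  maxWaitSeconds: number | undefined; // the policy's optional maxWait
  pending: string[];                  // serialized messages not yet delivered
  deadline: number | undefined;       // ms timestamp; set while the client is away
}

class ResumableRequestStore {
  private entries = new Map<RequestId, Entry>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without real timers.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  register(requestId: RequestId, resumeToken: string, maxWaitSeconds?: number): void {
    this.entries.set(requestId, { resumeToken, maxWaitSeconds, pending: [], deadline: undefined });
  }

  // Buffer a message that could not be delivered.
  enqueue(requestId: RequestId, message: string): void {
    const entry = this.entries.get(requestId);
    if (entry) entry.pending.push(message);
  }

  onDisconnect(requestId: RequestId): void {
    const entry = this.entries.get(requestId);
    if (entry) this.startTimer(entry);
  }

  private startTimer(entry: Entry): void {
    entry.deadline =
      entry.maxWaitSeconds === undefined ? undefined : this.now() + entry.maxWaitSeconds * 1000;
  }

  // Checks ID and token, and drops the entry if maxWait has elapsed.
  private lookup(requestId: RequestId, resumeToken: string): Entry | undefined {
    const entry = this.entries.get(requestId);
    if (!entry || entry.resumeToken !== resumeToken) return undefined;
    if (entry.deadline !== undefined && this.now() > entry.deadline) {
      this.entries.delete(requestId); // free associated resources
      return undefined;
    }
    return entry;
  }

  // Reports status without touching pending messages; resets the timer.
  getStatus(requestId: RequestId, resumeToken: string): { active: boolean } {
    const entry = this.lookup(requestId, resumeToken);
    if (entry) this.startTimer(entry);
    return { active: entry !== undefined };
  }

  // Hands over pending messages and deletes them from the buffer.
  resume(requestId: RequestId, resumeToken: string): string[] | undefined {
    const entry = this.lookup(requestId, resumeToken);
    if (!entry) return undefined;
    entry.deadline = undefined; // the client is connected again
    return entry.pending.splice(0);
  }
}

// Usage with a fake clock (milliseconds).
let clock = 0;
const store = new ResumableRequestStore(() => clock);
store.register(123, "abc", 60);
store.enqueue(123, "progress 1");
store.onDisconnect(123);                        // deadline: 60000
clock = 30000;
const status = store.getStatus(123, "abc");     // resets deadline to 90000
clock = 80000;
const replay = store.resume(123, "abc");        // still within the reset deadline
const replayAgain = store.resume(123, "abc");   // nothing left to deliver

store.register(456, "def", 10);
store.onDisconnect(456);                        // deadline: 90000
clock = 100000;
const expired = store.resume(456, "def");       // maxWait elapsed
```

In this sketch the status check at 30 seconds is what keeps request 123 alive at 80 seconds; without the timer reset it would have expired at 60 seconds.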

Future Work

  • Support a callback mechanism such as webhooks.
    • A client could inform the server about a webhook via either a client capability or a _meta parameter for the request. Upon completion of the request, if the client is disconnected, the server could send the request ID to the webhook. The webhook host could then send a notification (e.g. push notification) to the client, and the client could resume the request to receive the result.
  • Use resumable requests for subscriptions.
  • Support client roaming.
    • Perhaps in the form of methods like requests/resume/all and requests/getStatus/all, or maybe something more closely integrated with sessions (e.g. a sessions/resume method).

Alternatives

  • #899: Transport-agnostic resumable streams

    This proposal is a simplified version of #899. This proposal focuses on making JSON-RPC requests resumable in a transport-agnostic way, whereas #899 proposes a more general transport-agnostic mechanism (streams).

    In terms of functionality, the two are mostly equivalent, but for this proposal, resumability is bounded by the JSON-RPC request message and response message. Thus, with this proposal, resumability cannot begin with a JSON-RPC notification, nor can it extend beyond a JSON-RPC response (whereas both of those things are possible with #899).

  • Resource-based approaches

    Resource-based approaches propose assigning a resource URL to a tool call result so that the client may read it at a later time. This requires modifying the definition of resources to accommodate the CallToolResult type, which does not have a 1-to-1 mapping with the TextResourceContents / BlobResourceContents types. It also requires modifying the definition of resources such that resources may be "not ready", which in turn impacts all existing clients and servers that use resources.

    More critically, though, resource-based approaches require distinct handling mechanisms for each message type other than CallToolResult. Fundamentally, the output of a request, such as a tool call, is a sequence of messages, even if the cardinality is 1 in many cases. If we try to represent the output as a resource, then we must define ways to handle messages that do not fit in a resource, such as progress notifications and sampling requests. Each message type that we introduce would need consideration about how it would work with "resource-ended" requests versus "normal" requests.

    A resource-based approach would increase the number of provisions the spec must make, increase the number of code paths required for implementation, and increase the potential for incompatibilities when extending the spec.

  • #650: tools/async/call vs tools/call

    #650 proposes adding a new type of tool call, tools/async/call. When a client calls a tool via tools/async/call, the server returns a CallToolAsyncResult response which includes a token. The client can then use the token to check the status of the tool call via tools/async/status, and to fetch the tool call result via tools/async/result.

    There is some overlap between #650 and this proposal, such as using tokens and having a dedicated polling method, but there are some important differences:

    • With #650, the client drives the decision of whether the tool call is async. This means the server cannot make the decision based on input arguments or session state.

    • #650 requires the server to implement an additional form of persistence for tool call results, separate from the message queue it must already implement for resumability.

    • Because tools/async/result only captures the tool call result, #650 effectively requires the client to stream from the GET /mcp endpoint. Otherwise, the client may miss server-sent requests (e.g. sampling requests) that would block tool call progress.

      Thus, #650 is still affected by the same problems listed in this proposal's "Motivation" section. For example, if a disconnection occurs before the client receives an event ID on the GET /mcp endpoint, and the server sends a sampling request, then the tool call would be blocked until it expires because the client would have no way to get the sampling request.

      Furthermore, it raises the question: if the client must stream from that (or any other) endpoint, why not also send the tool call result on that stream? (If the answer is to make the result fetchable separately from the stream, that can be achieved with resource links instead.)

  • #1003: Resume tokens for long-running operations

    Essentially, #1003 is cursor-based pagination of results. In order to benefit from the proposal, a method must divide its result into chunks. Calls to retrieve each chunk are affected by the same problems listed in this proposal's "Motivation" section. If a result is divided finely enough, the problems could be mitigated; however, each chunk requires an additional round trip. Also, #1003 does not apply when a result is indivisible, such as a long-running computation that produces a single value.

    Other differences:

    • #1003 assumes client support; it neither defines nor considers additional client capabilities. If a client does not support the proposal, it will only receive the first chunk of the result. If the proposal were to define an additional client capability, it is not clear how result chunks could be automatically combined to support clients without the capability.

      Note: if we decide we want to assume client support, this proposal (#925) can drop the resumableRequests client capability. Everything else will work as expected.

    • With #1003, the only way for a client to check the status of a request is to resume the request. If the server does not return an error, then the request is still ongoing.

      Note: if we decide we don't want to support a dedicated polling mechanism, this proposal (#925) can drop the requests/getStatus method. Everything else will work as expected.

Backwards Compatibility

This feature is backward compatible because clients must opt in by advertising the resumableRequests capability, and servers have no obligation to send a notifications/requests/resumePolicy notification.
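
The opt-in can be sketched as a simple gate on the server side. Only the resumableRequests capability name comes from this proposal; the shape of the capability value, the ClientCapabilities interface, and the maySendResumePolicy function are assumptions for illustration.

```typescript
// Sketch only: the capability's value shape is assumed to be an object.
interface ClientCapabilities {
  resumableRequests?: object;
  [key: string]: unknown;
}

// A server may send a resume policy only if the client opted in,
// and even then it is free not to.
function maySendResumePolicy(caps: ClientCapabilities, serverWantsResumable: boolean): boolean {
  return caps.resumableRequests !== undefined && serverWantsResumable;
}

const legacyClient: ClientCapabilities = {};
const newClient: ClientCapabilities = { resumableRequests: {} };

const toLegacy = maySendResumePolicy(legacyClient, true);
const toNew = maySendResumePolicy(newClient, true);
const serverDeclines = maySendResumePolicy(newClient, false);
```

A client that never advertises the capability never sees the notification, so its behavior is unchanged.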

Security Implications

The resumeToken that the server issues as part of the notifications/requests/resumePolicy notification should be treated as sensitive information because it can be used to access messages related to the request.
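
One practical consequence is keeping the token out of logs. The helper below is a hypothetical sketch (redactResumeToken is not part of the proposal) that copies message parameters and masks the resumeToken field before they are logged.

```typescript
// Hypothetical helper: returns a copy of the params that is safe to log.
function redactResumeToken<T extends { resumeToken?: string }>(params: T): T {
  return params.resumeToken === undefined
    ? params
    : { ...params, resumeToken: "[redacted]" };
}

const original = { requestId: 123, resumeToken: "abc" };
const safe = redactResumeToken(original);
const logLine = JSON.stringify(safe);
```

The original object is left untouched, so the real token remains available for the requests/resume call.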
