[notes] Long running tools/ async tools/ resumability #982

@ihrpr

This issue collects all the proposals for Long Running Tools and related topics. It is used for tracking only; notes from the steering committee will be shared here.

If any related proposal/discussion/PR/Issue is missing, please comment.


Transport-Agnostic Solutions

  • SEP-975: Transport-agnostic resumable requests #925

    • Author: Jonathan Hefner & Connor Peet
    • Summary: Introduces a mechanism for resuming requests after disconnections across different transport protocols. Key features: clients and servers can disconnect and reconnect without losing progress, servers can communicate expire-after-disconnect timeouts, clients can check request status after disconnect, works across transport types (HTTP, WebSocket, stdio, etc.). Addresses limitations of resumability in Streamable HTTP transport and enables robust handling of long-running requests.
  • SEP-975: Transport-agnostic resumable requests #975

    • Author: Jonathan Hefner & Connor Peet
    • Summary: Formal SEP for transport-agnostic resumable requests mechanism allowing disconnect/reconnect without losing progress. Servers communicate expire-after-disconnect timeouts, clients check request status after disconnect without fetching undelivered messages. Works across HTTP, WebSocket, stdio, etc. Introduces methods like requests/resume and requests/getStatus.
  • Proposal: Transport-agnostic resumable streams #543

    • Author: Jonathan Hefner
    • Summary: Proposal for generalized, transport-independent mechanism for resumable streams in MCP. Key features: stream/begin JSON-RPC notification to start stream with unique ID, stream/end notification to mark completion, stream/resume notification to reconnect, stream/poll request to check status. Aims to improve stream handling across different transport mechanisms, providing more flexibility and reliability for tool interactions.
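As a rough illustration of the status-check and resume behaviour described in the SEP-975 summary above, here is a minimal in-memory sketch. Only the method names requests/resume and requests/getStatus come from the summaries; the class, the field names (expire_after_disconnect, pending, messages), and the status strings are assumptions made for this example and are not taken from the SEP text.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TrackedRequest:
    request_id: int
    expire_after_disconnect: float  # seconds; hypothetical field name
    disconnected_at: Optional[float] = None
    undelivered: list = field(default_factory=list)


class ResumableServer:
    """Toy in-memory server for illustration; not an MCP SDK class."""

    def __init__(self):
        self._requests = {}

    def start(self, request_id, expire_after_disconnect):
        self._requests[request_id] = TrackedRequest(request_id, expire_after_disconnect)

    def emit(self, request_id, message):
        # Messages produced for the request are buffered until the client resumes.
        self._requests[request_id].undelivered.append(message)

    def disconnect(self, request_id, now):
        self._requests[request_id].disconnected_at = now

    def _expired(self, req, now):
        return (req.disconnected_at is not None
                and now - req.disconnected_at > req.expire_after_disconnect)

    def handle(self, method, request_id, now):
        req = self._requests.get(request_id)
        if req is None or self._expired(req, now):
            self._requests.pop(request_id, None)
            return {"status": "expired"}
        if method == "requests/getStatus":
            # Status only: buffered messages are counted, not delivered.
            return {"status": "running", "pending": len(req.undelivered)}
        if method == "requests/resume":
            # Reconnect and hand over everything buffered so far.
            req.disconnected_at = None
            messages, req.undelivered = req.undelivered, []
            return {"status": "running", "messages": messages}
        raise ValueError(f"unknown method: {method}")
```

The clock is passed in explicitly (now) so the expiry rule is easy to follow; a real server would presumably use its own timer and a transport-specific way of detecting disconnects.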

New Protocol Primitives

  • feature: Async support without using Resource #650

    • Author: bzsurbhi
    • Summary: Implements a basic polling mechanism for long-running operations. The proposed workflow includes: Client sends CallToolAsyncRequest → Server responds with CallToolAsyncResult → Client polls server with CheckToolAsyncStatusRequest → Server returns status → When status is ACTIVE, client retrieves final result. Also adds listToolsAsync call capability.
  • feature: async support #700

    • Author: bzsurbhi
    • Summary: Adds support for long-running asynchronous operations with workflow: Client → Server: CallToolAsyncRequest, Server → Client: CallToolAsyncResult → AsyncOperation, Client → Server (Poll): GetOperationRequest, Server → Client: GetOperationResult with operation status and resourceUri. Includes polling mechanism to check operation status and retrieve results once complete.
  • feature: async tool call with join for disconnected clients and progress resources #617

    • Author: davemssavage
    • Summary: Extends MCP to support long-running async tasks with key features: allows clients to submit async tool call requests, enables clients to rejoin in-progress async tasks after disconnection, provides progress notifications and resources during task execution, maintains backwards compatibility. Addresses the limitation that "long running async tasks on the server rely on a continuous connection to the server, also they have limited options for notification of progress."
  • Support progress notification trees for task fanout #929

    • Author: Jonathan Hefner
    • Summary: Proposal to improve progress notifications by allowing representation of concurrent subtasks. Suggests adding a notifications/progress/createTracker method that would: allow declaration of new progressToken, support parent-child relationships between progress trackers, enable tools to forward progress notifications from subtasks. Provides flexibility for tools managing complex workflows with nested/concurrent task progress tracking.
  • Cursors and Resume Tokens

    • Author:
    • Summary: Introduces resumeToken and nextResumeToken fields for handling long-running operations that extend beyond pagination. Addresses resuming operations whose parameters don't change between invocations but results may change over time. Supports database queries, eventually consistent systems, and human-in-the-loop workflows. The resumeToken (caller-provided) indicates a stateful/resumable operation, while nextResumeToken (receiver-provided) enables multiple call/result pairs indicating stages of completion. Works across connections and complements existing progress token mechanisms.
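The resumeToken / nextResumeToken exchange described in the last item can be sketched as a simple loop. This is only one reading of the summary: the StagedReceiver class, the content key, and the token format are hypothetical, and the proposal itself may define the fields differently.

```python
import itertools


class StagedReceiver:
    """Toy receiver that returns one stage of a result per call (hypothetical)."""

    def __init__(self, stages):
        self._stages = list(stages)
        self._cursor = {}  # resume token -> index of the stage to return next
        self._ids = itertools.count(1)

    def call(self, params, resume_token):
        # params are ignored here: the request is the same on every call,
        # only the token changes. An unknown token starts at the first stage.
        index = self._cursor.pop(resume_token, 0)
        result = {"content": self._stages[index]}
        if index + 1 < len(self._stages):
            next_token = f"token-{next(self._ids)}"
            self._cursor[next_token] = index + 1
            result["nextResumeToken"] = next_token
        return result


def run_to_completion(receiver, params, initial_token="caller-token-0"):
    """Repeat the same call, feeding each nextResumeToken back as resumeToken."""
    token = initial_token
    results = []
    while token is not None:
        reply = receiver.call(params, resume_token=token)
        results.append(reply["content"])
        token = reply.get("nextResumeToken")  # absent on the final stage
    return results
```

Because the state lives behind the token rather than the connection, the caller could in principle issue each call over a different connection, which matches the "works across connections" point in the summary.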

Resource-Based Approaches

  • fix: add status field in Resource class and Resource as a return type in CallToolResult #549

    • Author: bzsurbhi
    • Summary: Enhances support for long-running operations by implementing a Resource-based polling mechanism. Adds a status field to the Resource class and allows Resource as a return type in CallToolResult. The discussion evolved into exploring alternative approaches including Promise-based approach, AsyncOperation class, and task reference system. Multiple participants explored various approaches to handling long-running asynchronous operations.
  • feature: Async support using Resource #651

    • Author: bzsurbhi
    • Summary: Introduces async support utilizing Resource and its subscribe/notify features. Workflow: Client sends CallToolAsyncRequest → Server responds with CallToolAsyncResult (AsyncOperation) → Client polls with GetOperationRequest → Server returns GetOperationResult with resource URI and status. Enables partial results through ResourceUpdated notifications, allows hosts to disconnect/reconnect to check status, provides flexible async tool call management.
  • Asynchronous operations in MCP #491

    • Author: bzsurbhi
    • Summary: Comprehensive proposal for supporting long-running async operations with features: marking tools as supporting async execution, creating async resources to track operation status, allowing clients to subscribe and monitor operations, providing mechanisms for progress updates, result retrieval, and cancellation. Includes detailed specifications for AsyncResource class, Tool/ToolManager extensions, new API formats, and implementation examples.
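The polling workflow that several of these proposals share (CallToolAsyncRequest, then repeated GetOperationRequest until a result is available) can be sketched as follows. The message names come from the summaries above; the ToyAsyncServer class, the operationId key, the status strings, and the toy:// URI scheme are assumptions for illustration only.

```python
import itertools


class ToyAsyncServer:
    """In-memory stand-in for a server with async tool calls (hypothetical)."""

    def __init__(self, polls_until_done=2):
        self._ops = {}
        self._ids = itertools.count(1)
        self._polls_until_done = polls_until_done

    def call_tool_async(self, name, arguments):
        # Plays the role of CallToolAsyncRequest -> CallToolAsyncResult.
        op_id = f"op-{next(self._ids)}"
        self._ops[op_id] = {"tool": name, "remaining": self._polls_until_done}
        return {"operationId": op_id}

    def get_operation(self, op_id):
        # Plays the role of GetOperationRequest -> GetOperationResult.
        op = self._ops[op_id]
        if op["remaining"] > 0:
            op["remaining"] -= 1
            return {"status": "running"}
        return {"status": "completed", "resourceUri": f"toy://results/{op_id}"}


def poll_until_complete(server, op_id, max_polls=10):
    """Poll until the operation completes; return (polls used, resource URI)."""
    for attempt in range(1, max_polls + 1):
        reply = server.get_operation(op_id)
        if reply["status"] == "completed":
            return attempt, reply["resourceUri"]
    raise TimeoutError(f"{op_id} did not complete within {max_polls} polls")
```

A real client would wait between polls and could subscribe to the returned resource URI instead of polling, which is where the resource-based proposals differ from the plain polling ones.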

Problem Space & Requirements

  • Defining the problem space for asynchronous tool execution #843

    • Author: Joffref
    • Summary: Explores challenges with current request/response model focusing on four key problems: 1) Unresponsive clients during long-running tasks, 2) Inefficiencies in agentic systems, 3) Incompatibility with long-running or indefinite jobs, 4) Lack of state certainty after client disconnection. Discussion explores potential solutions including resource pointers, subscriptions, and notification mechanisms while maintaining client responsiveness.
  • [RFC] Long Running Task/Job/Async handling modelcontextprotocol-community/working-groups#30

    • Author: cploujoux
    • Summary: RFC discussing challenges with handling long-running tasks in MCP including: connection persistence when using SSE for long periods, client disconnection handling to ensure task continuation, tracking progress and state of long-running operations, potential event system and webhook implementation. Links to multiple related proposals and has been moved to official Standards Track for review.

Summary Documents

  • Google Doc Summary
    • Summary: Summary document consolidating async/long-running task proposals

Note on Webhook Approaches

Webhooks are not among the active proposals: PR #593, which proposed webhook support, was closed in favor of more generalized trigger mechanisms. The webhook pattern has nonetheless been discussed as a potential solution for async communication, particularly in modelcontextprotocol-community/working-groups#30.

Themes

(AI generated summaries)

1. Asynchronous Operations

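Most proposals focus on handling long-running operations that may take hours or days to complete. The main approaches, following the groupings above, are transport-agnostic resumability, new protocol primitives for polling, and resource-based status tracking.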

2. Transport-Agnostic Solutions

Several proposals aim to create solutions that work across different transport protocols.

3. Progress Tracking

Multiple proposals address the need for better progress tracking.

4. Connection Resilience

A common concern is handling client disconnections during long-running operations.

Related Issues and Cross-References
