@bzsurbhi bzsurbhi commented Jun 5, 2025

Added support for long-running operations by implementing a basic polling mechanism. Inspired by #617 and #549.

Motivation and Context

  • Client → Server - CallToolAsyncRequest
  • Server → Client - CallToolAsyncResult
  • Client → Server(poll) - CheckToolAsyncStatusRequest
  • Server → Client - CheckToolAsyncStatusResult(status)
  • Once status is ACTIVE:
    • Client → Server - GetToolAsyncRequest
    • Server → Client - GetToolAsyncResult (returns CallToolResult)
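
The exchange above can be sketched as a small client-side polling loop. This is only an illustration under stated assumptions, not the code in this PR: `FakeAsyncToolServer`, the `token` field, the `"PENDING"` status, and the helper names are all hypothetical. Only the message names in the comments and the `ACTIVE` status come from the description.

```python
import itertools
import time
from dataclasses import dataclass, field


@dataclass
class FakeAsyncToolServer:
    """In-memory stand-in for a server (hypothetical, not the PR's code)."""

    polls_until_done: int = 2
    _jobs: dict = field(default_factory=dict)
    _ids: itertools.count = field(default_factory=itertools.count)

    def call_tool_async(self, name: str, arguments: dict) -> dict:
        # CallToolAsyncRequest -> CallToolAsyncResult: hand back a token.
        token = f"token-{next(self._ids)}"
        self._jobs[token] = {"name": name, "arguments": arguments, "polls": 0}
        return {"token": token}

    def check_status(self, token: str) -> dict:
        # CheckToolAsyncStatusRequest -> CheckToolAsyncStatusResult(status).
        job = self._jobs[token]
        job["polls"] += 1
        done = job["polls"] >= self.polls_until_done
        # "PENDING" is an assumed name for the not-yet-finished state.
        return {"status": "ACTIVE" if done else "PENDING"}

    def get_result(self, token: str) -> dict:
        # GetToolAsyncRequest -> GetToolAsyncResult, shaped like a CallToolResult.
        job = self._jobs.pop(token)
        text = f"{job['name']}: {job['arguments']}"
        return {"content": [{"type": "text", "text": text}], "isError": False}


def call_tool_with_polling(server, name, arguments, interval=0.0, max_polls=10):
    """Start the tool, poll its status, and fetch the result once it is ACTIVE."""
    token = server.call_tool_async(name, arguments)["token"]
    for _ in range(max_polls):
        if server.check_status(token)["status"] == "ACTIVE":
            return server.get_result(token)
        time.sleep(interval)
    raise TimeoutError(f"{name} still not finished after {max_polls} polls")
```

A real client would probably want backoff and cancellation on top of the fixed `interval` and `max_polls` used here.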

How Has This Been Tested?

Breaking Changes

No breaking changes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

@connor4312
Contributor

connor4312 commented Jul 23, 2025

I don't believe this is the right level of abstraction for async processes and resumability. Thinking about A2A flows, we need a primitive that's resumable, pollable, and updatable (basically the lifecycle you describe here), but that flow can have multiple tool calls and other operations happening on it. I think we should introduce a notion of a "Task", which could contain tool calls or other operations (e.g. A2A messages). For example, a Task that contains only a tool call might be represented as:

  • Client: calls tasks/create({}), server returns { taskId: 'task-uuid' }
  • Client: calls tools/call as usual with _meta: { taskId: 'task-uuid' }
  • Polling and status checking details can happen as usual. All messages corresponding to that task are tagged with taskId. Maybe this composes with SEP-975: Transport-agnostic resumable requests #925 such that you can introspect the in-flight requests within that task.
  • Client: calls tasks/end({}) when the tool call is done and the task is no longer needed
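
The three client steps above could look roughly like this on the wire. This is a minimal sketch of the proposal in this comment, not an existing API: `tasks/create` and `tasks/end` are the method names suggested here, the helper functions are hypothetical, and passing the `taskId` to `tasks/end` via `_meta` is an assumption (the comment shows empty params).

```python
import itertools

# Monotonic JSON-RPC request ids for this sketch.
_request_ids = itertools.count(1)


def jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": method, "params": params}


def create_task() -> dict:
    # Proposed method; the server would answer with {"taskId": "task-uuid"}.
    return jsonrpc_request("tasks/create", {})


def call_tool_in_task(task_id: str, name: str, arguments: dict) -> dict:
    # An ordinary tools/call, tagged with the task through _meta.
    params = {"name": name, "arguments": arguments, "_meta": {"taskId": task_id}}
    return jsonrpc_request("tools/call", params)


def end_task(task_id: str) -> dict:
    # Assumption: the task to end is identified through _meta as well.
    return jsonrpc_request("tasks/end", {"_meta": {"taskId": task_id}})
```

Keeping `tools/call` unchanged apart from the `_meta` tag is what lets other operations join the same task later.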

Having a task lifecycle that extends outside of a single request is the building block of A2A flows, e.g.

  • Server: notifies notifications/tasks/created with { taskId: 'task-uuid' } when it triggers an A2A flow
  • Server: notifies notifications/messages/created with details when A2A messages are sent, calls elicitation/create when user input is needed, etc., all with _meta: { taskId: 'task-uuid' }
  • Server: notifies notifications/tasks/ended when the A2A flow ends
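
On the receiving side, the notifications above could be grouped per task along these lines. Again a sketch of the idea only: `TaskTracker` is a hypothetical helper, and it assumes that `notifications/tasks/ended` carries the `taskId` in its params the same way `notifications/tasks/created` does.

```python
from collections import defaultdict


class TaskTracker:
    """Groups incoming server messages by the task they are tagged with."""

    def __init__(self):
        self.open_tasks = set()
        self.events = defaultdict(list)  # taskId -> methods seen, in order

    def handle(self, message: dict) -> None:
        method = message["method"]
        params = message.get("params", {})
        if method == "notifications/tasks/created":
            self.open_tasks.add(params["taskId"])
        elif method == "notifications/tasks/ended":
            # Assumed payload shape: same taskId field as the created notification.
            self.open_tasks.discard(params["taskId"])
        else:
            # Everything else is tagged through _meta, as in the list above.
            task_id = params.get("_meta", {}).get("taskId")
            if task_id in self.open_tasks:
                self.events[task_id].append(method)
```

Messages without a `taskId`, or tagged with a task that is not open, are ignored here; a real client would still need to route them somewhere.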

This is just off the top of my head and obviously needs more detail, like TTLs and so on, but imo we should have a vision of how async support composes later on so we don't have to reinvent the wheel as soon as we start talking about A2A flows.

@MQ37

MQ37 commented Aug 5, 2025

What is the reasoning behind tools/async/join? When a client disconnects and reconnects, and still has the token, it can still poll the status of the tool call. Once the status is ACTIVE (i.e., finished), the client can retrieve the result. What is the need for tools/async/join in this scenario? I suppose the tool will continue running even when the client is disconnected until it hits a hard timeout, and in the meantime, the client can reconnect and poll for the status.

@davemssavage

davemssavage commented Aug 16, 2025

As the original author of #617, I've come to the conclusion that, with a little creative use of a zero progress notification and some juggling of logic on the client side, the existing protocol can already support much of this use case.

See modelcontextprotocol/python-sdk#1209 for an example of how this can be implemented in the Python client.

The caveat is that I'm not sure I'm a big fan of the zero progress notification (even though it does minimise protocol changes), as it feels a bit too implicit and would need good documentation so that all clients and servers know how to implement this behaviour properly.

A more robust solution might be to create a specific protocol notification acknowledging that a tool call has been received, which then gives the client an SSE event ID to resume from later. That could then be handled within the SDKs rather than relying on clients and tool builders to implement long-running tool calls in a specific way.
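
To make the acknowledgement idea concrete, here is a rough sketch of what an SDK could track per call. Nothing here exists in the protocol: the notification name `notifications/tools/call_received`, its `requestId` param, and the `ResumableCall` class are all made up for illustration. The only standard piece is the SSE `Last-Event-ID` header sent on reconnect.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical acknowledgement notification; not part of any spec.
ACK_METHOD = "notifications/tools/call_received"


@dataclass
class ResumableCall:
    """Remembers the SSE event id at which the server acknowledged a tool call."""

    request_id: int
    last_event_id: Optional[str] = None

    def on_sse_event(self, event_id: str, message: dict) -> None:
        # Record the event id only for the acknowledgement of this request.
        is_ack = message.get("method") == ACK_METHOD
        if is_ack and message.get("params", {}).get("requestId") == self.request_id:
            self.last_event_id = event_id

    def resume_headers(self) -> dict:
        # Headers for a reconnect; empty if the call was never acknowledged.
        if self.last_event_id is None:
            return {}
        return {"Last-Event-ID": self.last_event_id}
```

Because the bookkeeping is keyed on the request id, it could live entirely inside an SDK without tool authors having to emit anything themselves.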

@dsp-ant
Member

dsp-ant commented Nov 24, 2025

Tasks is a very similar feature that supports this now.

@dsp-ant dsp-ant closed this Nov 24, 2025
@davemssavage

Looks like it satisfies what I was looking for. Is anyone aware of someone having picked up the Python implementation of this?
