SEP-1686: Tasks #1732
Will add this, meant to do so previously and lost track of that.
Breaking this down because it carries a lot of implications:
Because we are currently designing support for long-lifecycle calls, I've found that support is needed on the client side, on the server side, and in the runtime. Here, we assume that both the MCP client and server run on the server side, i.e., in an agentic system. Both sides must therefore keep operating normally through application crashes, restarts, deployments, rollbacks, and even cluster split-brain scenarios. This places significant demands on task ID generation, task metadata storage, and task flow management. For example, task information needs to span multiple Availability Zones (AZs) and keep functioning in a split-brain scenario. If the client fails and restarts, it should be able to resume the previous flow, for example by continuing to drive the task after a long-lifecycle call or task completes.

Furthermore, in an agentic scenario, a task might be generated by a session, and that session might be associated with a real user. If the agentic system serves millions or tens of millions of users, we cannot allow user A to see user B's task list. More detailed identity isolation is essential.

I haven't seen any retry-related specs for the task itself yet. Perhaps long-running tasks need an associated set of configurations to implement server-side autonomous retry logic. Therefore, I think this might end up similar to a combination of CQRS and Workflow.
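To make the identity-isolation concern concrete, here is a minimal sketch of a task store that scopes every read by owner. It is only an illustration: `ScopedTaskStore`, `ownerId`, and the method names are hypothetical and do not come from the spec or any SDK, and a real deployment would back this with durable storage rather than an in-memory map.

```typescript
// Hypothetical sketch: every lookup is filtered by the caller's identity,
// so one user can never observe another user's tasks.
interface TaskRecord {
  taskId: string;
  ownerId: string; // assumed to come from an authenticated session
}

class ScopedTaskStore {
  private readonly tasks = new Map<string, TaskRecord>();

  // Register a task under the identity that created it.
  create(ownerId: string, taskId: string): TaskRecord {
    const record: TaskRecord = { taskId, ownerId };
    this.tasks.set(taskId, record);
    return record;
  }

  // List only the tasks owned by the caller.
  list(ownerId: string): TaskRecord[] {
    return Array.from(this.tasks.values()).filter((t) => t.ownerId === ownerId);
  }

  // Treat a task owned by someone else exactly like a missing task,
  // so task IDs cannot be probed across users.
  get(ownerId: string, taskId: string): TaskRecord | undefined {
    const record = this.tasks.get(taskId);
    return record !== undefined && record.ownerId === ownerId ? record : undefined;
  }
}
```

Returning `undefined` for both "unknown" and "not yours" is a deliberate choice in this sketch: it avoids leaking whether a given task ID exists.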
The server needs to support racing tasks too: if one succeeds first, cancel the others.
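As a rough sketch of what "race and cancel the losers" could look like in application code (not in the protocol itself), the helper below starts several cancellable jobs and aborts the rest as soon as one succeeds. `Job`, `raceAndCancel`, and `delayed` are hypothetical names, and cancellation is modeled with the standard `AbortController`; how that would map onto task cancellation in MCP is left open here.

```typescript
// Hypothetical helper, not part of the MCP spec or any SDK.
// Each job receives an AbortSignal it should honor when cancelled.
type Job<T> = (signal: AbortSignal) => Promise<T>;

function raceAndCancel<T>(jobs: Job<T>[]): Promise<T> {
  const controllers = jobs.map(() => new AbortController());
  return new Promise<T>((resolve, reject) => {
    if (jobs.length === 0) {
      reject(new Error("no jobs to race"));
      return;
    }
    let settled = false;
    let failures = 0;
    jobs.forEach((job, i) => {
      job(controllers[i].signal).then(
        (value) => {
          if (settled) return;
          settled = true;
          // First success wins: abort every other job.
          controllers.forEach((c, j) => {
            if (j !== i) c.abort();
          });
          resolve(value);
        },
        (err) => {
          failures += 1;
          // Only fail the race once every job has failed.
          if (!settled && failures === jobs.length) {
            settled = true;
            reject(err);
          }
        },
      );
    });
  });
}

// Demo job: resolves with `value` after `ms`, or rejects if aborted first.
function delayed<T>(value: T, ms: number): Job<T> {
  return (signal) =>
    new Promise<T>((resolve, reject) => {
      const timer = setTimeout(() => resolve(value), ms);
      signal.addEventListener("abort", () => {
        clearTimeout(timer);
        reject(new Error("cancelled"));
      });
    });
}
```

For example, `raceAndCancel([delayed("a", 10), delayed("b", 500)])` resolves with `"a"` and aborts the second job, which clears its timer.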
mikekistler
left a comment
This looks good, but I think there are some small improvements that can be made to the task status.
All feedback up to this point has been responded to.
localden
left a comment
LGTM for experimental release.
## `tasks`

{/* @category `tasks` */}
These headings and categories are actually rather free-form. So ## `tasks` could actually be ## Tasks, and @category `tasks` could actually be @category Tasks.
(I would keep backticks for the other categories because those refer to method name strings.)
Follow-up to modelcontextprotocol#1732. This explains how tasks can be used in some motivating use cases.
Follow-up to modelcontextprotocol#1732. This explains how tasks can be used in some motivating use cases. Co-Authored-By: Claude <noreply@anthropic.com>
Nice to see this got merged, @LucaButBoring. Have you already implemented the server side at work?
Implement full support for both server-side and client-side tasks on tool calls, sampling, and elicitation Refs modelcontextprotocol/modelcontextprotocol#1732
This is to be interpreted as a fine-grained layer in addition to capabilities, following these rules:

1. If a server's capabilities include `tasks.requests.tools.call: false`, then clients **MUST NOT** attempt to use task augmentation on that server's tools, regardless of the `taskHint` value.
1. If a server's capabilities include `tasks.requests.tools.call: true`, then clients consider the value of `taskHint`, and handle it accordingly:
@LucaButBoring It was a {} but here it's a false/true.
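The two rules quoted above, together with the `{}` versus `true`/`false` question, can be sketched as a small client-side gate. This is an assumption-laden illustration: the nested capability shape is inferred from the dotted path `tasks.requests.tools.call`, treating an absent entry like `false` is this sketch's choice, and because the excerpt does not show how individual `taskHint` values are handled, that decision is delegated to a caller-supplied predicate (`hintAllowsTasks`, a hypothetical name).

```typescript
// Assumed shape, inferred from the path `tasks.requests.tools.call`.
// The flag is accepted either as a boolean or as a presence object such as {}.
interface ServerCapabilities {
  tasks?: {
    requests?: {
      tools?: { call?: boolean | Record<string, unknown> };
    };
  };
}

// Capability-level check: absent or `false` means "not supported" in this sketch.
function serverAllowsTaskToolCalls(caps: ServerCapabilities): boolean {
  const flag = caps.tasks?.requests?.tools?.call;
  if (flag === undefined || flag === false) return false;
  return true; // `true`, or an object such as {}
}

// Rule 1: if the capability is off, the hint is never consulted.
// Rule 2: if it is on, defer to the tool's taskHint via the supplied predicate.
function mayUseTaskAugmentation(
  caps: ServerCapabilities,
  taskHint: string | undefined,
  hintAllowsTasks: (hint: string | undefined) => boolean,
): boolean {
  if (!serverAllowsTaskToolCalls(caps)) return false;
  return hintAllowsTasks(taskHint);
}
```

Keeping the capability check separate from the hint check mirrors the "fine-grained layer in addition to capabilities" wording: the capability is a hard gate, and the hint only refines behavior once the gate is open.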
Besides server and client primitives, the protocol offers cross-cutting utility primitives that augment how requests are executed:

- **Tasks (Experimental)**: Durable execution wrappers that enable deferred result retrieval and status tracking for MCP requests (e.g., expensive computations, workflow automation, batch processing, multi-step operations)
I'd be curious to better understand the meaning of the term "experimental" here. Does it suggest that the feature is subject to breaking revisions in the future? Should SDK implementers accordingly annotate all APIs touching task functionality as experimental?
It appears this line is answering my question: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1732/files#diff-ce54e98f0e555c404f17c1180c4a65e774b0ed0ad71e207410030cf08356cf73R18
@eiriktsarpalis Not sure how that clears up the meaning of "experimental" with regard to future breaking changes and necessity of SDK annotations. The line you linked to says:
Tasks are useful for representing expensive computations and batch processing requests, and integrate seamlessly with external job APIs.
Not sure how that happened, I meant to link to this line: https://github.com/modelcontextprotocol/modelcontextprotocol/pull/1732/files#diff-ce54e98f0e555c404f17c1180c4a65e774b0ed0ad71e207410030cf08356cf73R12
Tasks were introduced in version 2025-11-25 of the MCP specification and are currently considered experimental.
The design and behavior of tasks may evolve in future protocol versions.
Which means adding more features and SubTask support.
Maybe so, but if those additions are meant to be incremental I'm not sure why we'd want to qualify the feature as experimental.
The evolutions may not be purely incremental, but we'll see.
The request to mark this as experimental came out of the Core Maintainer reviews, and it's largely because we wanted to get this out in its current form for people to start building on, so we could get real, not-toy feedback on it in the coming months. We had been facing a chicken-and-egg problem over the past several months of discussions between getting this finalized and having people actually trial it in applications (huge thanks to @evalstate for being one of the few willing and able to, btw), which was something the Core Maintainers wanted to see to have confidence that this was definitely the right solution across the board.
I think that in particular, we want to avoid a situation like we have with structuredOutput, where there have been divergences between implementations in SDKs and applications that have led to spotty support that isn't easy to resolve anymore. Releasing this in the upcoming spec release with an experimental label was the way we agreed to signpost the possibility of changing it in upcoming releases -- while simultaneously having enough "officiality" to get it into SDKs, so that people can actually build on it. That will give us a very high degree of confidence that the current core design is as good as it can be (short of net-new additions on top of it) and we can remove the experimental label.
That said, I don't expect much to actually change in the core design. Consider it hedging, since this is such a large addition to MCP 😄
First of all, thank you all very much for your contributions. I'm currently developing a version (for Alibaba), and we also have similar requirements. Internally, we use various methods such as SSE, RocketMQ's litetopic, and asynchronous webhooks to achieve this. Within the company's network environment, we can also notify clients through other means. We've conducted some internal reviews based on this, and I think the current design is quite complete, at least it standardizes end-to-end behavior quite well.
While I don't know the actual implementation in your work, it will definitely involve databases and persistent message queue components to ensure the traceability of the entire task's state transition. For example, in our scenario, a heavy task might, after completion, notify our Agents system, allowing us to restart our agents either locally or on a different machine to continue processing.
I hope to deliver a version based on this first, thereby resolving internal interoperability issues. Deliver value first, then discuss improvements.
* mcp: implement sep-1732 tasks

  Implement full support for both server-side and client-side tasks on tool calls, sampling, and elicitation

  Refs modelcontextprotocol/modelcontextprotocol#1732
* wip
* finish support
Just implemented this at work, works well.
This PR defines Tasks, as proposed in #1686. This improves support for task-based workflows in MCP. It introduces both the task primitive and the associated task ID, which can be used to query the state and results of a task, up to a server-defined duration after the task has completed. This primitive is designed to augment other requests (such as tool calls) to enable call-now, fetch-later execution patterns across all requests for servers that support this primitive.
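The call-now, fetch-later pattern described above can be sketched with a small in-memory store: starting work returns a task ID immediately, status and result are fetched later by ID, and the entry is dropped once a retention window after completion has passed. Everything here is illustrative rather than taken from the spec or the reference implementation: `InMemoryTasks`, the status names, the `ttlMs` parameter, and the injectable clock are all assumptions of this sketch.

```typescript
// Illustrative status names; the actual spec values are not shown in this thread.
type TaskStatus = "working" | "completed" | "failed";

interface TaskEntry<T> {
  status: TaskStatus;
  result?: T;
  error?: string;
  expiresAt?: number; // set once the task reaches a terminal status
}

class InMemoryTasks<T> {
  private readonly entries = new Map<string, TaskEntry<T>>();
  private nextId = 0;

  // ttlMs models the server-defined retention window after completion.
  // The clock is injectable so expiry can be exercised deterministically.
  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now,
  ) {}

  // Call now: kick off the work and hand back a task ID right away.
  start(work: () => Promise<T>): string {
    const taskId = `task-${++this.nextId}`;
    const entry: TaskEntry<T> = { status: "working" };
    this.entries.set(taskId, entry);
    work().then(
      (result) => {
        entry.status = "completed";
        entry.result = result;
        entry.expiresAt = this.now() + this.ttlMs;
      },
      (err) => {
        entry.status = "failed";
        entry.error = String(err);
        entry.expiresAt = this.now() + this.ttlMs;
      },
    );
    return taskId;
  }

  // Fetch later: explicit status query by task ID.
  getStatus(taskId: string): TaskStatus | undefined {
    return this.lookup(taskId)?.status;
  }

  // Fetch later: result retrieval, available only while the entry is retained.
  getResult(taskId: string): T | undefined {
    const entry = this.lookup(taskId);
    return entry?.status === "completed" ? entry.result : undefined;
  }

  // Drop entries whose retention window has elapsed.
  private lookup(taskId: string): TaskEntry<T> | undefined {
    const entry = this.entries.get(taskId);
    if (entry?.expiresAt !== undefined && this.now() >= entry.expiresAt) {
      this.entries.delete(taskId);
      return undefined;
    }
    return entry;
  }
}
```

A caller would hold on to the returned ID, poll `getStatus` until it leaves `"working"`, and then call `getResult`. An in-memory map does not survive restarts, so a production server would need durable storage to deliver the durability the PR describes.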
Motivation and Context
The current MCP specification supports tool calls that execute a request and eventually receive a response, and tool calls can be passed a progress token to integrate with MCP’s progress-tracking functionality, enabling host applications to receive status updates for a tool call via notifications. However, there is no way for a client to explicitly request the status of a tool call, resulting in states where it is possible for a tool call to have been dropped on the server, and it is unknown if a response or a notification may ever arrive. Similarly, there is no way for a client to explicitly retrieve the result of a tool call after it has completed — if the result was dropped, clients must call the tool again, which is undesirable for tools expected to take minutes or more. This is particularly relevant for MCP servers abstracting existing workflow-based APIs, such as AWS Step Functions, Workflows for Google Cloud, or APIs representing CI/CD pipelines, among other applications.
How Has This Been Tested?
Reference implementation: modelcontextprotocol/typescript-sdk#1041
Currently engaging with client application implementors to look at trial implementations in the field.
Breaking Changes
No breaking changes; this is all net-new spec material.