SEP-2557: Adapt Tasks for Stateless & Sessionless Protocol#2557
LucaButBoring wants to merge 36 commits into
@CaitieM20 @markdroth @Randgalt moving discussion here Updated low-level specification language is pending; the SEP document consolidates #2229 and #2339 with additional changes following @markdroth's feedback and discussion with @CaitieM20 a couple of days ago, and factors in the approval of #2260 which makes server-to-client task augmentations invalid (covered in the Rationale section). |
|
Seeing as we're breaking things... Would it be possible to add a |
Yes, I'll add this in the next round of revisions - once MRTR lands we can potentially even just share most of the same types for those fields. |
add diagrams, schemas, http headers, and specify Gettask Behavior.
| ### Task Schema Changes | ||
| The `Task` schema defining the task metadata gains an optional `requestState` (see SEP-2322). We additionally introduce new derived types that inline `result`/`error`/`inputRequests`, to be used by `tasks/get` and `notifications/tasks/status`. This allows us to avoid introducing redundant/bloated fields in `CreateTaskResult`. |
There was a problem hiding this comment.
If we are going to keep requestState (which I still think we shouldn't), we need to spell out the semantics very clearly, and I don't see that we're describing them anywhere.
For example, what exactly is the lifetime of the request state from the client's perspective? Does the client always send the requestState back to the server on the next tasks/get request, but then drop it in favor of the requestState in the tasks/get response -- i.e., the client holds on to it only between two interactions, and the server needs to echo it back on each tasks/get response?
Alternatively, are we expecting that the client will hold on to the last seen requestState for the lifetime of the task, throwing it away only when the server provides a new value?
Are there ordering concerns here? What happens if the client makes two calls to tasks/get in parallel? Or do we expect that clients will never do that? Do we have some way to guarantee that?
I fear that this mechanism is adding more complexity than it may seem on the surface.
There was a problem hiding this comment.
Are you aware that this same mechanism is in MRTR?
There was a problem hiding this comment.
I'm one of the main authors of the MRTR SEP, so I assure you that I am aware of it. :)
There was a problem hiding this comment.
If we are going to keep `requestState` (which I still think we shouldn't), we need to spell out the semantics very clearly, and I don't see that we're describing them anywhere.
It's mentioned in the spec language in this PR for the extended semantics for tasks, but I'm planning to lean on MRTR for the core of it. The current phrasing is:
Under Request State Management:
Servers MAY set an optional `requestState` string on any `Task` object to pass opaque routing or state information back to the client. When a client receives a `Task` with a `requestState` value, it MUST echo back the exact value of that field in the `requestState` field of subsequent `tasks/get` and `tasks/cancel` requests for the same task. The server can use this echoed value to recover routing context or session state without maintaining per-task server-side session data, enabling stateless, load-balanced deployments.
[...]
The `requestState` value is opaque to the client; clients MUST NOT inspect, parse, modify, or make assumptions about its contents. Servers MAY return a different `requestState` value on each `tasks/get` and `tasks/cancel` response; clients MUST always use the most recently received value in their next request. If no `requestState` is present in the server's response, the client MUST NOT include it in the next request. Servers that include `requestState` SHOULD encrypt it to protect confidentiality and integrity, and MUST validate any received `requestState` before acting on it.
Upon receiving a `notifications/tasks/status` notification for a task status update, clients MUST update their tracked `requestState` value with any value provided in the notification, as they would do with a standard response.
...and under Data Types > Task:
`requestState`: Optional opaque string set by the server for stateless routing or state management. Clients MUST echo this value back exactly in subsequent `tasks/get` requests (see Request State Management).
(I forgot to add "and tasks/cancel" there; I will do that now)
What happens if the client makes two calls to `tasks/get` in parallel? Or do we expect that clients will never do that?
Last writer wins, and the server needs to reconcile that. Functionally, this should be no different from the non-parallel case where the client polls less frequently than the state is updating; eventually the server needs to handle the client's last-seen requestState potentially being out of sync with some upstream source of truth.
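As a rough illustration of the client-side rules quoted above (echo the most recent value, drop it when a response omits it, last writer wins under parallel calls), a tracker might look like the sketch below. `TaskLike`, `RequestStateTracker`, and their method names are hypothetical and do not come from the SEP or any SDK. Leaving the stored value untouched when a notification carries no `requestState` is an assumption, since the quoted text only covers notifications that provide one.

```typescript
// Illustrative only: TaskLike and RequestStateTracker are hypothetical names,
// not part of the SEP or any SDK.
interface TaskLike {
  taskId: string;
  requestState?: string; // opaque; never parsed or modified by the client
}

class RequestStateTracker {
  private latest = new Map<string, string>();

  // tasks/get and tasks/cancel responses: the newest value replaces the old
  // one, and a response without requestState clears the stored value.
  observeResponse(task: TaskLike): void {
    if (task.requestState === undefined) {
      this.latest.delete(task.taskId);
    } else {
      this.latest.set(task.taskId, task.requestState);
    }
  }

  // notifications/tasks/status: update only when a value is provided.
  // Assumption: a notification without requestState leaves the stored value alone.
  observeNotification(task: TaskLike): void {
    if (task.requestState !== undefined) {
      this.latest.set(task.taskId, task.requestState);
    }
  }

  // Params for the next tasks/get or tasks/cancel request.
  nextParams(taskId: string): { taskId: string; requestState?: string } {
    const state = this.latest.get(taskId);
    return state === undefined ? { taskId } : { taskId, requestState: state };
  }
}
```

With parallel `tasks/get` calls, whichever response reaches `observeResponse` last determines what is echoed next, which is the "last writer wins" behavior the server has to reconcile.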
There was a problem hiding this comment.
Ah, okay -- guess I didn't read carefully enough. Thanks!
I didn't question the existence of any use case. I questioned whether this is a common enough use-case to be worth adding the complexity to the protocol. I think that whenever we add a mechanism to the protocol, we are imposing a requirement on every client implementation to be able to support it. We should not do that unless we are convinced that there are enough use-cases to justify that overhead.
The MRTR SEP is not a precedent in favor of adding The MRTR SEP explicitly differentiates between ephemeral and persistent workflows. The ephemeral workflow is the one where the server is not storing any state, so using
That's true, but in that case it would be a function of the tool definitions, which means that it would not impose any behavior on MCP clients or SDKs -- the LLM will basically handle the state in that case. I think that's fundamentally different from the proposal to include |
On this particular point - I think there are plenty of use cases if we distinguish the MCP server itself from the worker handling whatever the unit of work is. For example, consider a case where I have an existing Step Functions (or insert-framework-here) workflow that I want to expose an MCP interface over. That Step Functions workflow is effectively a standalone service with its own state independently of my MCP server, which is a different deployment altogether. Being able to leverage opaque, per-request state is very useful here, because it means I do not need to stand up a data store for my MCP server to hold task state in; I can instead simply serialize the task metadata into That means I don't need to modify my Step Functions deployment, I don't need to maintain persistent data alongside my MCP server, and I can trivially handle cases where my application logic needs more than what my workflow executor can express (for example, if I want to keep the TTL/poll interval consistent over the lifetime of a task, across deployments that might change it).
It depends - FWIW the majority of Step Function-backed APIs I've worked with have not had any cancellation support due to the typical execution duration not being long enough for that to make sense. The ones I've worked with that do take that long to execute conversely would actually not even use a TTL in the first place (results retained indefinitely unless explicitly deleted by design). |
| ### Task Flow Change | ||
| We propose simplifying the Task Flow into two methods: `tasks/get` and `tasks/cancel`. |
There was a problem hiding this comment.
After feedback from a few folks who would be implementing this I think we need to add a tasks/update or tasks/continue method.
This is the channel on which the client can send async messages to the server. It is effectively a POST, so we should move Cancel Message & InputResponse messages here.
This ensures that both sides of tasks, get and update, model truly async message passing. One current problem is that when InputResponses are provided on the get method, the server either needs to process them before returning the status or might return something confusing like input_required again. This implies some kind of transaction semantics, which aren't required and add complexity.
This also has the nice benefit that tasks/get is idempotent, while tasks/update is not, so it more closely mimics REST GET & POST semantics.
dsp-ant
left a comment
There was a problem hiding this comment.
In addition: do we need to make security considerations about the task id so it can't be guessed?
| The current flow has a number of problems: | ||
| 1. `tasks/result` is overloaded and calling it to retrieve server requests when in the `input_required` status is unintuitive. | ||
| 2. `tasks/result` is expected to block until the task is completed if called prematurely. This has led to [implementation issues](https://github.com/modelcontextprotocol/java-sdk/pull/755#issuecomment-3806079033). Requires long lived persistent connections, which many clients & servers do not want to implement. This problem still exists even with MRTR. |
There was a problem hiding this comment.
Good catch. This is why Tasks are experimental.
| 2. `tasks/result` is expected to block until the task is completed if called prematurely. This has led to [implementation issues](https://github.com/modelcontextprotocol/java-sdk/pull/755#issuecomment-3806079033). Requires long lived persistent connections, which many clients & servers do not want to implement. This problem still exists even with MRTR. | ||
| 3. The flow is inefficient. Clients must make multiple calls to `tasks/get` to check on the status and then to `tasks/result` to retrieve the final result or required input. | ||
| Task Creation also has a number of issues since it is client-directed instead of being server-directed. Today the client must declare that it wants a task to be created by including the `task` field in the request. This creates several issues: |
There was a problem hiding this comment.
I remember we were quite deliberate about the client creating the task in the past. Do you remember the reasoning @LucaButBoring ?
There was a problem hiding this comment.
It was partially durability, and partially SDKs - it's hard to handle polymorphic results in many programming languages.
Regarding durability, there are other ways to handle that for task-augmented tool calls via tool arguments being used to bootstrap the task ID, and it was determined that layering a broader protocol-wide durability mechanism on top of that would be helpful, but out of scope for tasks themselves.
| We propose simplifying the Task Flow into two methods: `tasks/get` and `tasks/cancel`. | ||
| - The `tasks/get` methods will handle retrieving task statuses and results simultaneously, and will additionally act as the carrier for receiver-to-requestor requests for the purposes of SEP-2322: Multi Round-Trip Requests. This simplifies the flow and allows for polling or streaming updates on a single endpoint. Moreover this more closely matches Long Running Operation APIs where there is a single endpoint that is polled for the status of an operation. |
There was a problem hiding this comment.
This makes sense and I generally agree. Notably, there are two approaches:
- The OS approach of poll()+accept(), which allows the polling to operate on multiple handlers, while accepting only one
- The busy-polling approach from the web, which combines getting + results at the expense that you cannot ignore a task result until you want to deliberately accept it, but you save roundtrips.
Busy polling is probably best here, since it avoids race conditions, reduces roundtrips, and avoids the server having to cache the result for a long time.
| The ResultType field was introduced in [SEP-2322: Multi Round-Trip Requests](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2322) to handle polymorphic results. `Tasks` has the same issue where a server may return a `CallToolResult` or a `CreateTaskResult`. To address this, we propose the addition of the `task` ResultType to indicate that a Response contains a `Task` object. | ||
| ```typescript | ||
| type ResultType = "complete" | "incomplete" | "task"; |
There was a problem hiding this comment.
| type ResultType = "complete" | "incomplete" | "task"; | |
| type ResultType = "taskCompleted" | "taskIncomplete" | "taskCreated"; |
complete and incomplete relate to tasks. Alternatively we can have two types:
type ResultType = "task" | "tool";
type TaskResultStatus = "incomplete" | "complete" | "created"
This way it's clear, and with a simple pattern match, the SDK can distinguish between complete tool results and complete task results.
I might be mistaken here, so correct me if you think the proposed result type is sufficient and clear enough.
There was a problem hiding this comment.
These are two separate concepts - complete is used to identify complete non-task requests, incomplete is used to identify incomplete non-task requests (both come from SEP-2322), and we're extending that with task to specifically identify CreateTaskResult.
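To make the distinction concrete, here is a minimal sketch of how a client might branch on the three values. Only the `ResultType` union comes from the SEP text; the `resultType` property name, the payload fields (`content`, `inputRequests`, `taskId`, `status`), and the `describeResponse` helper are assumptions made for illustration.

```typescript
// Only the ResultType union is taken from the SEP; everything else here is
// an illustrative assumption.
type ResultType = "complete" | "incomplete" | "task";

interface CompleteResult {
  resultType: "complete"; // finished non-task request (SEP-2322)
  content: unknown;
}
interface IncompleteResult {
  resultType: "incomplete"; // non-task request that needs more input (SEP-2322)
  inputRequests: unknown;
}
interface TaskCreatedResult {
  resultType: "task"; // the response carries a Task, i.e. a CreateTaskResult
  taskId: string;
  status: string;
}
type ToolCallResponse = CompleteResult | IncompleteResult | TaskCreatedResult;

function describeResponse(r: ToolCallResponse): string {
  switch (r.resultType) {
    case "complete":
      return "final result available";
    case "incomplete":
      return "server needs more input before it can finish";
    case "task":
      return `task ${r.taskId} created; poll tasks/get`;
  }
}
```

Because each variant carries a distinct literal, an SDK can narrow the response with a single `switch` and never has to guess whether a "complete" value refers to a task.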
| ### Task Creation Changes | ||
| Tasks will no longer be an optional capability, but will instead become a standard part of the protocol that servers MAY choose to implement. |
There was a problem hiding this comment.
I do not like that.
Not all clients want to support tasks. There are many situations where you only want to allow tools and nothing else, for example, because they don't want to support client-side state. These should be free to not have to deal with tasks.
It also means that everyone upgrading to the new protocol MUST support tasks. That's a big ask.
We need to find a way for this to be an expressed capability of the client.
There was a problem hiding this comment.
We can add back execution.taskSupport to allow task-returning tools to be filtered out of tools/list, but as long as task polling is transparent to applications by default (which I want to be the standard implementation path), there shouldn't be any additional burden regarding client state requirements. There are niche use cases where that cannot be the case, but I think the worst outcome is for task support to be optional, because that leads to the status quo where you need to have a state contract between the client and server to use tasks at all.
If at all possible, I think we should make it an invariant and only make it technically possible to opt out of, along the lines of stateless servers today (except in this case it would be stateless clients). The only use case I can see explicitly not being able to handle this is perhaps things like Claude/ChatGPT remote MCP execution, but even then that seems like a stretch. It's worth noting that whether or not polling happens is orthogonal from how long something takes, and we can have either 10-second or 10-hour non-task tool calls (setting aside the transport issues with this) just like we can have 10-second or 10-hour task-augmented tool calls.
There was a problem hiding this comment.
Essentially, I want 90% of host applications to be able to call new task-returning tools without making any code changes themselves - and then we can add an escape hatch for the remaining 10%.
There was a problem hiding this comment.
@LucaButBoring The problem with adding execution.taskSupport is that clients that don't support it need to start supporting this. Existing clients that are not task aware would accidentally start calling tasks. I wonder if we can separate this out. For example, by adding a task/list or tools/list(tasks: true) (both seem to be not good ideas, but you get the idea).
There was a problem hiding this comment.
Without Tasks being mandatory, it will be difficult to provide tools that leverage them until there is a point of critical adoption among clients. To cover the same user journeys we want to cover with tasks, we would potentially need to provide different sets of tools for different capabilities, which significantly complicates server side implementations.
There was a problem hiding this comment.
It makes sense that we want people to support tasks, but we can't force anyone. If we make it mandatory, the very likely outcomes are either:
- People won't adopt the new spec because they don't have bandwidth to add tasks support to their product, or
- People will advertise the new spec version, but not actually implement tasks, maybe just stubbing out any client-side integration if using the SDKs.
Neither of these actually helps server developers looking to leverage tasks.
I think it might be a different story if tasks were trivial to implement, but the fact that they are asynchronous makes them often non-trivial to integrate. It may even be practically impossible for some clients (e.g. a stateless endpoint backed by an agent that has to respond in seconds).
There was a problem hiding this comment.
People won't adopt the new spec because they don't have bandwidth to add tasks support to their product
The point I've been pushing since the previous iteration of tasks which I don't think I've been able to get people to grasp is that client SDKs can (and should) encapsulate the polling by default. The default case should be that callTool just handles tasks under the hood, so the minimum bandwidth to add tasks is upgrading a dependency.
If applications then decide to have richer integrations with tasks (for example, background execution), they should be able to drop down a level of abstraction and leverage the lifecycle directly.
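A rough sketch of what "callTool just handles tasks under the hood" could look like follows. Every name here is hypothetical: `Transport`, `callToolBlocking`, and the injected `sleep` are not from any SDK, and the field names and the 1000 ms fallback interval are assumptions. Statuses other than `working` and `completed` (including `input_required`) simply surface as errors in this sketch.

```typescript
// Hypothetical shapes; not taken from the SEP schema or any SDK.
type Task = {
  taskId: string;
  status: "working" | "input_required" | "completed" | "failed" | "cancelled";
  pollInterval?: number; // milliseconds (assumed)
  result?: unknown;
};
type CallOutcome =
  | { resultType: "complete"; result: unknown }
  | ({ resultType: "task" } & Task);

// The transport and the sleep function are injected so the loop stays testable.
interface Transport {
  callTool(name: string, args: unknown): Promise<CallOutcome>;
  getTask(taskId: string): Promise<Task>;
  sleep(ms: number): Promise<void>;
}

// Blocking wrapper: callers get the final result whether or not the server
// chose to create a task.
async function callToolBlocking(t: Transport, name: string, args: unknown): Promise<unknown> {
  const first = await t.callTool(name, args);
  if (first.resultType === "complete") return first.result;

  let task: Task = first;
  while (task.status === "working") {
    await t.sleep(task.pollInterval ?? 1000); // fallback interval is an assumption
    task = await t.getTask(task.taskId);
  }
  if (task.status === "completed") return task.result;
  throw new Error(`task ${task.taskId} ended with status ${task.status}`);
}
```

An application that wants background execution would skip the wrapper and drive `getTask` itself, which is the "drop down a level of abstraction" path described above.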
It may even be practically impossible for some clients (e.g. a stateless endpoint backed by an agent that has to respond in seconds).
duration is orthogonal to tasks
There was a problem hiding this comment.
I agree with a couple of key points made above.
First, quoting @kurtisvg:
Without Tasks being mandatory, it will be difficult to provide tools that leverage them until there is a point of critical adoption among clients.
I think it's important to keep in mind that in our new stateless-by-default protocol, tasks is the mechanism to provide stateful functionality. If we don't make tasks a mandatory part of the protocol, then we're basically saying that server authors can't actually write stateful applications, because they have no guarantee that clients will support them.
In addition, I would argue that making tasks mandatory is consistent with one of the key principles that @dsp-ant articulated in our December meeting, which was to minimize the number of optional things in the protocol, since optional things themselves add complexity and fragment the ecosystem.
Next, quoting @LucaButBoring:
The point I've been pushing since the previous iteration of tasks which I don't think I've been able to get people to grasp is that client SDKs can (and should) encapsulate the polling by default. The default case should be that `callTool` just handles tasks under the hood, so the minimum bandwidth to add tasks is upgrading a dependency.
I think that's a really important point. It should be feasible for SDKs to handle the burden of supporting tasks, so individual clients and servers don't need to worry about it.
I think that if we do this right, making tasks mandatory will actually be much lower burden than making it optional.
| ### Remove/alter features made obsolete by upcoming SEPs | ||
| 1. We will remove the concept of client-hosted tasks (Sampling & Elicitation), as SEP-2260 disallows unsolicited server-to-client requests | ||
| 2. We will remove the optional `tasks/list` operation, as SEP-2567 removes sessions which was the only defined scope for listing tasks between a server and a client. We may expand task support to additional client-to-server request types in the future, and implementors are still advised against implementing tasks as a tool-specific protocol operation. |
There was a problem hiding this comment.
I understand the why, but I am unsure about this. What happens if clients crash or lose track? How can they sync with the source of truth (the server)? I can be convinced that this is okay for all the good reasons, but I do worry about this piece.
There was a problem hiding this comment.
Host applications would have to handle this particular case, which they can do by listening on the CreateTaskResult for a task-augmented tool call and persisting the task ID, which they can then resume listening on upon resumption. It can be handled like #2567 treats session handles.
| We will add the following language to the specification to define the new flow and the expected behavior of `tasks/get`: | ||
| Upon receiving a `tasks/get` request with `inputResponses`, the server MUST process the provided responses and update the task state accordingly. The server MAY choose to transition the task back to `working` status if it determines that the provided input is sufficient to continue processing. |
There was a problem hiding this comment.
I don't think we should conflate task/get with inputResponses. Providing input and getting a status should be distinct things, otherwise task/get can't be idempotent, which can make things difficult for gateway implementors, etc. to reason about. We should probably have these two:
- task/get or task/poll for getting status and result. Idempotent.
- task/continue or task/update for providing mutable changes to a task.
| | ------------ | ----------------------------- | ------------------------------------------------------ | | ||
| | `Mcp-Method` | `method` | All requests and notifications | | ||
| | `Mcp-Name` | `params.name` or `params.uri` | `tools/call`, `resources/read`, `prompts/get` requests | | ||
| | `Mcp-Name` | `params.taskId` | `tasks/get`, `tasks/cancel` requests | |
There was a problem hiding this comment.
I forgot why we went with one header, and I vaguely remember it was my idea, which in retrospect seems not so smart, since we are overloading the meaning of the header.
I wonder if we should have a separate MCP-Task-Id header. What do load balancers prefer here? @markdroth what do you think?
There was a problem hiding this comment.
IIRC, the reason you suggested a single Mcp-Name header name was to avoid having to define a separate header name for each of tools, resources, and prompts -- i.e., the alternative would have been Mcp-Tool-Name for tools, Mcp-Resource-Name for resources, etc. (@kurtisvg can confirm my recollection here.)
If we have this established pattern where Mcp-Name basically indicates the object being accessed, independent of type (i.e., tool, resource, etc), then it seems reasonable to extend this same pattern to tasks. For a tool call, Mcp-Name indicates the name of the tool; for a resource read call, Mcp-Name indicates the name of the resource; and for a task call, Mcp-Name would indicate the name of the task.
There was a problem hiding this comment.
IMO, I thought separate headers seemed better because it gives more flexibility, but I don't have a strong feeling or concrete use-case for it.
I do wonder whether tasks should have the original tool name in the header so they can be routed to the same place they were created from. But maybe that could be accomplished by putting the tool name in the taskId somehow.
There was a problem hiding this comment.
I think putting the tool name in the task ID would work fine. And that case is probably a good argument for using the same header name, actually, since that will allow network data planes to always route based on the same header.
At the end of the day, we can make this work with either the same header or different headers. But I do feel that we should be consistent in how we do this across tools, resources, prompts, and tasks: if we use a separate header for tasks, then I think we should do so for the other features as well.
@dsp-ant That part is noted under Security Considerations > Task Isolation and Access Control in the spec already:
The gist being that task access should be authenticated, but if it's not, you should make task IDs resistant to guessing. |
I see, so this would be a case where the tool implementation needs to use the tool parameters after the dependent request is complete? I guess I can see that. I'm still a bit skeptical as to how many such use-cases there are in reality. But I seem to be in the minority here, so I'll stop pressing the point -- I don't want to be in the way of this SEP moving forward, since I think it's a very positive change overall.
Doesn't that mean that in any case where the client disappears, the results will be retained indefinitely? That seems like it's going to wind up accumulating increasing amounts of used data over time. |
Yes, though in the services that do this, that's an acceptable cost; things where the result is basically a metadata record that you want to maintain indefinitely for auditing purposes etc., but can also be consumed programmatically if needed (generally it's not technically indefinitely, but it's only deleted when some outer containing resource is deleted). |
|
Just discussed additional feedback - tentative changes are:
|
| - Consolidate polling lifecycle: `tasks/get` now handles status retrieval, result delivery, and server-to-client requests | ||
| - Remove `tasks/result` method: results and errors are inlined into the `Task` object | ||
| - Remove client-hosted tasks: per SEP-2260, client-hosted task polling is conceptually invalid | ||
| - Add `DetailedTask` types: `InputRequiredTask`, `CompletedTask`, `FailedTask` for typed task states |
There was a problem hiding this comment.
Is DetailedTask a task-type property of the Task class? Why add more class definitions instead of using status? What's the benefit here, given that the client needs to serialize them to different task objects?
There was a problem hiding this comment.
DetailedTask is a subclass of Task, if you want to look at it that way
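One way to picture the subclass relationship is a discriminated union keyed on `status`, sketched below. The three subtype names come from the SEP summary; `BaseTask`, `PlainTask`, `GetTaskResult`, `summarize`, the exact status strings, and the payload shapes are assumptions for illustration rather than the SEP's schema.

```typescript
// Subtype names follow the SEP summary; shapes and status strings are assumed.
interface BaseTask {
  taskId: string;
  requestState?: string;
}
interface PlainTask extends BaseTask {
  status: "working" | "cancelled"; // no inlined payload
}
interface InputRequiredTask extends BaseTask {
  status: "input_required";
  inputRequests: unknown[];
}
interface CompletedTask extends BaseTask {
  status: "completed";
  result: unknown;
}
interface FailedTask extends BaseTask {
  status: "failed";
  error: { message: string };
}
type DetailedTask = InputRequiredTask | CompletedTask | FailedTask;
type GetTaskResult = PlainTask | DetailedTask;

// The status check narrows the type, so each payload field is only
// reachable in the state where it is defined.
function summarize(t: GetTaskResult): string {
  switch (t.status) {
    case "completed":
      return `done: ${JSON.stringify(t.result)}`;
    case "failed":
      return `failed: ${t.error.message}`;
    case "input_required":
      return `needs ${t.inputRequests.length} input(s)`;
    default:
      return `status: ${t.status}`;
  }
}
```

The benefit over a single `Task` with optional fields is that a client cannot read `result` on a task that is still working without the type checker objecting.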
| This SEP Removes all capabilities related to Tasks. | ||
| - SEP-2567 makes `tasks/list` impossible to support. |
There was a problem hiding this comment.
nit: It was never tied to sessions (which is good!). The current spec is around auth context.
| 1. We will remove the concept of client-hosted tasks (Sampling & Elicitation), as SEP-2260 disallows unsolicited server-to-client requests | ||
| 2. We will remove the optional `tasks/list` operation, as SEP-2567 removes sessions which was the only defined scope for listing tasks between a server and a client. We may expand task support to additional client-to-server request types in the future, and implementors are still advised against implementing tasks as a tool-specific protocol operation. | ||
| 3. We will change `Task.ttl` to be expressed in integer seconds to align with SEP-2549, which matches standard HTTP conventions. |
There was a problem hiding this comment.
Need to align with SEP-2549 but I'd really like this to be called ttlSeconds so that we completely remove the ambiguity. It also forces people to visit their code (which will be interpreting it as milliseconds right now, and would break on this change).
| ### Task Creation Changes | ||
| Tasks will no longer be an optional capability, but will instead become a standard part of the protocol that servers MAY choose to implement. |
There was a problem hiding this comment.
To be clear: we're keeping Tasks as experimental, but making it mandatory to implement for clients?
There was a problem hiding this comment.
If so, I think this requires some justification since it feels very bad to mandate folks to implement and maintain an experimental feature with no choice.
There was a problem hiding this comment.
I think it is worth discussing, I originally wanted this SEP to bring tasks out of experimental but that got walked back, and now puts this point on shaky ground.
There was a problem hiding this comment.
The main reason I wanted this to involve bringing tasks out of an experimental state is to avoid a repeat of the last release, where SDK maintainers decide that experimental means forcing client applications to use entirely different API surfaces for servers that might return tasks - when the alternative would be making it transparent with hooks into the task lifecycle for deeper application integration with them.
That meant adopting tasks in any client application was a big lift, even if you just want to start by wrapping them in a blocking interface as a stepping stone.
There was a problem hiding this comment.
Implementation reference: https://github.com/strands-agents/sdk-python/pull/1475/changes
| "createdAt": "2025-11-25T10:30:00Z", | ||
| "lastUpdatedAt": "2025-11-25T10:40:00Z", | ||
| "ttl": 60, | ||
| "pollInterval": 5000 |
There was a problem hiding this comment.
If we're changing ttl to seconds we should change pollInterval to seconds also for consistency OR rename it to pollIntervalMs (and if we go for seconds, it should be pollIntervalSeconds).
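The mixed units are easy to get wrong in client code. The sketch below uses the values from the example payload above (`ttl: 60` in seconds, `pollInterval: 5000` in milliseconds) and assumes that `ttl` is measured from `createdAt`. `TaskTiming`, `expiresAt`, and `maxPollsWithinTtl` are hypothetical helper names.

```typescript
// Hypothetical helpers; assumes ttl is in seconds (per this SEP) and counted
// from createdAt, while pollInterval is still in milliseconds.
interface TaskTiming {
  createdAt: string; // ISO 8601 timestamp
  ttl: number; // seconds
  pollInterval: number; // milliseconds
}

function expiresAt(t: TaskTiming): Date {
  return new Date(Date.parse(t.createdAt) + t.ttl * 1000);
}

function maxPollsWithinTtl(t: TaskTiming): number {
  // Convert ttl to milliseconds before comparing against pollInterval.
  return Math.floor((t.ttl * 1000) / t.pollInterval);
}

const example: TaskTiming = {
  createdAt: "2025-11-25T10:30:00Z",
  ttl: 60,
  pollInterval: 5000,
};
console.log(expiresAt(example).toISOString()); // 2025-11-25T10:31:00.000Z
console.log(maxPollsWithinTtl(example)); // 12
```

A client that kept treating `ttl` as milliseconds would compute an expiry 60 ms after creation, which is the silent breakage that an explicit unit suffix is meant to prevent.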
| - SEP-2567 makes `tasks/list` impossible to support. | ||
| - SEP-2260 disallows unsolicited server-to-client requests, which invalidates the concept of client tasks for elicitation and sampling operations. | ||
| - This SEP makes Tasks a standard part of the protocol, and not a negotiated capability. | ||
| - This SEP Makes `tasks/cancel` required to be supported, even if a server does not wish to support tasks. |
There was a problem hiding this comment.
Why would a server need to support tasks/cancel if it does not support tasks?
There was a problem hiding this comment.
I was thinking that would imply that it needs a capability, and it would be silly to have a capability for cancellation that is disjoint from supporting tasks in general (which if at all possible I also want to remove all capabilities for).
In practice this is a non-issue because there are no valid task IDs to use with a server that never creates tasks, so every tasks/cancel request would fail.
| [SEP-2243](./2243-http-standardization.md) introduces standard headers in the Streamable HTTP Transport to facilitate more efficient routing. Routing on `TaskId` is also desirable since there is often state associated with a specific Task that needs to be consistently routed to the same server instance. SEP-2243 requires that all requests and notifications declare an `Mcp-Method` header. | ||
| We will extend this with semantics for the `tasks/get` and `tasks/cancel` requests, requiring that the `Mcp-Name` header MUST be set to the value of `params.taskId` by the client when making `tasks/get` and `tasks/cancel` requests over the Streamable HTTP Transport. |
There was a problem hiding this comment.
This is a problem for unauthenticated servers (which tasks currently allow) since the task ID becomes the sole authz token and now we're saying that it is in a header, which gateways etc. will treat as non-sensitive.
If we're going to do this, I think we either need to just disallow tasks on unauth'd servers, or make it very clear that users should not have security expectations around access to the task.
There was a problem hiding this comment.
Mcp-Task-Id may be preferable since it has different cardinality to e.g. tool names and allows operators to mask it independently.
There was a problem hiding this comment.
I think it's fine to say that users should not store security-sensitive information in the task ID.
I think that any data plane that needs to do routing based on this can detect the task case by looking at the MCP method. We don't necessarily need a different header name to do that.
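For reference, the single-header variant under discussion can be sketched as a small mapping from a request to its routing headers. The header names and method-to-field mapping follow the table and the proposed language quoted earlier; `McpRequest`, `Params`, and `routingHeaders` are hypothetical names, and any escaping of header values is deliberately left out.

```typescript
// Hypothetical helper; header names and the field mapping follow the table
// in the SEP, everything else is illustrative.
type Params = { name?: string; uri?: string; taskId?: string };
type McpRequest = { method: string; params?: Params };

function routingHeaders(req: McpRequest): Record<string, string> {
  const headers: Record<string, string> = { "Mcp-Method": req.method };
  const p: Params = req.params ?? {};
  let name: string | undefined;
  switch (req.method) {
    case "tools/call":
    case "prompts/get":
      name = p.name;
      break;
    case "resources/read":
      name = p.uri;
      break;
    case "tasks/get":
    case "tasks/cancel":
      name = p.taskId; // proposed extension in this SEP
      break;
  }
  if (name !== undefined) headers["Mcp-Name"] = name;
  return headers;
}
```

A data plane that wants to treat task IDs differently (for example, masking them in logs) can key off `Mcp-Method` being `tasks/get` or `tasks/cancel`, which is the argument made above for not needing a separate header name.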
|
I just received confirmation that Lead Maintainers are aligned on the decision from #2639 to make Tasks an official extension rather than an experimental core spec feature - I'm creating a new SEP to cover how that looks and will roll in the planned changes I outlined in #2557 (comment). Task changes are still scoped for the next spec release, just in a different form. Will supersede this PR once I have the main SEP document done, will be out by 4/30 at the very latest (as that's the lock date for the 6/30 spec). |
| - `outputSchema`: Optional JSON Schema defining expected output structure | ||
| - Follows the [JSON Schema usage guidelines](/specification/draft/basic#json-schema-usage) | ||
| - Defaults to 2020-12 if no `$schema` field is present | ||
| - `execution`: Optional execution settings for the tool |
There was a problem hiding this comment.
Does this need proper schema representation?
| - Remove client-hosted tasks: per SEP-2260, client-hosted task polling is conceptually invalid | ||
| - Add `DetailedTask` types: `InputRequiredTask`, `CompletedTask`, `FailedTask` for typed task states | ||
| - Add `ResultType` discriminator: use `"task"` value to indicate responses containing Task objects | ||
| - Deprecate capability declarations: `tasks` (client), `tasks.cancel`, and `tasks.requests.tools.call` (server) |
There was a problem hiding this comment.
They're not really deprecated - they're removed.
| N/A | ||
| 1. Adapt tasks for a stateless and sessionless protocol ([SEP-2557](/seps/2557-adapt-tasks-for-stateless-and-sessionless-protocol)): | ||
| - Allow unsolicited task creation: servers can return `CreateTaskResult` even when clients don't request a task | ||
| - Allow immediate results: servers can return immediate results even when clients request a task |
There was a problem hiding this comment.
"When clients request a task" - they're not really requesting one anymore, do they?
| }, | ||
| "CreateTaskResult": { | ||
| "description": "The result returned for a task-augmented request.", | ||
| "description": "The result returned for a task-augmented request.\n\nThis type uses the base {@link Task} shape intentionally: `CreateTaskResult` represents\ntask metadata only, and MUST NOT include `result`, `error`, or `inputRequests` fields.\nThose fields are reserved for {@link DetailedTask} subtypes returned by\n{@link GetTaskRequest | tasks/get} and {@link TaskStatusNotification | notifications/tasks/status}.\n\nReceivers MAY return this result even when the request does not include a `task` field,\nallowing unsolicited task creation. When this occurs, requestors MUST handle the task\nby polling tasks/get.\n\nReceivers MUST NOT return a CreateTaskResult unless and until a tasks/get request would\nreturn that task; that is, in eventually-consistent systems, receivers MUST wait for\nconsistency before returning the CreateTaskResult.", |
There was a problem hiding this comment.
Receivers/requestors - can we use clients/servers terminology here?
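As a rough illustration of the MUST NOT in the quoted description, a conformance check might look like the sketch below. It assumes the reserved fields would appear as top-level keys of the result object, which the quoted text does not state explicitly; `reservedFieldsPresent` is a hypothetical helper name.

```typescript
// Fields the quoted description reserves for DetailedTask subtypes.
const RESERVED_FIELDS = ["result", "error", "inputRequests"] as const;
type ReservedField = (typeof RESERVED_FIELDS)[number];

// Hypothetical check: returns the reserved fields that a candidate
// CreateTaskResult carries. An empty array means the object is metadata-only.
// Assumption: reserved fields would be top-level keys of the result object.
function reservedFieldsPresent(
  createTaskResult: Record<string, unknown>
): ReservedField[] {
  return RESERVED_FIELDS.filter((field) => field in createTaskResult);
}
```

A server-side test suite could assert that this returns an empty array for every `CreateTaskResult` it emits.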
|
|
||
| 1. Removes/alters features made obsolete by upcoming SEPs. | ||
| 2. Task Creation is determined by the Server. Removed the `task` parameter from tool call requests. | ||
| 3. Servers MUST support `task/cancel` operation. |
There was a problem hiding this comment.
| 3. Servers MUST support `task/cancel` operation. | |
| 3. Servers MUST support `tasks/cancel` operation. |
|
|
||
| Upon receiving a `tasks/get` request, the server MUST check the status of the task and respond accordingly: | ||
|
|
||
| 1. if the status is `working` the server MUST return a a `Task` object with status `working`. |
There was a problem hiding this comment.
| 1. if the status is `working` the server MUST return a a `Task` object with status `working`. | |
| 1. if the status is `working` the server MUST return a `Task` object with status `working`. |
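From the client's side, the status-driven `tasks/get` behavior quoted above could be consumed along these lines. This is a sketch under assumptions: the `TaskView` union loosely mirrors the `InputRequiredTask`, `CompletedTask`, and `FailedTask` types mentioned elsewhere in this PR, the `completed` and `failed` status strings are assumed rather than quoted here, and any other statuses are omitted. `nextStep` and `NextStep` are hypothetical names.

```typescript
// Assumed shapes; only `working` and `input_required` are named in the quoted text.
type TaskView =
  | { status: "working" }
  | { status: "input_required"; inputRequests: unknown }
  | { status: "completed"; result: unknown }
  | { status: "failed"; error: unknown };

type NextStep =
  | "poll_again"
  | "answer_input_requests"
  | "use_result"
  | "report_error";

// Hypothetical decision function: maps a tasks/get response to what the
// client does next, without a separate tasks/result call.
function nextStep(task: TaskView): NextStep {
  switch (task.status) {
    case "working":
      return "poll_again";
    case "input_required":
      return "answer_input_requests";
    case "completed":
      return "use_result";
    case "failed":
      return "report_error";
  }
}
```

The point of the sketch is that a single response shape drives the whole polling loop.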
|
|
||
| ### Changes to support upcoming SEPs | ||
|
|
||
| A number of changes are necessay to support the upcoming changes to the spec listed in the SEPs above. |
There was a problem hiding this comment.
| A number of changes are necessay to support the upcoming changes to the spec listed in the SEPs above. | |
| A number of changes are necessary to support the upcoming changes to the spec listed in the SEPs above. |
|
|
||
| A number of changes are necessay to support the upcoming changes to the spec listed in the SEPs above. | ||
|
|
||
| **[SEP-2260](./2260-Require-Server-requests-to-be-associated-with-Client-requests.md)** disallows unsolicited server-to-client requests, which invalidates the concept of client tasks for elicitation and sampling operations. This proposal removes client-hosted tasks & their associated capabiliteis since these scenarios are no longer supported. |
There was a problem hiding this comment.
| **[SEP-2260](./2260-Require-Server-requests-to-be-associated-with-Client-requests.md)** disallows unsolicited server-to-client requests, which invalidates the concept of client tasks for elicitation and sampling operations. This proposal removes client-hosted tasks & their associated capabiliteis since these scenarios are no longer supported. | |
| **[SEP-2260](./2260-Require-Server-requests-to-be-associated-with-Client-requests.md)** disallows unsolicited server-to-client requests, which invalidates the concept of client tasks for elicitation and sampling operations. This proposal removes client-hosted tasks & their associated capabilities since these scenarios are no longer supported. |
| 2. `tasks/result` is expected to block until the task is completed if called prematurely. This has led to [implementation issues](https://github.com/modelcontextprotocol/java-sdk/pull/755#issuecomment-3806079033). Requires long lived persistent connections, which many clients & servers do not want to implement. This problem still exists even with MRTR. | ||
| 3. The flow is inefficient. Clients must make multiple calls to `tasks/get` to check on the status and then to `tasks/result` to retrieve the final result or required input. | ||
|
|
||
| Task Creation also has a number of issues since it is client-directed instead of being server-directed.Today the client must declare that it wants a task to be created by including the `task` field in the request. This creates several issues: |
There was a problem hiding this comment.
| Task Creation also has a number of issues since it is client-directed instead of being server-directed.Today the client must declare that it wants a task to be created by including the `task` field in the request. This creates several issues: | |
| Task Creation also has a number of issues since it is client-directed instead of being server-directed. Today the client must declare that it wants a task to be created by including the `task` field in the request. This creates several issues: |
|
|
||
| ### On Polymorphism | ||
|
|
||
| In [SEP-1686](./1686-tasks.md), we explicitly chose not to introduce support for unsolicitated task creation, as this would have required all implementations to break all method contracts by allowing `CreateTaskResult` to be returned in addition to the non-task result shape. This proposal explicitly rejects that argument, opting to consider `CreateTaskResult` as something akin to a JSON-RPC error, which already needed to be handled in the standard result path. Implementations already needed to branch response handling for error response shapes - `CreateTaskResult` is different in that rather than being a different JSON-RPC envelope shape, it is a different subset shape of a JSON-RPC result. |
There was a problem hiding this comment.
| In [SEP-1686](./1686-tasks.md), we explicitly chose not to introduce support for unsolicitated task creation, as this would have required all implementations to break all method contracts by allowing `CreateTaskResult` to be returned in addition to the non-task result shape. This proposal explicitly rejects that argument, opting to consider `CreateTaskResult` as something akin to a JSON-RPC error, which already needed to be handled in the standard result path. Implementations already needed to branch response handling for error response shapes - `CreateTaskResult` is different in that rather than being a different JSON-RPC envelope shape, it is a different subset shape of a JSON-RPC result. | |
| In [SEP-1686](./1686-tasks.md), we explicitly chose not to introduce support for unsolicited task creation, as this would have required all implementations to break all method contracts by allowing `CreateTaskResult` to be returned in addition to the non-task result shape. This proposal explicitly rejects that argument, opting to consider `CreateTaskResult` as something akin to a JSON-RPC error, which already needed to be handled in the standard result path. Implementations already needed to branch response handling for error response shapes - `CreateTaskResult` is different in that rather than being a different JSON-RPC envelope shape, it is a different subset shape of a JSON-RPC result. |
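The branching argument in the paragraph above can be made concrete with a small sketch. It assumes the `ResultType` discriminator mentioned in the changelog is serialized as a `resultType` field on the result object, which is a guess at the field name; `classify` and the response types are hypothetical and cover only what the example needs.

```typescript
// Minimal JSON-RPC response shapes for illustration.
type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: number | string; error: { code: number; message: string } }
  | { jsonrpc: "2.0"; id: number | string; result: Record<string, unknown> };

type ResponseKind = "error" | "task" | "immediate";

// Hypothetical classifier: clients already branch on the error envelope;
// the task case adds one more branch inside the result path.
// Assumption: the discriminator is a `resultType` field with value "task".
function classify(response: JsonRpcResponse): ResponseKind {
  if ("error" in response) {
    return "error";
  }
  return response.result.resultType === "task" ? "task" : "immediate";
}
```

Under this reading, handling an unsolicited task costs a client one extra comparison on a path it already inspects.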
This proposal builds on tasks by introducing several simplifications to the original functionality to prepare the feature for stabilization, following implementation and usage feedback since the initial experimental release. In particular, this proposal allows tasks to be returned in response to non-task requests to remove unneeded stateful handshakes, and it collapses `tasks/result` into `tasks/get`, removing the error-prone interaction between the `input_required` task status and the `tasks/result` method. Additionally, following the acceptance of SEP-2260: Require Server requests to be associated with a Client request, we are removing client-hosted elicitation/sampling tasks, as they further complicate the transport-related interactions that SEP-2260 intends to simplify.

Motivation and Context

Tasks were introduced in an experimental state in the `2025-11-25` specification release, serving as an alternate execution mode for certain request types (tool calls, elicitation, and sampling) to enable polling for the result of a task-augmented operation.

As we move toward stabilizing tasks in the June specification release, there are a few challenges to resolve to make them easier to implement support for in both the host application and on the server:

- `input_required` is a transition point into SSE side-channeling: In Streamable HTTP, tasks rely on some SSE stream being used to deliver elicitations and sampling, but cannot define which stream is used, instead requiring `tasks/result` to be called prematurely to have a consistent option to open a stream on, from the server’s perspective.
- `tasks/result` is a blocking method: To enable being used as a vehicle for SSE side-channeling, `tasks/result` is not allowed to return until the entire operation is complete, complicating polling implementations for clients and making servers pay a heavier cost for using elicitation or sampling at all, and undermining the incentives for servers to use tasks in certain cases.

Furthermore, in #2322 (Multi Round-Trip Requests), we identified that we would need to make a breaking change to `tasks/result` for these changes anyways, but that redesigning the flow within #2322 would be a scope expansion that would derail MRTR discussion. Regardless, MRTR relies heavily on tasks as a solution for "persistent" requests that require server-side state, so these two proposals are somewhat interdependent.

To both improve the adoption of tasks and to reduce their upfront messaging overhead, this proposal simplifies their execution model by allowing peers to raise unsolicited tasks to each other and consolidating the polling lifecycle entirely into the `tasks/get` method.

How Has This Been Tested?
TBD
Breaking Changes
Several; described in detail in the proposal.
Types of changes
Checklist
Additional context
Supersedes #2229 and #2339.
AI Use Disclosure: The core SEP document was written entirely by me, but the actual specification and schema changes were written with Claude Code. The LLM-annotated diff validating the specification and schema changes against the SEP requirements is available here.