SEP-2678: Introduce additional error codes to protocol #2678
MatthewKhouzam wants to merge 1 commit into
Introduce -32000/-32001/-32002 to match the implementation in FastMCP. This change paves the way towards "interceptors" or "middleware" standardization. This change was assisted with claude opus 4.7
Signed-off-by: Matthew Khouzam <matthew.khouzam@ericsson.com>
@reviewers, I would like to discuss whether my approach is correct and in line with your roadmap before merging. This patch is trivial, but I don't know if it's useful for your goals.
Thanks for sharing this in the AAIF Identity & Trust WG. Good idea to align these three codes from FastMCP. On your open question #2, I'd argue that rate limit should get its own code (-32003), unless we're worried about eventually running out of code numbers. HTTP 429 has been so useful.
What I would recommend for rate limit is to put it in both FastMCP and here.
Are there any next steps?
pja-ant left a comment
Thanks for putting this together. I think more specificity around error codes in the specification is a good thing, and welcome more attention here. That being said, I think this SEP as-is needs work.
We need to look more closely at what the existing SDKs are doing and think more holistically about what the right semantics are for the error codes. For example, I would definitely not suggest putting timeouts and permission-denied errors under one code. If we are going to specify something, we need to do it right and consider how clients will need to respond to different kinds of errors.
I'd be happy to sponsor this once it's ready.
    // Implementation-specific JSON-RPC error codes [-32000, -32099]
    export const SERVER_ERROR = -32000;
    export const NOT_FOUND = -32001;
    export const RESOURCE_NOT_FOUND = -32002;
This has changed in #2678 (going out in new spec release).
> | -------- | ------------------ | ----------------------------------------- | ----------------------------------------------------------------------------- |
> | `-32002` | Resource Not Found | Requested resource does not exist | `FileNotFoundError`, `KeyError`, `NotFoundError` on `resources/*` methods |
> | `-32001` | Not Found | Requested entity not found (non-resource) | `FileNotFoundError`, `KeyError`, `NotFoundError` on non-`resources/*` methods |
> | `-32000` | Server Error | Server-side failure | `PermissionError`, `TimeoutError`, or rate limit exceeded |
I don't like these "Triggered by" conditions. A permissions error, timeout error and rate limit exceeded are all very specific kinds of error that require different behavior from the client:
- Permission error: let the user know that they have no permission, do not retry
- Timeout error: server may have just died mid processing, retry later
- Rate limit: retry but with back-off
Mixing all of these into one error is the wrong thing to do. If anything, it would be preferable to separate them out so that clients can act appropriately.
edit: I see this is partially noted in the open question #2. Yes, I'd recommend splitting out here and thinking through what else should be split out.
Note also: -32003 and -32004 have come into use over the last few weeks, so please take that into account.
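The three client behaviors listed in this comment can be sketched as a small decision function. This is only an illustration of why a client benefits from distinguishable errors: the `ErrorKind` labels, the `decideAction` helper, and the delay constants are all hypothetical, and the sketch deliberately uses string labels instead of numeric codes because no separate codes have been assigned.

```typescript
// Hypothetical error categories; neither the SEP nor the protocol defines these labels.
type ErrorKind = "permission_denied" | "timeout" | "rate_limited";

type ClientAction =
  | { kind: "surface_to_user"; message: string }
  | { kind: "retry"; delayMs: number };

// Assumed tuning values, chosen only for illustration.
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 30000;
const TIMEOUT_RETRY_DELAY_MS = 5000;

// Maps an error category to the behavior described in the review comment.
// `attempt` is the zero-based count of retries already made.
function decideAction(error: ErrorKind, attempt: number): ClientAction {
  switch (error) {
    case "permission_denied":
      // Tell the user; retrying cannot succeed.
      return {
        kind: "surface_to_user",
        message: "You do not have permission to perform this action.",
      };
    case "timeout":
      // The server may have died mid-processing; retry later at a fixed delay.
      return { kind: "retry", delayMs: TIMEOUT_RETRY_DELAY_MS };
    case "rate_limited":
      // Retry with exponential back-off, capped at MAX_DELAY_MS.
      return {
        kind: "retry",
        delayMs: Math.min(BASE_DELAY_MS * Math.pow(2, attempt), MAX_DELAY_MS),
      };
  }
}
```

If all three conditions arrive under one code, a client has no way to choose between these branches without parsing the error message.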
> same semantic purposes. Standardizing the values the ecosystem is already
> converging on is less disruptive than picking new ones.
Do we have evidence that the ecosystem is converging on these? I had Claude look at all the SDKs and this is what it found:
| SDK | -32000 | -32001 |
|---|---|---|
| TypeScript | Raw literal: generic server error in Streamable HTTP transport (bad request, no session header, DNS-rebinding, wrong content-type…). Was the named ErrorCode.ConnectionClosed in v1. | Raw literal: "Session not found" (HTTP 404). Was the named ErrorCode.RequestTimeout in v1. |
| Python | CONNECTION_CLOSED: connection closed while requests pending (local-only, not wire). | REQUEST_TIMEOUT: request timed out (raised to caller, not wire). |
| C# | Unnamed; hardcoded default of WriteJsonRpcErrorAsync → generic server-error for ~15 HTTP/transport validation failures. | McpErrorCode.HeaderMismatch; also reused as raw literal for "Session not found" (404). |
| Kotlin | CONNECTION_CLOSED ("Connection was closed"): catch-all for closed transports. | REQUEST_TIMEOUT ("Request timed out"); + one misuse returning it for "Session not found". |
| Swift | .connectionClosed: transport connection closed. | .transportError: wraps underlying transport (stdio/network) error. |
| PHP | SERVER_ERROR (generic): defined but dead code, never thrown. | Absent. |
| Go | Absent. | Collision: CodeHeaderMismatch (emitted, SEP-2243 header validation) + internal ErrUnknown (defined, never emitted). |
| Java | Absent. | Absent. |
| Rust | Absent (one JS test fixture only). | Absent. |
| Ruby | Absent. | Absent. |
I haven't double-checked these, so they could be wrong, but this doesn't seem to match what is being proposed. It would be good for this SEP to do the research across SDKs and check that we're converging rather than diverging.
> Rate limit responses (`-32000`) SHOULD include only the information needed
> for the client to back off (e.g. `retryAfter`) and SHOULD NOT expose
> per-user quota state that could aid enumeration attacks.
Where is retryAfter specified? I don't see it anywhere in this SEP or the existing protocol.
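To make the gap concrete, here is a sketch of what a client would have to guess at today. JSON-RPC 2.0 does give the error object an optional `data` member, but everything else below is an assumption: that `retryAfter` would live inside `data`, that it would be a number, and that its unit would be seconds (by analogy with the HTTP `Retry-After` header). The `readRetryAfterMs` helper is hypothetical.

```typescript
// Shape of a JSON-RPC 2.0 error object: code, message, and optional data.
interface JsonRpcError {
  code: number;
  message: string;
  data?: unknown;
}

// Hypothetical helper: reads a `retryAfter` hint from `error.data` if present.
// Assumes the value is a non-negative number of seconds; none of this is specified.
function readRetryAfterMs(error: JsonRpcError, fallbackMs: number): number {
  const data = error.data;
  if (typeof data === "object" && data !== null && "retryAfter" in data) {
    const value = (data as { retryAfter: unknown }).retryAfter;
    if (typeof value === "number" && Number.isFinite(value) && value >= 0) {
      return value * 1000; // assumed unit: seconds
    }
  }
  // No usable hint: fall back to the caller's default delay.
  return fallbackMs;
}
```

Each of those assumptions is something the SEP would need to state explicitly for clients to interoperate.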
> - **Type**: Standards Track
> - **Created**: 2026-05-04
> - **Author(s)**: matthew.khouzam@ericsson.com (@matthewkhouzam)
> - **Sponsor**: Ericsson (Ericsson)
nit: The sponsor should be a maintainer, not a corporation.
Introduce -32000/-32001/-32002 to match the implementation in FastMCP.
This change paves the way towards "interceptors" or "middleware" standardization.
Motivation and Context
The end goal is to better secure MCP against several attacks. FastMCP's middleware proxy design looks like a great way to add local verification. This change would bring the error codes from that design pattern into MCP proper. It should introduce no breaking changes, as the error codes are already within the range -32000 to -32099. It would simply make the error codes transparent.
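As a minimal sketch of the range claim, the three proposed constants can be checked against the JSON-RPC 2.0 band reserved for implementation-defined server errors (-32000 to -32099). The constant names and values are taken from the diff under review; the `isImplementationDefinedCode` and `makeErrorResponse` helpers are hypothetical and exist only for illustration.

```typescript
// Constants as they appear in the diff under review.
const SERVER_ERROR = -32000;
const NOT_FOUND = -32001;
const RESOURCE_NOT_FOUND = -32002;

// JSON-RPC 2.0 reserves -32000 to -32099 for implementation-defined server errors.
function isImplementationDefinedCode(code: number): boolean {
  return Number.isInteger(code) && code >= -32099 && code <= -32000;
}

// Hypothetical helper that builds a JSON-RPC 2.0 error response envelope.
function makeErrorResponse(id: number | string | null, code: number, message: string) {
  return { jsonrpc: "2.0" as const, id, error: { code, message } };
}

// All three proposed codes fall inside the reserved band.
const allInRange = [SERVER_ERROR, NOT_FOUND, RESOURCE_NOT_FOUND].every(
  isImplementationDefinedCode,
);
```

Being inside the reserved band only rules out collisions with the predefined JSON-RPC codes; it says nothing about whether existing SDKs already use the same values for other purposes.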
How Has This Been Tested?
This is a "documentation" and protocol change adding error messages to match those of FastMCP's implementation.
Additional context
This code was generated with the assistance of Claude Opus 4.7 (for the SEP, the assistance was largely linguistic). It was verified by a person.