SEP-2792: Internationalization via Per-Request Language Negotiation #2792
Hi @kurtisvg, @pja-ant, tagging you both directly given your feedback on #2355, which is the seed for this SEP (indeed I believe we discussed at MCP Dev Summit that I would look at converting this to a SEP in order to include the metadata for all transports).
Scope is intentionally narrow: user-facing strings (titles, descriptions, UI-bound errors, user-visible notifications) are the primary target; servers MAY also translate body content. Reference implementation is up as a draft on the TypeScript SDK: modelcontextprotocol/typescript-sdk#2158. It includes the core
Would either of you be willing to sponsor?
Adds a transport-agnostic, fully opt-in i18n mechanism for MCP using _meta['io.modelcontextprotocol/acceptLanguage'] on requests and _meta['io.modelcontextprotocol/contentLanguage'] on responses, mirrored into the standard HTTP Accept-Language / Content-Language headers on the Streamable HTTP transport with a strict-mismatch rule consistent with SEP-2243. Per-request scope (no handshake-bound state) aligns with SEP-2575 and supports mid-conversation language switching. Reuses BCP 47, RFC 4647 language-range matching, and existing ecosystem libraries verbatim, no bespoke matcher or schema. Supersedes modelcontextprotocol#2355. Proposes subsuming the locale aspect of SEP-1809. Reference implementation: modelcontextprotocol/typescript-sdk#2158 (en/fr/de server + client, stdio and Streamable HTTP, unit and integration tests including mid-session language switch on stdio). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
pja-ant
left a comment
Clean, well-scoped, and I think the shape is right: per-request language in _meta, mirrored to the HTTP headers, no session state. Lines up with SEP-2243 and SEP-2575, reuses BCP 47 / Accept-Language rather than inventing anything. No objections to the design — and the scary-sounding mismatch-rule concerns don't hold up, since the 400 only fires when both the header and body _meta are present and disagree (and _meta isn't intermediary-rewritable).
Two things to fix first:
- Drop the Batching section. We removed JSON-RPC batching (one message per POST, no batch type in the schema), so the "union of language ranges" MUST describes a wire shape the spec forbids. (And a union header would mismatch every per-message `_meta`, tripping the very 400 it's trying to avoid.)
- `contentLanguage` on errors has nowhere to live. `Error` is `{ code, message, data? }`, with no `_meta`, and your own illustrative schema only adds `_meta` to `Result`. Pick a home (add `_meta` to `Error`, or use `error.data`) and spell it out, since localized errors are a primary use case.
Minor: name the mismatch code explicitly (-32001 HeaderMismatch, now in the draft spec — the TS impl drifted to SendFailed); and "reused unchanged" undersells it — you're extending the 2243 rule to a standard header, worth a line in Security Implications.
Structure's all there, file/docs are right. Just needs a sponsor and (for Final) a conformance scenario per SEP-2484.
- Drop Batching section (JSON-RPC batching is removed from MCP)
- Define error-response localization via error.data._meta
- Name mismatch error explicitly as -32001 HeaderMismatch
- Reframe SEP-2243 relationship as extending its rule and error code
- Add SEP-2484 conformance scenario outline

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
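For illustration, the `error.data._meta` placement named in this commit might look like the following on the wire. This is a minimal sketch, not taken from the SEP text or the SDK: the helper name `localized_error` and the sample French message are hypothetical, and `-32602` is simply the standard JSON-RPC "Invalid params" code used as a stand-in.

```python
# Assumption: the content-language key is the one this PR proposes for responses.
CONTENT_KEY = "io.modelcontextprotocol/contentLanguage"


def localized_error(request_id, code, message, language):
    """Build a JSON-RPC error response whose language tag lives in error.data._meta.

    Hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": code,
            "message": message,  # already translated by the server
            "data": {"_meta": {CONTENT_KEY: language}},
        },
    }


# Sample values are illustrative, not from the reference implementation.
response = localized_error(7, -32602, "Paramètre inconnu : ville", "fr")
print(response["error"]["data"]["_meta"][CONTENT_KEY])
```

Keeping the tag under `error.data` leaves the `{ code, message, data? }` shape of `Error` unchanged, which is the constraint the review raised.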
Feedback applied @pja-ant - thank you for the thorough review. Changes in b5667477:
The TS SDK reference implementation (modelcontextprotocol/typescript-sdk#2158) is being updated in lockstep: switching the placeholder
- Servers that process the request body **MUST** reject requests where the
  header and the `_meta` field are both present and disagree, using the
  same JSON-RPC error code as the mismatch rule in [SEP-2243]
  (`-32001 HeaderMismatch`), and on HTTP, status `400 Bad Request`.
SEP-2243's hard-fail contract is safe for Mcp-Method / Mcp-Name because nothing else on the path sets or touches those headers, so byte-equality is a sound invariant. Accept-Language differs on two counts:
1. The comparison itself is undefined. RFC 9110 treats several serializations of the same value as equivalent: optional whitespace after commas (§5.6.1.1), case-insensitive language tags (RFC 5646 §2.1.1) and `q` parameter / trailing-zero weights (§12.4.2), and list fields legally split across field lines and recombined with ", " (§5.2–5.3). The SEP never says whether "disagree" means byte inequality or semantic inequality after parsing, so two conformant implementations can reach different verdicts on the same request, and conformance scenario 4 can't be written without picking one.
2. Edge infrastructure doesn't reliably deliver the header verbatim. CloudFront removes `Accept-Language` by default unless it's explicitly forwarded, and the documented way to get the per-language caching benefit this SEP cites (Fastly's `accept.language_lookup()` or Varnish's vmod_accept) overwrites the origin-bound header with a single normalized tag. Behind that vendor-recommended configuration, every request carrying the `_meta` field hard-fails with `-32001` / 400.
Suggestion: make _meta authoritative, demote the headers to a best-effort mirror, and drop the hard reject for these two headers.
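To make the first point concrete, here is a minimal sketch of two values that differ in bytes yet carry the same preferences. The parser is a deliberately simplified, hypothetical one (it assumes well-formed input and is not the SEP's or any SDK's matcher), and the sample values are made up.

```python
def parse_accept_language(value: str) -> list:
    """Parse an Accept-Language value into (lowercased tag, q) pairs.

    Simplified for illustration: assumes well-formed input and treats a
    missing q parameter as 1.0.
    """
    ranges = []
    for item in value.split(","):
        item = item.strip()  # optional whitespace around list items
        if not item:
            continue
        tag, *params = [part.strip() for part in item.split(";")]
        q = 1.0
        for param in params:
            name, _, raw = param.partition("=")
            if name.strip().lower() == "q":
                q = float(raw)  # "0.8" and "0.80" become the same weight
        ranges.append((tag.lower(), q))  # language tags are case-insensitive
    return ranges


def byte_equal(a: str, b: str) -> bool:
    return a == b


def semantically_equal(a: str, b: str) -> bool:
    return parse_accept_language(a) == parse_accept_language(b)


meta_value = "en-US,fr;q=0.8"      # what the client put in _meta
header_value = "en-us, fr;q=0.80"  # the same preference, re-serialized on the path

print(byte_equal(meta_value, header_value))          # False
print(semantically_equal(meta_value, header_value))  # True
```

Under a byte comparison this request is rejected; under a parsed comparison it passes, which is why the rule has to name one.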
I was only trying to avoid needless divergence from precedent, but there is clearly justification for diverging in this case. I am convinced.
Header stripping is a very obvious one: as there's no guarantee of the header being consumed by servers (nor should there be), there's no reason to require that the headers be preserved at all. Never mind the definition of equality.
Sorry, I want to push back here.
The whole point of the headers matching is so that the routing can be done correctly without parsing the body. If headers are "best-effort" then the routing might be done incorrectly from the payload, which leaves the door open for undefined behavior (and potential vulnerabilities as a result). It's also somewhat bad precedent to have some headers that are always matched and some that aren't.
I think this should say IF the request has the header, it MUST match the body or the request should be rejected. I don't think it's a very big deal whether it's semantic or byte matching, so we should just pick one (although byte matching seems easier to get right).
I agree in theory, but it's a bit of a problem if HTTP edge infra modifies it and makes every request fail validation. What do you do about that? (point 2 in the post above). It's not just semantic matching. Do we just tell people that they need to configure their middleware to not modify the headers at all?
@kurtisvg can we tolerate complete omission? I think it's reasonable that a server that doesn't use them doesn't have to configure some reverse proxy to forward them at all if they are default stripped.
WDYT?
I think what I suggested already means complete omission is OK, right?
> IF the request has the header, it MUST match the body or the request should be rejected.
I presume that if you care about the header for routing, you would make sure your infra supports it.
But I do feel quite strongly that clients MUST send it: if it's only a SHOULD, some clients will ignore it.
Hi @pja-ant any further pushback before I churn the PR again?
Options I see are:
1. Do as @kurtisvg suggests: clients are required to send the matching HTTP header (although, asymmetrically, servers do not require it, so that header stripping is tolerated), but if it is present it must byte-match the metadata value.
2. Leave it as best effort.
3. Wait for additional input, or specifically ask another member of the WG to chime in.
I'm leaning towards 1 in my own mind.
(1) is fine, but we still hit issues with the normalization e.g. done by vmod or fastly as noted above...
It's probably ok. People will just need to figure out that their header normalization is incompatible with MCP.
Accept-Language/Content-Language are routinely stripped, normalized, or rewritten by intermediaries (CloudFront default behavior, Fastly accept.language_lookup, Varnish vmod_accept). RFC 9110 also lacks a canonical byte-equality form for Accept-Language. Applying SEP-2243's hard-fail rule to these headers would error out exactly the edge-i18n deployments we cite as motivating.

- _meta is now canonical; headers are best-effort hints (SHOULD mirror)
- Remove -32001 HeaderMismatch reject path entirely
- Reframe Why-mirror rationale around expected intermediary rewriting
- Replace mismatch-attack security note with header-tampering-expected
- Replace conformance scenario 4 (reject mismatch) with one asserting that a stripped/rewritten Accept-Language MUST NOT cause rejection
- Update reference-implementation summary to drop mismatch claims

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Feedback applied @pja-ant - and you're right, this is much cleaner.
This also sidesteps the -32001 vs SDK conventions discussion entirely, since we no longer need a code at all.
The PoC implementation notes (build status, deferred mismatch rule rationale, SSE vs JSON header behavior) belong in the PR description, not in the SEP itself. Keep the SEP focused on the protocol design. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- **Edge i18n services** that route requests to language-specific backends.
- **Observability tools** that segment usage by locale.

This SEP therefore says clients SHOULD mirror `_meta[acceptLanguage]` into
I think this should be clients MUST add the header, while servers MAY allow the request if the header is missing but MUST reject the request if the header is present and not matching.
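A minimal sketch of the server side of that rule, assuming byte-equality as the comparison and the provisional `-32005` code this PR later adopts. The function name, the error message text, and the choice to accept a request that has the header but no `_meta` field are all assumptions for illustration, not the SEP's normative text.

```python
from typing import Optional

# Provisional code named later in this PR; it may change with the reservation work.
HEADER_MISMATCH = -32005
ACCEPT_LANGUAGE_KEY = "io.modelcontextprotocol/acceptLanguage"


def check_accept_language(meta: dict, headers: dict) -> Optional[dict]:
    """Return None to accept, or a JSON-RPC error object to reject (HTTP 400).

    Hypothetical helper: tolerates an absent header, rejects a byte mismatch.
    """
    meta_value = meta.get(ACCEPT_LANGUAGE_KEY)
    # HTTP field names are case-insensitive, so look the header up that way.
    header_value = next(
        (v for k, v in headers.items() if k.lower() == "accept-language"), None
    )
    if header_value is None:
        return None  # stripped by a CDN or never sent: fall back to _meta
    if meta_value is None:
        return None  # assumption: a bare header with no _meta is not a mismatch
    if header_value == meta_value:
        return None  # byte-equal: routing and payload agree
    return {
        "code": HEADER_MISMATCH,
        "message": "HeaderMismatch: Accept-Language does not match _meta",
    }


meta = {ACCEPT_LANGUAGE_KEY: "en-US,fr;q=0.8"}
print(check_accept_language(meta, {}))                                     # header stripped
print(check_accept_language(meta, {"Accept-Language": "en-US,fr;q=0.8"}))  # verbatim
print(check_accept_language(meta, {"accept-language": "en"}))              # normalized by an edge
```

The third call models the normalization case raised earlier in the thread: an edge that rewrites the header to a single tag produces a rejection, whereas an edge that strips it does not.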
Per WG feedback (Kurtis), revert the bare-header-tolerated design and restore strict byte-equality between _meta and the corresponding HTTP header on both request (Accept-Language) and response (Content-Language) sides, with header absence on requests tolerated for CDN compatibility. Use error code -32005 for HeaderMismatch instead of -32001, which is already in conflicting use across SDKs (REQUEST_TIMEOUT in Python and Kotlin vs HeaderMismatch in Go and C#); the code is provisional pending SEP-2243 / SEP-2678 / PR modelcontextprotocol#2642 schema-level reservation work. Add a Normalization footgun section covering Fastly, Varnish, CloudFront and reverse-proxy header rewriting; consolidate operator-config detail in that section so Security and Backward Compatibility just link to it. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Split the bundled 'header absent / bare header' bullet into two distinct rules so each cell of the (_meta-present, header-present) matrix has its own normative line.
- Trim the Streamable HTTP intro to one opinionated paragraph.
- Drop the trailing 'no new attack surface' platitude from Security.
- Promote 'caches must Vary' from a reminder to a MUST.
- Replace 'Tentative answer' hedging in Open Questions with explicit proposed resolutions.
- Reference Implementation point 5 reads as a factual list rather than a results pitch.
- Add [RFC 9111] reference link.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The MUST NOT rule lives in the Specification section that immediately follows; the cross-reference adds no information. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The bullet list and coordination note restated the preceding paragraph. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The detailed bullet list duplicated the linked PR's description and will rot. Keep the SEP terse; the PR is the source of truth. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Vary requirement now lives normatively in the Response section. Security bullet shortened to a pointer. The Open Questions entry is resolved (MUST) and dropped. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop the bold leading-phrases on 6 and 7 so all seven scenarios use the same plain-bullet style. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
A transport-agnostic i18n mechanism for MCP:
- `_meta['io.modelcontextprotocol/acceptLanguage']` on every request; the value matches the HTTP `Accept-Language` field syntax verbatim (BCP 47 ranges + quality values).
- `_meta['io.modelcontextprotocol/contentLanguage']` on every response; the server echoes the language actually used.
- `Accept-Language` / `Content-Language` headers under the same payload/header agreement rule SEP-2243 established for `Mcp-Method` / `Mcp-Name`. Comparison is byte-equality; the server tolerates header absence (CDN strip) but rejects byte-mismatch.

Motivation
Converts the docs-only proposal in #2355 into a cross-transport SEP, addressing reviewer feedback there:
- `_meta` first, with HTTP headers as a mirror.

Motivating precedents
- `Accept-Language` / `Content-Language`.
- `initialize` as a place for persistent negotiated state; per-request language preference is the natural fit.
- `request.params._meta`) establishes `_meta` as the carrier for per-request metadata.
- `io.modelcontextprotocol/` vendor prefix.

Design choices worth flagging
- `Accept-Language` value (whitespace, case, `q`-value normalization, list-field splitting). Requiring semantic equality would force every SDK to ship the same parser and is itself a conformance hazard. Byte-equality is unambiguous and a one-line check. Trade-off: operators using header-rewriting CDN features (Fastly `accept.language_lookup`, Varnish `vmod_accept`) need a one-time configuration change; a dedicated Normalization footgun section spells this out.
- `Accept-Language`. Rejecting on absence would lock out every operator behind such a CDN where a client sends the header, even if they have no intention of reading it. Tolerating absence preserves the routing guarantee for callers that do supply the header and falls back to `_meta` cleanly otherwise.
- `-32005` for `HeaderMismatch` rather than `-32001`. SDK survey on the WG channel showed `-32001` is already in conflicting use across SDKs (REQUEST_TIMEOUT in Python and Kotlin, HeaderMismatch in Go and C#). The exact code is provisional pending SEP-2243, SEP-2678, and #2642 (schema-level reservation work); this SEP will adopt whatever code that work assigns.

Relationship to SEP-1809
SEP-1809 proposes a `clientContext` object on `tools/call` that includes `locale`. This SEP proposes to subsume the language aspect of SEP-1809 in favor of the cross-cutting `acceptLanguage` defined here, leaving `timezone`, `currentTimestamp`, and `userLocation` to SEP-1809.

Reference implementation
modelcontextprotocol/typescript-sdk#2158 (draft):
- `_meta` fields end-to-end.
- `get_greeting` tool) localized into en / fr / de, runnable in either mode.
- `tools/list` calls on the same connection with different `acceptLanguage` values returning differently-localized `title`s, the runnable proof of mid-session language switching.

Supersedes
This SEP supersedes (and proposes closing) #2355.
Status
Checklist
seps/seps/2792-i18n-language-negotiation.md) and docs regenerated

AI Disclosure
This PR was authored with assistance from GitHub Copilot CLI.
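As an illustration of the mechanism in the Summary, the following sketch shows two `tools/list` requests on one connection carrying different `acceptLanguage` values, with the server echoing the chosen language in `contentLanguage`. It is not the reference implementation: `pick_language` is a simplified stand-in for RFC 4647 lookup (it truncates subtags at hyphens and skips `*`), the translated titles are made up, and the tool entry is trimmed to the two fields that matter here.

```python
ACCEPT_KEY = "io.modelcontextprotocol/acceptLanguage"
CONTENT_KEY = "io.modelcontextprotocol/contentLanguage"

# Hypothetical translations of one tool title; keys are lowercase language tags.
TITLES = {"en": "Get greeting", "fr": "Obtenir une salutation", "de": "Begrüßung abrufen"}
DEFAULT_LANGUAGE = "en"


def pick_language(accept_language: str, supported, default: str = DEFAULT_LANGUAGE) -> str:
    """Simplified lookup: highest q first, then progressively shorter tags."""
    ranges = []
    for index, item in enumerate(accept_language.split(",")):
        tag, _, params = item.strip().partition(";")
        q = 1.0
        name, _, raw = params.strip().partition("=")
        if name.strip().lower() == "q":
            q = float(raw)
        if tag and q > 0:
            ranges.append((-q, index, tag.strip().lower()))
    for _, _, tag in sorted(ranges):  # q descending, original order on ties
        while tag and tag != "*":
            if tag in supported:
                return tag
            tag = tag.rpartition("-")[0]  # "fr-ca" -> "fr" -> ""
    return default


def handle_tools_list(request: dict) -> dict:
    """Answer tools/list in the language negotiated for this request only."""
    meta = request.get("params", {}).get("_meta", {})
    language = pick_language(meta.get(ACCEPT_KEY, ""), TITLES)
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {
            "tools": [{"name": "get_greeting", "title": TITLES[language]}],
            "_meta": {CONTENT_KEY: language},
        },
    }


def tools_list_request(request_id: int, accept_language: str) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
        "params": {"_meta": {ACCEPT_KEY: accept_language}},
    }


first = handle_tools_list(tools_list_request(1, "fr-CA,fr;q=0.9,en;q=0.5"))
second = handle_tools_list(tools_list_request(2, "de"))
print(first["result"]["_meta"][CONTENT_KEY], first["result"]["tools"][0]["title"])
print(second["result"]["_meta"][CONTENT_KEY], second["result"]["tools"][0]["title"])
```

Because the preference travels with each request, the second call switches language without any handshake or session state.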