SEP-2356: File input support for tools and elicitation #2356
Proposes Tool.inputFiles and ElicitRequestFormParams.requestedFiles to let servers declaratively mark which arguments expect user-selected files. Clients declaring the fileInputs capability render native file pickers and encode selections as data URIs (with a standardized name= media-type parameter for filenames).
Key design points:
- inputFiles is a sibling of inputSchema (not an annotation hint)
- Per-argument accept[] MIME filters and maxSize limits
- Supports single-file and array-of-file arguments
- Adds StringArraySchema to PrimitiveSchemaDefinition for elicitation
- Capability gates advertising, not acceptance: servers always accept well-formed data URIs regardless of negotiation
- Error convention: -32602 with data.reason for size/type violations
- Cites OpenAI Apps SDK openai/fileParams as prior art
https://claude.ai/code/session_01UE8PfZW3WmKXvoqtamBbtp
- elicitation.form.fileInputs nests under existing client cap
- Tool-side stays top-level (no ClientCapabilities.tools exists)
- Independent gating: clients can support one surface without the other
- Add Open Questions section debating alt placements: new ClientCapabilities.tools namespace vs single unified flag
https://claude.ai/code/session_01UE8PfZW3WmKXvoqtamBbtp
One top-level ClientCapabilities.fileInputs flag instead of the split approach.
Rationale captured:
- Underlying capability (file picker + data URI encoding) is singular
- Elicitation is already gated by the elicitation capability itself
- Simpler server check
- No ClientCapabilities.tools exists to nest under anyway
Removes Open Questions section; the placement debate is resolved.
https://claude.ai/code/session_01UE8PfZW3WmKXvoqtamBbtp
… to schema
Implements the file upload SEP with declarative file input metadata:
- FileInputDescriptor: { accept?: string[], maxSize?: number } - advisory MIME filter + byte limit
- Tool.inputFiles: maps argument names to FileInputDescriptor for native file pickers
- ElicitRequestFormParams.requestedFiles: symmetric support for elicitation forms
- StringArraySchema: new PrimitiveSchemaDefinition member for multi-file inputs
- ClientCapabilities.fileInputs: capability gate (server MUST NOT send inputFiles without it)
Files are transmitted as RFC 2397 data URIs: data:<mediatype>;name=<filename>;base64,<data>
https://claude.ai/code/session_01JxhHWiXrXgE4JWC27dznRN
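As a rough illustration of the wire format in this commit message, a server could take a received argument apart along these lines. This is only a sketch: `ParsedFileUri` and `parseFileDataUri` are hypothetical names, not part of the schema, and the code assumes the `name=` value is percent-encoded so that it contains no raw `;` or `,`.

```typescript
// Hypothetical result shape; not part of the MCP schema.
interface ParsedFileUri {
  mediaType: string;        // lowercased type/subtype, e.g. "image/png"
  name: string | undefined; // decoded filename when a name= parameter is present
  base64: string;           // payload, left base64-encoded
}

// Splits "data:<mediatype>;name=<filename>;base64,<data>" into its parts.
function parseFileDataUri(uri: string): ParsedFileUri {
  const comma = uri.indexOf(",");
  if (!uri.startsWith("data:") || comma === -1) {
    throw new Error("not a data URI");
  }
  // Everything between "data:" and the first comma is the header.
  const [mediaType = "", ...params] = uri.slice("data:".length, comma).split(";");
  // The ";base64" marker must be the last header segment.
  if (params.pop() !== "base64") {
    throw new Error("expected ;base64 immediately before the comma");
  }
  const nameParam = params.find((p) => p.toLowerCase().startsWith("name="));
  return {
    mediaType: mediaType.toLowerCase(),
    name:
      nameParam === undefined
        ? undefined
        : decodeURIComponent(nameParam.slice("name=".length)),
    base64: uri.slice(comma + 1),
  };
}
```

For example, `parseFileDataUri("data:text/plain;name=hello.txt;base64,aGVsbG8=")` yields the media type `text/plain`, the name `hello.txt`, and the payload `aGVsbG8=`.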
Introduces an Overview section in the SEP that walks reviewers through the complete round trip on both surfaces before the formal spec:
- Tool: `describe_image` definition with `inputFiles`, paired with the matching `tools/call` request carrying a data-URI argument.
- Elicitation: `elicitation/create` request with `requestedFiles`, paired with the matching `ElicitResult` response carrying the data-URI content.
All examples use the same real (non-truncated) 1x1 PNG so the wire encoding is concrete and copy-pasteable. The same examples are added as validated JSON files under schema/draft/examples/ and wired into schema.ts via @includecode so they appear in the generated reference docs.
https://claude.ai/code/session_0168Rxur9BGcHnAzo3zpZkEH
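The tool-side round trip described in this commit might look roughly like the following. This is not the example committed under schema/draft/examples/: the `image` argument name, the 1 MiB limit, the `pixel.png` filename, and the `ToolSketch` type are assumptions made for illustration, and the payload is left as a `<data>` placeholder.

```typescript
// Field names follow this PR; everything else is illustrative.
interface FileInputDescriptor {
  accept?: string[]; // advisory MIME filter
  maxSize?: number;  // advisory byte limit
}

// Hypothetical, trimmed-down view of a Tool definition.
interface ToolSketch {
  name: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; format?: string }>;
    required?: string[];
  };
  inputFiles?: Record<string, FileInputDescriptor>;
}

const describeImage: ToolSketch = {
  name: "describe_image",
  inputSchema: {
    type: "object",
    properties: { image: { type: "string", format: "uri" } },
    required: ["image"],
  },
  // Sibling of inputSchema: marks the "image" argument as a file input.
  inputFiles: { image: { accept: ["image/png"], maxSize: 1048576 } },
};

// Matching call: the selected file travels as a data URI string.
const callRequest = {
  method: "tools/call",
  params: {
    name: describeImage.name,
    arguments: { image: "data:image/png;name=pixel.png;base64,<data>" },
  },
};

// Lists inputFiles keys that do not point at a string/uri property.
function undeclaredFileArguments(tool: ToolSketch): string[] {
  return Object.keys(tool.inputFiles ?? {}).filter((arg) => {
    const property = tool.inputSchema.properties[arg];
    return (
      property === undefined ||
      property.type !== "string" ||
      property.format !== "uri"
    );
  });
}
```

The small `undeclaredFileArguments` check is one way a server author could catch an `inputFiles` entry that names an argument missing from `inputSchema`.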
Rename placeholder SEP file to 2356 (the PR number), fill in the header title and PR link, and regenerate the SEP docs so the new page appears in the community SEP index and navigation. https://claude.ai/code/session_0168Rxur9BGcHnAzo3zpZkEH
Replace 'future SEP' framing in three places with direct guidance: inputFiles covers the inline case, URL-mode elicitation covers files too large to embed. No new transport machinery needed.
:house: Remote-Dev: homespace
… handling by surface
- Gate StringArraySchema behind fileInputs capability (SEP prose + schema.ts docstring) so existing form-mode clients aren't broken by an unrecognized PrimitiveSchemaDefinition member
- Clarify maxSize applies per-file for array-typed arguments
- Reword schema-shape constraints to permit extra properties; add SHOULD-ignore rule for malformed inputFiles/requestedFiles entries
- Require format:"uri" on StringArraySchema.items for file fields
- Add percent-encoding test vector; require name= before ;base64 per RFC 2397
- Define accept matching: type/subtype only, case-insensitive, params stripped
- Scope file_uri_malformed to broken data: URIs; non-data URIs are server-defined
- Split error handling into tool-call (-32602) vs elicitation-result (re-elicit or fail enclosing op) subsections
:house: Remote-Dev: homespace
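The accept-matching and per-file maxSize rules listed in this commit could be checked with something like the sketch below. Several things here are assumptions rather than SEP text: the helper names, support for `type/*` and `*/*` wildcards, measuring maxSize against the decoded byte count, and a padded base64 payload without whitespace.

```typescript
// Reduces "Image/PNG; charset=binary" to "image/png":
// parameters stripped, case folded.
function normalizeMediaType(value: string): string {
  const [essence = ""] = value.split(";");
  return essence.trim().toLowerCase();
}

// True when mediaType satisfies at least one accept entry.
function matchesAccept(mediaType: string, accept?: string[]): boolean {
  if (accept === undefined || accept.length === 0) return true; // no filter declared
  const actual = normalizeMediaType(mediaType);
  return accept.some((pattern) => {
    const wanted = normalizeMediaType(pattern);
    if (wanted === "*/*") return true; // assumption: full wildcard allowed
    if (wanted.endsWith("/*")) {
      // Assumption: "image/*" matches any subtype of image.
      return actual.startsWith(wanted.slice(0, -1));
    }
    return actual === wanted;
  });
}

// Decoded size in bytes of a padded base64 string.
function decodedSize(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

// maxSize applies to each file separately, so array-typed
// arguments are checked element by element.
function allWithinMaxSize(payloads: string[], maxSize?: number): boolean {
  if (maxSize === undefined) return true;
  return payloads.every((payload) => decodedSize(payload) <= maxSize);
}
```

Because the limit is applied per file, three 4-byte files pass a `maxSize` of 5 even though together they total 12 bytes.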
…p-sdks-8Rub5
# Conflicts:
# docs/community/seps/index.mdx
# docs/docs.json
# docs/seps/2356-declarative-file-inputs-for-tools-and-elicitation.mdx
The committed mdx was out of sync with its source .md (admonition line break). Main never tripped this because the workflow only runs when seps/**/*.md changes; adding 2356.md surfaces the drift.
Hey, great feature. Any timeline on this?
| The capability object is currently empty but is defined as an object to allow future extension (for example, a client-side global size ceiling). | ||
| This single capability governs both surfaces. If the client does not declare `fileInputs`, the server **MUST NOT** include `inputFiles` on any `Tool` it lists, **MUST NOT** include `requestedFiles` on any form-mode elicitation request, and **MUST NOT** use `StringArraySchema` in any elicitation `requestedSchema`. This prevents advertising affordances the client cannot honor. (The `StringArraySchema` gating is conservative: it is technically renderable without file-picker support, but since this SEP introduces it and existing form-mode clients do not recognize it, tying it to `fileInputs` avoids breaking them. A future SEP may unbundle it.) |
Do we need this capability? What's the harm in the server sending inputFiles or requestedFiles if the client doesn't support it? Can't the client just ignore those fields?
In the rationale below:
> the server author added the annotation expecting a file picker to appear, and silently nothing happens
The server author shouldn't be expecting anything to happen. They declare what they need. They can already declare that they want a URI. It's up to clients to make good UX.
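For context on what the MUST NOT rule in the quoted text asks of servers, the gating could be as small as the sketch below. `ClientCaps`, `ListedTool`, and `toolsForClient` are hypothetical names. The alternative raised in this thread (always send `inputFiles` and let clients ignore it) would amount to dropping this function.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface ClientCaps {
  fileInputs?: object; // currently an empty object when declared
}

interface ListedTool {
  name: string;
  inputFiles?: Record<string, { accept?: string[]; maxSize?: number }>;
}

// Returns the tool list a server would advertise to this client:
// inputFiles is stripped unless the client declared fileInputs.
function toolsForClient(tools: ListedTool[], caps: ClientCaps): ListedTool[] {
  if (caps.fileInputs !== undefined) return tools;
  return tools.map((tool) => {
    const copy: ListedTool = { ...tool };
    delete copy.inputFiles; // the argument stays in inputSchema as a plain URI string
    return copy;
  });
}
```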
| "photo": { | ||
| "type": "string", | ||
| "format": "uri", | ||
| "title": "Profile photo" |
Why not put accept+maxSize here? JSON schema allows for extensions: https://json-schema.org/draft/2020-12/json-schema-core#section-6.5
pja-ant left a comment
Love the feature, addressing a real gap, but have a bunch of questions/concerns around how we're specifying this.
| "photo": { | ||
| "type": "string", | ||
| "format": "uri", | ||
| "title": "Profile photo" |
| } | ||
| ``` | ||
| `StringArraySchema` is intentionally narrow — arrays of strings only — to keep elicitation-form rendering tractable. Although it is a general-purpose shape, servers **MUST NOT** send it unless the client declared `fileInputs`, since existing form-mode clients do not recognize it (see [Client Capability](#client-capability)). |
This feels super-janky to me and incredibly counterintuitive. It won't be long before someone needs an array of numbers or something else and then we'll be stuck with a horrible design where we have to do separate types for every possible combination.
- IMO we should just introduce an array of T.
- Do not tie it to `fileInputs`; it can just be a new thing in the new spec that versions >=2026-06-xx have to support.
| ### Wire Encoding | ||
| When a client populates a file-input argument from a user-selected file, it **MUST** encode the file as an RFC 2397 data URI using base64 encoding: |
Why MUST? Shouldn't we allow regular URIs? If my file is already uploaded somewhere or the file exists on my disk (for a stdio server), surely it's better to just send https://... or file:///? Also make it backwards compatible (since this was previously allowed behaviour).
IMO, it would be preferable for clients to not use a data URI and instead upload somewhere public and allow the server to download, since that avoids the 33% base64 tax. In the future we may want to introduce a resource/write for sending files that has transport-dependent optimized paths that avoid the base64 round trip.
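The 33% figure follows from base64 mapping every 3 input bytes to 4 output characters, with the final group padded. A quick sketch of the arithmetic (the function names are illustrative only):

```typescript
// Length in characters of the padded base64 encoding of byteCount bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Relative size increase from base64-encoding byteCount bytes (byteCount > 0).
function base64Overhead(byteCount: number): number {
  return base64Length(byteCount) / byteCount - 1;
}
```

A 1,000,000-byte file becomes 1,333,336 characters of payload, before counting the `data:<mediatype>;...;base64,` prefix and any JSON string escaping.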
| - `<filename>` is the file's basename, percent-encoded per RFC 3986. Characters that would otherwise be parsed as data-URI delimiters — notably `;` and `,` — **MUST** be encoded; for example, the filename `Report (v2); final.pdf` becomes `Report%20%28v2%29%3B%20final.pdf`. The `name` parameter is **OPTIONAL**; clients **SHOULD** include it when the original filename is known. | ||
| - `<data>` is the base64-encoded file content. | ||
| The `name` parameter is a media-type parameter permitted by RFC 2397's grammar but not defined by it. This SEP standardizes its use within MCP for carrying the original filename. Per that grammar, `name=` is part of the `<mediatype>` segment and **MUST** appear before the `;base64` marker; servers **MAY** reject data URIs that place it elsewhere. Servers **SHOULD** treat `name` as advisory metadata (useful for display, extension-based heuristics, or round-tripping) and **MUST NOT** rely on it for security decisions. |
I feel like adding this name= stuff is introducing unnecessary complexity. What's the real use case? If the server needs a name, can't servers just add an additional name string parameter if they need it? The client may not want to use the original file's name. This would avoid the awkward data URI encoding relying on non-standard use of RFC 2397.
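To show what the `name=` rule under discussion costs a client in practice, here is a minimal encoding sketch. `encodeFileName` and `buildFileDataUri` are hypothetical helpers, and the payload is assumed to be base64-encoded already. Note that `encodeURIComponent` leaves `!`, `'`, `(`, `)` and `*` unescaped, so they are handled separately to reproduce the test vector in the quoted text.

```typescript
// Percent-encodes a filename so it is safe inside a data-URI parameter.
function encodeFileName(name: string): string {
  return encodeURIComponent(name).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase(),
  );
}

// Builds "data:<mediatype>;name=<filename>;base64,<data>".
// The name parameter is optional and always precedes ";base64".
function buildFileDataUri(
  mediaType: string,
  base64: string,
  name?: string,
): string {
  const nameParam = name === undefined ? "" : `;name=${encodeFileName(name)}`;
  return `data:${mediaType}${nameParam};base64,${base64}`;
}
```

With these helpers, `encodeFileName("Report (v2); final.pdf")` gives `Report%20%28v2%29%3B%20final.pdf`, matching the example in the quoted text.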
Summary
This PR adds support for declarative file inputs in tools and elicitation forms, allowing servers to specify which arguments/fields should be rendered as file pickers by clients. Files are transmitted as RFC 2397 data URIs with embedded metadata.
Key Changes
- `fileInputs` capability on `ClientCapabilities`, allowing clients to declare support for file input handling
- `inputFiles` field on the `Tool` interface, declaring which tool arguments accept file inputs, with optional MIME type and size constraints
- `requestedFiles` field on `ElicitRequestFormParams` for file inputs in elicitation forms
- `FileInputDescriptor` interface with optional `accept` (MIME type patterns) and `maxSize` (byte limit) fields for client-side validation hints
- `PrimitiveSchemaDefinition` extended to include `StringArraySchema` for multi-file inputs in elicitation forms
Implementation Details
- Files are transmitted as RFC 2397 data URIs: `data:<mediatype>;name=<filename>;base64,<data>`
- The `name=` parameter is percent-encoded to preserve original filenames
- File-input declarations are gated on the client's `fileInputs` capability
- Constraints in `FileInputDescriptor` are advisory; servers must independently validate inputs
- Size violations are reported with `InvalidParamsError` and reason `"file_too_large"`
- File arguments are typed as `{"type": "string", "format": "uri"}` or arrays thereof
Prototype PRs
Related work
`_meta["openai/fileParams"]` is used in MCP Apps for the same purpose, but with an extra step to upload a file and get a fileId