-
Notifications
You must be signed in to change notification settings - Fork 1.2k
RFC: add Tool.outputSchema and CallToolResult.structuredContent
#371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
A couple of comments on this as I prepare to undraft #223 for RFC:
[update] With guidance that Servers returning a Structured Response MUST return a CallToolResult containing one EmbeddedResource of type |
|
Having
Another drawback I see is the lack of ability to dynamically define the structure/schema for response content. There are certainly cases where the output schema may not be known ahead of time, but would still be useful for the client or LLM consuming the content; it also enriches the capability of sampling and prompt messages. Lastly, going with this approach, extending its functionality in the future would likely represent a breaking chance since it largely goes against the implied design pattern of |
@lukaswelinder first of all, my apologies for putting up this PR without first participating in the discussion on #356 - I only saw it as I was writing the PR comments for this, but hadn't had a chance to look properly yet. Definitely didn't mean to step on your ongoing work. I'll respond to your comment a bit later (not at keyboard right now) and also make comments on #356. Meanwhile I'll move this to draft, pending further discussion. |
@evalstate thanks for the heads up - will come back to this after we see what comes out of the discussion on #356 , per previous comment |
@bhosmer-ant No offense taken, just glad to see there is motivation here. Input and feedback on #356 would be great. |
74c0122 to
30d1a5e
Compare
30d1a5e to
9876ab9
Compare
Tool.outputSchemaTool.outputSchema and CallToolResult.structuredContent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Largely LGTM, thank you so much for moving this forward 🙏. Minor nit: given it's json native, I would love if we could avoid json-in-json and just have structuredOutput as an object.
| "result": { | ||
| "structuredContent": { | ||
| "type": "text", | ||
| "text": "[{\"id\":\"doc-1\",\"title\":\"Introduction to MCP\",\"url\":\"https://example.com/docs/1\"},{\"id\":\"doc-2\",\"title\":\"Tool Usage Guide\",\"url\":\"https://example.com/docs/2\"}]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have a separate key for structuredContent, would it be preferable for its value to directly conform to the outputSchema rather than having a type and a serialized json? At the very least should we avoid serialization?
schema/draft/schema.ts
Outdated
| * | ||
| * If the Tool defines an outputSchema, this field MUST be present in the result, and contain a serialized JSON object that matches the schema. | ||
| */ | ||
| structuredContent?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| structuredContent?: string; | |
| structuredContent?: object; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be aligned more with tool input?
| structuredContent?: string; | |
| structuredContent?: { [key: string]: unknown }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my comment above, maybe
| structuredContent?: string; | |
| structuredContent?: any; |
|
|
||
| 1. Clients **MUST** validate that the tool result contains a `structuredContent` field whose contents validate against the declared `outputSchema`. | ||
|
|
||
| 2. Servers **MUST** provide tool results that conform to their declared `outputSchema`s. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be helpful to provide some guidelines on how the server should respond when validation of the output against the schema fails.
For example InvalidParams is used in https://github.com/modelcontextprotocol/typescript-sdk/pull/454/files
or recommend server authors to pick a custom error code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'm not sure we'd need to make an error path explicit for this situation - this was more just a restatement of the contract, since (in the absence of upstream errors returned in the usual way) the server should be able to guarantee that results conform to the declared schema. But definitely LMK if you have a specific situation in mind that I'm not thinking of!
| "description": "Whether the tool call ended in an error.\n\nIf not set, this is assumed to be false (the call was successful).", | ||
| "type": "boolean" | ||
| }, | ||
| "structuredContent": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this, given that outputSchema is any arbitrary json schema (which makes sense), should this be of type any rather than an object,i.e. should we omit the type=object? It would make sense for eg if the tool just returns an int or even an array. This as is will prevent an array of objects from being returned for eg. even though you could describe it in json schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah see above - on balance I think the sticking with the top-level-object restriction in the rest of the protocol (and standard practice more generally) is worth the extra trouble of wrapping top-level primitives/arrays
schema/draft/schema.ts
Outdated
| * | ||
| * If the Tool defines an outputSchema, this field MUST be present in the result, and contain a serialized JSON object that matches the schema. | ||
| */ | ||
| structuredContent?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my comment above, maybe
| structuredContent?: string; | |
| structuredContent?: any; |
- `structuredOutput` typed as object rather than string - tighten CallToolResult to codify at-least-one constraint and explicitly allow structured results to contain `content` (but not the reverse) - update docs
[update: relaxed the modality of
contentandstructuredContentand the coupling ofstructuredContentandoutputSchemavalidation in #559]Adds support for strict validation of structured tool results.
Toolcan now optionally provide anoutputSchemaproperty, containing a JSON schema that defines the structure of its output.CallToolResultadds a newstructuredContentproperty, mutually exclusive withCallToolResult.contentproperty:result.structuredContentwill be absent, andresult.contentwill be returned as before.result.structuredContentwill contain an object whose contents must validate against the tool's outputSchema.Prototype for typescript SDK support in modelcontextprotocol/typescript-sdk#454.
Design notes
This PR aims to provide simple, lightweight support for strict validation of tool result data whose structure can be entirely described by a single JSON schema. The approach here pairs a new
Tool.outputSchemaproperty with a newCallToolResult.structuredContentproperty, avoiding use of theCallToolResult.contentarray.This approach leaves the path open for adding schematic validation support to the much richer and more complex space of tools that make use of the full expressiveness of the
CallToolResult.contentarray, via an additionalToolproperty. Support for these use cases has been proposed in #356, and is under active discussion there.(After exploring possible ways of providing integrated support for both kinds use cases with one set of protocol additions, it's clear that both will be better served by a disjoint approach: strict validation of statically typed data results can be accomplished with the simple additions provided here, and the subtleties arising from supporting full space of
CallToolResult.contentshapes - see e.g. #415, in addition to #356 - can be addressed more naturally absent the need to support the use cases addressed here.)Motivation and Context
For tools that return structured output, having a description of that structure available is useful for various tasks, including:
outputSchemas (or their absence) when making decisions about which tools to expose to the model.How Has This Been Tested?
No tests yet.
Breaking Changes
Optional new property that introduces a new behavior, not a breaking change.
Types of changes
Checklist
Additional context