What should a stdio MCP validator check beyond initialize and tools/list? #2733
Replies: 5 comments 1 reply
-
I would separate the result model into three buckets: install/runtime, transport hygiene, and protocol conformance.

That distinction is worth keeping in the JSON schema. A package that cannot start is different from a server that starts but violates JSON-RPC/MCP. Registries and clients will want to display those differently.

Checks I would add beyond

For Python buffering, I would report it as a runtime advisory rather than a protocol failure unless it causes an actual timeout or missing response. Something like:

{
  "class": "runtime.pythonBuffering",
  "severity": "warning",
  "evidence": "server produced no stdout before timeout; retry with PYTHONUNBUFFERED=1 changed behavior"
}

The other useful registry-facing check is a reproducibility fingerprint: command, args, cwd behavior, env vars used/redacted, package version, node/python version, and elapsed startup time. Without that, a pass/fail result is hard to debug when the same server works on the author's machine but fails in a client registry.
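
The fingerprint fields listed above could be collected roughly like this. This is only a sketch: the function names, the output key names, and the `SECRET_MARKERS` heuristic for redaction are all assumptions for illustration, not part of any existing validator.

```python
# Hypothetical substrings that mark an environment variable as secret.
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def redact_env(env, names):
    """Return the requested env vars, masking values whose names look secret."""
    out = {}
    for name in sorted(names):
        if name not in env:
            continue  # record only variables that were actually set
        if any(marker in name.upper() for marker in SECRET_MARKERS):
            out[name] = "<redacted>"
        else:
            out[name] = env[name]
    return out


def build_fingerprint(command, args, cwd, env, env_names,
                      package_version, runtime_version, startup_seconds):
    """Collect the fields needed to reproduce a validator run elsewhere."""
    return {
        "command": command,
        "args": list(args),
        "cwd": cwd,
        "env": redact_env(env, env_names),
        "packageVersion": package_version,
        "runtimeVersion": runtime_version,  # node or python version string
        "startupMs": round(startup_seconds * 1000),
    }
```

Recording the names of redacted variables, but not their values, keeps the fingerprint shareable while still showing which settings influenced the run.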
-
The three-bucket separation yudin-s sketched (install/runtime, transport hygiene, protocol conformance) is the right top-level cut. The bucket that keeps getting under-served in practice is a fourth one: schema quality. It's not a runtime failure and it's not a protocol violation; the server happily passes

Checks that belong in that fourth bucket, in rough order of how often they bite:

The "annotations honesty" angle yudin-s mentioned for resources/prompts has a direct parallel for tools: if

On the reproducibility fingerprint: strong yes, and worth including the protocol version the server actually negotiated (server can downgrade), not just the version it claims to support. A server that advertises

For separation in the JSON schema (Q2): yes, and the wire format gain is that registry consumers can filter on "passes protocol but has schema warnings"; that's the cohort that needs the most help and currently gets bucketed with "fully broken" because the failure mode is invisible without a schema-level check.

On Python buffering (Q3): treating it as a runtime advisory unless it causes a timeout is the right call. The trickier case is servers that emit partial JSON-RPC frames because of buffering interacting with

One more from the repeated-run side: schema fingerprint stability across runs. Re-running

For folks running adjacent tooling
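
One way to check schema fingerprint stability across runs is to hash a canonical form of each `tools/list` result and compare the hashes. The sketch below assumes each run yields a list of tool objects as plain dicts; `tools_fingerprint` and `is_stable` are hypothetical names, and treating tool order as irrelevant is a design assumption that a real validator might not share.

```python
import hashlib
import json


def tools_fingerprint(tools):
    """Hash a tools/list result so that tool order and key order do not matter."""
    # Sort tools by name and serialize with sorted keys to get a canonical form.
    canonical = json.dumps(
        sorted(tools, key=lambda tool: tool.get("name", "")),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stable(runs):
    """Return True when every run produced the same fingerprint."""
    return len({tools_fingerprint(tools) for tools in runs}) <= 1
```

If ordering is meant to be part of the contract, dropping the `sorted` call makes the hash order-sensitive instead.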
-
Strong cuts upthread from @yudin-s on the three-bucket separation and @PengSpirit on schema quality as the fourth bucket. The gap I'd flag, and the one we kept tripping over while running stdio MCP servers in production for the last few months, is a fifth bucket: live-call behavior. Everything above passes against

Concrete checks that belong in this bucket, in rough order of severity:

On the JSON schema separation question (Q2): yes, and beyond install/runtime/transport/protocol/schema-quality, I'd recommend a

One protocol-version note worth folding in: when the server downgrades its negotiated

The reproducibility fingerprint @yudin-s mentioned is the right anchor. Extending it with the negotiated protocol version, server-side concurrency limit (if discoverable), and a hash over the tool list (so re-runs flag schema drift between cold starts) covers most of the "works on my machine" reports we saw.

Happy to PR any of this into
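
The protocol-version point can be made concrete by comparing the version the validator requested with the `protocolVersion` field in the `initialize` result. This is a rough sketch: the function name and the `protocol.*` class names are invented for illustration (modelled on the `runtime.pythonBuffering` example upthread), and the severity mapping is an assumption.

```python
def check_negotiated_version(requested, initialize_result, client_supported):
    """Classify the protocol version returned in an initialize result."""
    negotiated = initialize_result.get("protocolVersion")
    finding = {"requested": requested, "negotiated": negotiated}
    if negotiated is None:
        # The server did not report a version at all.
        finding.update({"class": "protocol.versionMissing", "severity": "error"})
    elif negotiated == requested:
        finding.update({"class": "protocol.versionMatch", "severity": "info"})
    elif negotiated in client_supported:
        # The server answered with a different version the client can still speak.
        finding.update({"class": "protocol.versionChanged", "severity": "warning"})
    else:
        finding.update({"class": "protocol.versionUnsupported", "severity": "error"})
    return finding
```

Storing both `requested` and `negotiated` in the finding lets the fingerprint show what was actually agreed, not only what was offered.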
-
Hi, we at Gated have a catalog of 200+ checks across 5 families. Some of them are HTTP-specific, but most of them might be able to help you.
-
I built a small CLI called mcp-stdio-guard to catch MCP stdio server failures before users wire a server into Claude/Cursor/etc.

Repo: https://github.com/1Utkarsh1/mcp-stdio-guard
Pilot notes: https://gist.github.com/1Utkarsh1/90d7a0f2ea85635e3a209eb1be4970f9

The current guard does a real initialize handshake, can send a post-initialize request like tools/list, and fails on stdout pollution, invalid JSON-RPC frames, crashes, and missing responses, while allowing stderr diagnostics.

The JSON output now has schemaVersion: 1 and badge-friendly checks.* classes for:

I also ran a small registry-style pilot against 30 real stdio MCP entries from public MCP indexes and upstream package metadata/docs. 21/30 passed initialize + tools/list twice. The 9 failures were all pre-initialize process exits from yanked/superseded packages, not stdout pollution or JSON-RPC framing issues.

I would love feedback from MCP server authors and client/registry maintainers:
My goal is not to shame server authors; it is to make failures visible and actionable before users lose time debugging a broken stdio setup.
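
For readers unfamiliar with the stdout pollution and framing failures mentioned above, here is a minimal sketch of how one line of server stdout might be classified. It is not the code in mcp-stdio-guard; `classify_stdout_line` and its return labels are hypothetical, and it assumes newline-delimited JSON-RPC 2.0 messages without batch arrays.

```python
import json


def classify_stdout_line(line):
    """Classify one stdout line as 'ok', 'blank', 'stdout_pollution', or 'invalid_frame'."""
    text = line.strip()
    if not text:
        return "blank"
    try:
        message = json.loads(text)
    except json.JSONDecodeError:
        # Non-JSON text on stdout, for example a startup banner or a log line.
        return "stdout_pollution"
    if not isinstance(message, dict) or message.get("jsonrpc") != "2.0":
        return "invalid_frame"
    is_request_or_notification = "method" in message
    # A response carries an id and exactly one of result or error.
    is_response = "id" in message and (("result" in message) != ("error" in message))
    return "ok" if is_request_or_notification or is_response else "invalid_frame"
```

Anything the server writes to stderr never reaches this function, which matches the rule that stderr diagnostics are allowed.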