
SEP-2787: Tool call attestation #2787

Open
soup-oss wants to merge 22 commits into modelcontextprotocol:main from soup-oss:tool-call-attestation


@soup-oss

soup-oss commented May 25, 2026


Extensions Track SEP proposing signed tool call attestation envelopes for MCP — binding intent, agent identity, tool name, and arguments into a verifiable audit trail. Targets EU AI Act Article 12 compliance.

Motivation and Context

MCP has no standard mechanism to cryptographically prove which agent called which tool, with what arguments, and for what purpose. Regulated deployments (EU AI Act, AI Liability Directive) need this for audit trails. This SEP fills that gap with a minimal envelope carried in _meta, requiring no protocol changes.

How Has This Been Tested?

An example implementation is available at https://github.com/soup-oss/sep-tool-call-attestation/tree/master/example

@vaaraio has put together a full suite of conformance test vectors and validation criteria over in PR #2789

Breaking Changes

None

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

soup-oss requested review from a team as code owners May 25, 2026 17:56
soup-oss changed the title from "Draft - Tool call attestation" to "SEP-2787: Tool call attestation" May 25, 2026
@vaaraio

vaaraio commented May 26, 2026


Hi @soup-oss, thanks for filing this. The regulatory gap is real and the envelope shape lands close to work already in production.

Two pointers in case they are useful as prior art:

OVERT 1.0 (Glacis Technologies, published 2026-03-25, https://overt.is) defines a related primitive: signed, schema-closed envelopes a relying party can verify offline. Apache-2.0 open standard with a royalty-free patent covenant for conformant implementations. The shape rhymes with SEP-2787 but the design choices differ:

  • Encoding: canonical CBOR per RFC 8949 rather than JSON. Smaller wire size and stricter canonicalisation for signature stability across implementations.
  • Crypto: Ed25519 signatures over HMAC-SHA256 content commitments. The content commitment lets the request payload stay local while only the HMAC crosses the trust boundary, which matters for privacy and for the EU AI Act Article 12 evidence chain when arguments contain PII or trade secrets.
  • Schema: closed 9-field shape with IEEE-754 float rejection. Helps interoperability across multiple emitters.
  • Counter: monotonic counter across the emitter process so gaps are detectable on the verifier side.
  • Phase 3: the IAP role notary-signs the Provisional Receipt and anchors it in a transparency log, which complements rather than replaces an inline ack callback.
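The counter bullet above can be illustrated with a minimal verifier-side sketch. It assumes the emitter increments its counter by exactly one per envelope, which is an assumption made for illustration and not a statement of what OVERT 1.0 specifies; `find_gaps` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def find_gaps(counters: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges.

    Assumes counters arrive in emission order and that the emitter
    increments by exactly 1 per envelope (an illustrative assumption).
    """
    gaps: List[Tuple[int, int]] = []
    previous = None
    for counter in counters:
        if previous is not None:
            if counter <= previous:
                # A repeated or decreasing counter is not a gap; it means the
                # stream is out of order or an envelope was duplicated.
                raise ValueError(f"counter did not increase: {previous} -> {counter}")
            if counter != previous + 1:
                gaps.append((previous + 1, counter - 1))
        previous = counter
    return gaps


missing = find_gaps([1, 2, 3, 6, 7, 10])
```

A verifier that sees counters 1, 2, 3, 6, 7, 10 can report that envelopes 4 to 5 and 8 to 9 never reached it, without knowing anything about their contents.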

Vaara (https://github.com/vaaraio/vaara, Apache 2.0) ships an MCP proxy that has emitted OVERT 1.0 Base Envelopes for each tools/call, resources/read, and prompts/get since v0.24.0 (released 2026-05-20). Working examples with real upstream MCP servers are in examples/github-mcp-proxy-demo/, examples/sap-mcp-proxy-demo/, and examples/goose-mcp-proxy-demo/. The proxy is transparent to both the MCP client and the upstream server, so it works with any stdio MCP server without protocol changes (matches your design constraint).

For the SEP's open design questions:

  1. Emitter location. SEP-2787 implies a client-side emitter. The proxy pattern puts the emitter between client and server, which has the benefit that no MCP client needs to change to gain attestation. The trade-off is that the proxy must be deployed. Both shapes are valid and probably both belong in the spec landscape.
  2. Argument handling. Your inline-or-resource-URL design is one way. The HMAC-commitment route in OVERT keeps the payload entirely local. The verifier never sees the args, only the commitment they were bound to. For regulated deployments where arguments contain personal data this side-steps a category of disclosure risk.
  3. Cryptographic algorithm choice. JWT-family (HS256/ES256/RS256) optimises for ecosystem familiarity. Ed25519 (used in OVERT) gives smaller signatures, no ASN.1/DER awareness needed, and constant-time implementations are simpler to audit. Worth weighing against the JWT-tooling familiarity benefit.

Glacis also ships a Python SDK at https://github.com/Glacis-io/glacis-python (Apache 2.0) as the reference implementation by the standard's authors.

Reference verifier CLI is vaara overt verify RECEIPT.cbor --pubkey-file PUB.bin. The verifier reads only the wire format and takes no dependency on Vaara's emitter, so any OVERT-conformant implementation can route its conformance check through it. Test cases are available if useful for the SEP.

Apache 2.0 throughout, no commercial product.

@Rul1an

Rul1an commented May 26, 2026


This is a useful direction. I like the core instinct here: MCP probably needs a standard way to make a tool call reviewable after the fact, without every client, gateway, or regulated deployment inventing its own envelope.

One boundary I would keep very sharp is the difference between a pre-execution attestation and post-execution evidence.

The current envelope works well as a signed pre-execution statement. It says who is asking, which tool/server is targeted, which arguments or argument digest are bound, what intent was declared, when it was issued, and which key/version signed it.

That proves an intent-bound request. It does not by itself prove that the tool actually executed, what the application-level outcome was, or whether a downstream system accepted the result. The optional ack starts to close that loop, but it feels like a different layer from the core attestation. My bias would be to keep this SEP focused on the request attestation, and treat acknowledgement/outcome receipts as either a separate phase or a follow-up extension.

The other thing I would make explicit is the source of each fact. In audit and replay systems it matters whether a field came from the client/agent planner, the attestation issuer, the MCP server verifier, the tool/application result, a policy engine, or a payload-derived projection/digest. Those are different trust surfaces. Keeping them apart prevents the envelope from claiming more than the layer can actually prove.

I would also be cautious with inline arguments. Signed args are useful in some deployments, but the privacy-friendly default should probably be digest, reference, or redacted projection rather than payload storage. The audit invariant is usually “this call was bound to this exact argument set” or “this reviewed projection was bound,” not “all arguments are now stored in a long-lived compliance artifact.” A more explicit args_digest / args_ref / args_projection shape may be easier to interop than overloading args: string with both inline JSON and resource references.

Canonicalization may need tightening too. “Sorted keys, no whitespace” is a good start, but different language stacks will still disagree on numbers, Unicode escaping, duplicate object keys, floats, NaN, and parser behavior. A JSON Schema plus JCS/RFC 8785, or a small restricted JSON profile, would make conformance testing less surprising.
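To make the canonicalization concern concrete, here is a small sketch using only Python's standard `json` module. `naive_canonical` is a hypothetical helper that implements "sorted keys, no whitespace" literally; it is not RFC 8785 and is shown only to expose where such a rule leaves the bytes underdetermined.

```python
import json


def naive_canonical(obj) -> bytes:
    """'Sorted keys, no whitespace' and nothing more (NOT RFC 8785)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


# 1. Numbers: 1 and 1.0 are the same JSON number, but serialize differently.
int_form = naive_canonical({"amount": 1})
float_form = naive_canonical({"amount": 1.0})

# 2. Unicode: the escaping policy changes the bytes for the same string.
value = {"name": "caf\u00e9"}
escaped = json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
literal = json.dumps(
    value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
).encode("utf-8")

# 3. Duplicate keys: Python's parser silently keeps the last value.
parsed = json.loads('{"tool":"read","tool":"delete"}')


def reject_duplicate_keys(pairs):
    """object_pairs_hook that refuses objects with repeated keys."""
    keys = [key for key, _ in pairs]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate object key")
    return dict(pairs)


def strict_dumps(obj) -> bytes:
    """Same as naive_canonical, but NaN and Infinity raise ValueError."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), allow_nan=False
    ).encode("utf-8")
```

Two emitters that both follow the loose rule can therefore sign different bytes for the same logical envelope, and a parser can accept a request whose duplicate keys hide the value that was actually signed.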

One smaller concern: the multi-server toolCalls array is powerful, but I would clarify whether it is just a signed plan bundle or whether it is meant to carry stronger workflow semantics. If every server maintains its own nonce cache, a shared nonce helps with replay at each verifier, but it does not fully define what happens when only part of the multi-server plan executes.
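A rough sketch of the per-server nonce cache described above, to show what it does and does not cover. `NonceCache` and its window length are hypothetical; the SEP text may define replay handling differently.

```python
import time
from typing import Dict, Optional


class NonceCache:
    """Per-verifier replay cache: remembers nonces for a fixed window."""

    def __init__(self, window_seconds: float = 120.0) -> None:
        self.window_seconds = window_seconds
        self._seen: Dict[str, float] = {}

    def check_and_store(self, nonce: str, now: Optional[float] = None) -> bool:
        """Return True the first time a nonce is seen inside the window."""
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        # Forget nonces older than the window so the cache stays bounded.
        self._seen = {n: t for n, t in self._seen.items() if t >= cutoff}
        if nonce in self._seen:
            return False  # replay at this verifier
        self._seen[nonce] = now
        return True


# Two servers named in one signed plan, each with its own cache.
server_a, server_b = NonceCache(), NonceCache()
first_at_a = server_a.check_and_store("plan-nonce-1", now=0.0)   # accepted
replay_at_a = server_a.check_and_store("plan-nonce-1", now=5.0)  # rejected
# server_b never receives the call: nothing above records that the plan
# only partially executed.
```

Each cache detects replay locally, but no component observes that `server_b` was skipped, which is the partial-execution gap the comment points at. The window also needs to be at least as long as the attestation's validity period, otherwise an evicted nonce can be replayed while the signature is still accepted.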

So the shape I would find easiest to adopt is:

  • request attestation: signed before execution, binds issuer/subject/tool/server/arguments-or-digest/intent/time/nonce/key version
  • verification result: server-side allow/reject/error reason
  • execution receipt or ack: optional later layer, binding server identity and observed outcome or outcome digest

Overall, strong proposal. I think the most valuable thing this SEP can standardize early is not the whole compliance story, but the stable evidence boundary. Once MCP has a common way to bind a tool call to identity, target, argument digest/projection, intent, nonce, and key version, downstream audit, replay, policy, and receipt systems can compose around it much more cleanly.

@vaaraio

vaaraio commented May 26, 2026


Thanks for the careful read.

The pre-execution attestation vs post-execution evidence split is exactly the boundary worth making explicit. OVERT 1.0 separates these via the Phase 3 IAP layer, where the relying party notary-signs a Provisional Receipt and anchors it in a transparency log, distinct from the inline ack pattern in SEP-2787. That feels like the cleaner place to land: pre-execution envelope is one primitive, execution receipt is a different primitive composed on top.

On argument handling, OVERT keeps the payload entirely local via an HMAC-SHA256 content commitment. The verifier never sees the args, only the commitment they were bound to. For regulated deployments with PII or trade-secret arguments this side-steps a whole disclosure category. The args_digest / args_ref / args_projection shape maps to the same intent. The digest case is essentially what OVERT does.
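The commitment idea can be sketched with the standard library alone. This is not OVERT's actual construction (which the thread says uses canonical CBOR); sorted-key JSON stands in for the canonical encoding, and `content_commitment` and `verify_commitment` are hypothetical names.

```python
import hashlib
import hmac
import json


def content_commitment(key: bytes, payload: dict) -> str:
    """HMAC-SHA256 over a deterministic encoding of the payload.

    Sorted-key JSON is a stand-in for a real canonical form.
    """
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def verify_commitment(key: bytes, payload: dict, commitment: str) -> bool:
    """Recompute the commitment and compare in constant time."""
    return hmac.compare_digest(content_commitment(key, payload), commitment)


local_key = b"\x01" * 32  # stays with the party that holds the payload
tool_args = {"customer_id": "c-123", "note": "contains personal data"}
commitment = content_commitment(local_key, tool_args)
# Only `commitment` would cross the trust boundary; `tool_args` stays local.
```

Whoever later holds both the key and the original arguments can show that they match the commitment, while a party that sees only the commitment learns nothing directly about the arguments.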

Canonicalization: OVERT uses canonical CBOR per RFC 8949 with IEEE-754 float rejection. Stricter than JCS / RFC 8785, but the underlying point is the same: pin the bytes so signatures verify identically across implementations. Whichever the SEP lands on, a normative schema plus explicit canonicalization rules will save reviewers from parser folklore.

On toolCalls as a signed plan bundle vs multi-server workflow: real ambiguity worth resolving in the SEP. OVERT envelopes are per-interaction, which sidesteps the question, but the SEP's bundle shape probably needs an explicit statement on partial-execution semantics and per-verifier replay windows.

The framing of a "stable evidence boundary" lands well. That's the right altitude: bind a tool call to identity, target, args commitment, intent, nonce, key version, and stop there. Outcome receipts and policy decisions compose on top, in separate layers.

@vaaraio

vaaraio commented May 26, 2026


Follow-up after sitting with the review longer. Four concrete proposals, bundled so the envelope shape settles in one pass rather than drifting across threads.

On the source of each fact: the cleanest path is to annotate every envelope field with its trust surface. Issuer-asserted fields are set by the attestation issuer (subject, intent, time, nonce, key_version, args commitment). Verifier-asserted fields are set by the MCP server verifier (allow/reject/error reason, observed nonce). Payload-derived fields come deterministically from the request payload (args_digest, args_projection). Planner-declared fields are set by the client or agent upstream of the issuer (declared_purpose, requested_capability). The schema can either tag fields with a source annotation or group them under named blocks. The invariant is that no field gets sourced from "envelope" as an undifferentiated whole.

On argument handling: the current args: string field is overloaded with both inline JSON and resource references. An explicit three-way shape reads cleaner. args_digest is a hash commitment over canonical bytes, privacy-friendly default, payload stays local. args_ref is a content-addressed reference, digest plus retrieval URI. args_projection is a redacted or transformed projection of the args, with its own digest. Implementations pick one per call. The audit invariant becomes "this call was bound to this exact commitment" and inline payload storage is opt-in. OVERT 1.0 does this via its HMAC content commitment.
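One way to picture the three-way shape is as three small types, one chosen per call. The field layout, the `sha256:` prefix convention, and the helper names below are assumptions for illustration, and sorted-key JSON stands in for whatever canonical form the SEP settles on.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Set, Union


def _canonical(obj) -> bytes:
    # Stand-in for a real canonical encoding such as RFC 8785.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def _sha256(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ArgsDigest:
    digest: str            # commitment only; the payload stays local


@dataclass(frozen=True)
class ArgsRef:
    digest: str            # digest of the referenced content
    uri: str               # where an authorized party can retrieve it


@dataclass(frozen=True)
class ArgsProjection:
    projection: dict       # redacted or transformed view of the args
    projection_digest: str # digest of the projection, not of the args


ArgsCommitment = Union[ArgsDigest, ArgsRef, ArgsProjection]


def commit_digest(args: dict) -> ArgsDigest:
    return ArgsDigest(digest=_sha256(_canonical(args)))


def commit_ref(args: dict, uri: str) -> ArgsRef:
    return ArgsRef(digest=_sha256(_canonical(args)), uri=uri)


def commit_projection(args: dict, keep: Set[str]) -> ArgsProjection:
    projection = {k: v for k, v in args.items() if k in keep}
    return ArgsProjection(projection, _sha256(_canonical(projection)))
```

With this split, a consumer can tell from the type alone whether it holds a bare commitment, a retrievable payload, or a reviewed view, instead of parsing an overloaded string.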

On canonicalization: the current "sorted keys, no whitespace" rule is too loose for cross-stack conformance. A normative reference to RFC 8785 (JSON Canonicalization Scheme) pins behaviour for numbers, Unicode escaping, duplicate keys, floats, and NaN, the places where language stacks silently disagree. A small JSON Schema accompanying the canonical form makes conformance testing tractable. CBOR per RFC 8949 with IEEE-754 float rejection is the stricter alternative. JCS keeps the JSON ecosystem familiarity at some cost. Either is defensible. The SEP needs to pick one and reference it normatively.

On scope: the optional ack field probably belongs in a follow-up extension rather than this SEP. The stable evidence boundary worth standardizing here is binding a tool call to identity, target, args commitment, intent, nonce, key version. Execution receipts and policy decisions compose on top in separate layers. This keeps the surface tight and lets downstream audit, replay, and receipt systems compose around a clear primitive.

vaaraio added a commit to vaaraio/vaara that referenced this pull request May 26, 2026
…ape) (#139)

* feat(attestation): add SEP-2787 reference implementation, proposed shape

Adds vaara.attestation.sep2787 implementing the SEP-2787 Tool Call
Attestation envelope (modelcontextprotocol/modelcontextprotocol#2787)
with the four schema changes Vaara raised in the v1 draft thread:
fact-source labels (three trust-surface blocks), three-way args shape
(ArgsDigest / ArgsRef / ArgsProjection), RFC 8785 (JCS) canonicalization
with IEEE-754 float rejection, and request-attestation-only scope
(the v1 optional ack field is excluded and belongs in a separate
extension). Supports HS256, ES256, RS256 signing per the v1 draft.

Coexists with the existing OVERT 1.0 implementation. See
docs/sep2787-overt-mapping.md for the field-level mapping between the
two envelopes.

16 unit tests covering all three signing algorithms, all three
args-commitment shapes, tampering rejection, canonicalization
invariants, and TTL handling. Ruff-clean.

* chore(attestation): address SEP-2787 PR review feedback

- Rename the test helper _emit_hs256 to _emit_attestation. It builds
  envelopes for all three signing algorithms, not only HS256, so the
  HS256-only name is misleading.
- Add test_ttl_clock_skew_tolerance_window covering the verifier's
  default 30-second skew window: a 60-second TTL with iat + 75 still
  verifies, iat + 91 does not.
- Switch the optional-dependency probe from a try/import block to
  importlib.util.find_spec. Eliminates the CodeQL unused-import
  finding on rfc8785 without changing skip semantics.
- Convert the docs/sep2787-overt-mapping.md reference to COMPLIANCE.md
  into a relative markdown link.
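The skew-window arithmetic in the test description above can be sketched as follows. `within_ttl` is a hypothetical helper, the lower bound on `now` is an added assumption, and whether the upper boundary is inclusive is a policy choice this sketch does not settle.

```python
def within_ttl(iat: int, exp_seconds: int, now: int, skew_seconds: int = 30) -> bool:
    """Accept when `now` falls inside the validity window widened by the skew.

    iat          issue time (seconds)
    exp_seconds  TTL relative to iat
    skew_seconds tolerance for clock drift between issuer and verifier
    """
    not_before = iat - skew_seconds                # assumption: reject far-future iat
    not_after = iat + exp_seconds + skew_seconds   # inclusive here; a policy choice
    return not_before <= now <= not_after


iat = 1_000
accepted = within_ttl(iat, exp_seconds=60, now=iat + 75)  # 75 <= 60 + 30
rejected = within_ttl(iat, exp_seconds=60, now=iat + 91)  # 91 >  60 + 30
```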

---------

Co-authored-by: vaaraio <267591518+vaaraio@users.noreply.github.com>
@Rul1an

Rul1an commented May 26, 2026


This is useful prior art, and I think it also helps clarify the SEP boundary.

The thing I would keep strict is implementation neutrality. OVERT/Vaara can be one concrete receipt family, but I do not think MCP should standardize any one receipt family in this SEP.

The MCP-level primitive feels smaller: a request attestation that binds issuer, subject, server/tool target, argument commitment or projection, declared intent, nonce, time, and key version. A verifier result can be a separate server-side fact. Execution receipts, policy decisions, transparency logs, and signed receipt chains can then compose on top.

A good test might be: can two independent implementations agree on the request-attestation semantics without agreeing on the later receipt system? That probably means test vectors and conformance cases should be the normative artifact, not any single implementation.

If yes, the SEP has found the right layer. If no, it may be pulling too much of a downstream audit model into the MCP primitive.

@soup-oss

Author

Hi @vaaraio, thank you for this incredible, high-signal feedback.

The work you’ve done with OVERT 1.0 and Vaara is amazing prior art. The privacy-first approach of using HMAC content commitments is a game-changer for handling PII under frameworks like the EU AI Act.

My primary goal with this SEP is ensuring MCP gets a native cryptographic attestation layer so developers can build secure infrastructure around agent intents. Whether the spec leans toward JWT for ecosystem familiarity or adopts a harder-nosed CBOR/commitment model like OVERT, getting this boundary into the protocol is what matters.

I see you’ve opened a related PR. I'll keep an eye on the discussion and help work out how we can align these two shapes.

@vaaraio

vaaraio commented May 26, 2026


On test vectors: that's the right call. tests/test_attestation_sep2787.py in the reference impl covers the conformance surfaces (signature verification across HS256/ES256/RS256, all three args-commitment shapes, JCS canonicalization invariants, envelope tampering rejection, TTL handling with clock-skew tolerance). Apache 2.0, ready to lift into a standalone normative test-vector artifact.

On implementation neutrality: vaara.attestation.sep2787 is the SEP-2787 envelope only. It does not embed, reference, or imply any downstream receipt system, transparency log, or signed-chain model. A second independent implementation of the same schema interops with this verifier without taking any position on what happens post-execution. The same package ships vaara.attestation.overt as a separate module for the CBOR-based OVERT 1.0 shape, and the two are wire-independent. The mapping in docs/sep2787-overt-mapping.md is a translation table, not a runtime dependency. OVERT 1.0 by Glacis Technologies (Glacis-io/glacis-python) is the empirical parallel: different org, different wire format, same logical layer.

Reference implementation: vaaraio/vaara#139.

@Rul1an

Rul1an commented May 26, 2026


That sounds like a good seed.

The bit I would keep separate is authorship of the first implementation versus ownership of the conformance surface. Vaara’s tests may be a very useful starting point, but for the SEP I would expect the normative vectors to live with the proposal itself and be validated by at least one second implementation.

The thing to standardize is observable interop: same canonical bytes, same signature input, same verification result, same rejection cases. That keeps Vaara/OVERT as strong prior art and implementation input, while keeping the MCP primitive implementation-neutral.

@vaaraio

vaaraio commented May 26, 2026


@Rul1an agreed. Authorship of the first implementation and ownership of the conformance surface belong in different places. The vectors should live with the SEP. A second implementation validating them is the right gate.

The four conformance dimensions you named (canonical bytes, signature input, verification result, rejection cases) are precisely what tests/test_attestation_sep2787.py exercises today. These can be extracted into a vector set and filed against this PR or a sibling location the SEP wants to maintain. Vaara's repo keeps the implementation, the SEP repo owns the normative artifact.

What format would work best on your side?

@Rul1an

Rul1an commented May 26, 2026


@vaaraio Agreed, that split sounds right.

I cannot speak for the SEP maintainers, but the shape I would find easiest to review is a small fixture set that does not import Vaara, OVERT, or helper code from any implementation.

Maybe test-vectors/sep-2787/v0/, with plain files for the unsigned envelope, expected canonical bytes, signature input bytes, test keys, expected signed envelope, expected verification result, and a few negative cases for tampering, expired TTL, unsupported alg, bad canonicalization, and invalid args commitment shape.

The useful test is whether a second implementation can read those files directly and produce the same bytes and the same pass/fail result. That keeps the vectors boring, portable, and owned by the SEP instead of by the first implementation.

@vaaraio

vaaraio commented May 26, 2026


@Rul1an v0 fixture set ready against your layout. 40 KB zipped.

Six positive cases: HS256, ES256, RS256 round-trips, plus one fixture each for the digest, ref, and projection args commitment shapes. Each carries unsigned_envelope.json, canonical_signing_input.bin, canonical_signing_input.hex, signed_envelope.json, and expected.json.

Seven negative cases: tampered planner_declared block, tampered issuer_asserted block, expired TTL (signature valid, clock past iat + exp + skew), unsupported alg (HS512), IEEE-754 float in canonical input, invalid args commitment kind, HS256 envelope against an ES256 verifier.

Keys are pinned. hs256_secret.bin is 32 raw bytes. ES256 and RS256 keys are PKCS8 / SPKI PEM. ES256 signatures are raw r||s (64 bytes), not ASN.1 DER. HS256 and RS256 are deterministic so a second implementation re-signing reproduces the stored signature_hex exactly. ES256 signing is randomised, so the ES256 case verifies the stored signature against the pinned public key rather than bit-for-bit reproduction.
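Since many crypto libraries produce and expect ASN.1 DER for ECDSA, a second implementation may need to convert to and from the raw r||s form. The following is a minimal pure-Python sketch for P-256-sized signatures; `raw_to_der` and `der_to_raw` are hypothetical helpers, and they only handle the short-form lengths that occur at this size.

```python
def _der_integer(value: int) -> bytes:
    """Encode a non-negative integer as a DER INTEGER."""
    body = value.to_bytes((value.bit_length() + 7) // 8 or 1, "big")
    if body[0] & 0x80:
        body = b"\x00" + body  # keep the integer positive
    return b"\x02" + bytes([len(body)]) + body


def raw_to_der(raw: bytes, size: int = 32) -> bytes:
    """Convert raw r||s (2 * size bytes) to a DER SEQUENCE of two INTEGERs."""
    if len(raw) != 2 * size:
        raise ValueError("raw signature has the wrong length")
    r = int.from_bytes(raw[:size], "big")
    s = int.from_bytes(raw[size:], "big")
    body = _der_integer(r) + _der_integer(s)
    # For P-256 the body is at most 70 bytes, so a one-byte length suffices.
    return b"\x30" + bytes([len(body)]) + body


def der_to_raw(der: bytes, size: int = 32) -> bytes:
    """Convert a short-form DER ECDSA signature back to fixed-width r||s."""
    if len(der) < 2 or der[0] != 0x30 or der[1] != len(der) - 2:
        raise ValueError("not a short-form DER SEQUENCE")
    pos, parts = 2, []
    for _ in range(2):
        if pos + 2 > len(der) or der[pos] != 0x02:
            raise ValueError("expected a DER INTEGER")
        length = der[pos + 1]
        value = int.from_bytes(der[pos + 2 : pos + 2 + length], "big")
        parts.append(value.to_bytes(size, "big"))  # left-pad to fixed width
        pos += 2 + length
    if pos != len(der):
        raise ValueError("trailing bytes after signature")
    return b"".join(parts)
```

The raw form is always 64 bytes for P-256, while the DER form varies in length with the leading bits of r and s, which is one reason fixtures pin one form explicitly.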

The bundle includes _check_independent.py, a verifier that imports only the stdlib plus cryptography and rfc8785. It reads the fixtures from disk and walks the four conformance dimensions with no reference to the reference implementation. Output is six positive OKs, two negative OKs on the pure signature cases, and five SKIPs on verifier-policy cases that depend on the verifier's own clock or schema validator.

Origin and license are in README.md and MANIFEST.json. Apache-2.0, derived from tests/test_attestation_sep2787.py at commit 3d7af54 of vaaraio/vaara. SEP maintainers own the final normative artifact location.

sep-2787-vectors-v0.zip

@Rul1an

Rul1an commented May 26, 2026


@vaaraio Nice turnaround. This looks like useful seed material.

I would leave acceptance and final layout to the SEP maintainers, but one distinction seems worth keeping in the bundle itself: byte/signature conformance cases versus verifier-policy cases. The former can be normative immediately. Things like TTL clock choice, unsupported alg handling, and schema rejection may need an explicit validator policy before they become pass/fail requirements.

I would also expect the final artifact to be committed as plain fixture files in the SEP repo rather than kept as an attached zip, so a second implementation can consume it in CI without depending on Vaara or on the comment thread.

From my side, the important gate is still the same: one independent implementation reads the SEP-owned fixtures and gets the same canonical bytes and verification results.

@vaaraio

vaaraio commented May 26, 2026


@Rul1an Filed as #2789, layout under test-vectors/sep-2787/v0/.

normative/ covers signed-envelope round-trips across HS256/ES256/RS256, the three args-commitment shapes (digest, ref, projection), tampering rejection on the planner_declared and issuer_asserted blocks, and IEEE-754 float rejection at the canonicalisation boundary. Nine cases, pass/fail against the SEP-2787 wire format today.

verifier-policy/ covers TTL expiry past iat + exp + skew, unsupported-alg rejection (HS512), schema rejection of unknown args-commitment kinds, and HS256-against-ES256-verifier alg-mismatch. Four cases that depend on an explicit validator-policy paragraph in the SEP before they become normative.

_check_independent.py reads the fixtures from disk and walks the conformance dimensions with no reference to any implementation. Apache-2.0 throughout. SEP maintainers own the final layout, including whether the artifact lives at test-vectors/sep-2787/v0/, a sibling path, or in a separate repo. The gate stays the one you named: one independent implementation reads the SEP-owned fixtures and produces the same canonical bytes and signature verification results.

@Rul1an

Rul1an commented May 26, 2026


Nice update. This is moving in the right direction: deferring ack, switching canonicalization to RFC 8785, and making the argument surface explicit all make the core primitive much easier to reason about.

One verification detail seems worth tightening before this becomes the shape implementers follow. The verification rules currently match the receiving server and tool name, but they do not explicitly require the verifier to bind the actual tools/call.params.arguments to the attested argument commitment.

For args_ref, that probably means resolving the referenced payload, checking the digest, and confirming that the tool arguments being executed are the same payload or the same canonical bytes.

For args_projection, it may need one sentence saying what the verifier can and cannot prove. If the projection is redacted or summarized, the verifier can prove only that the projection was signed, not that it is a complete representation of the runtime arguments. If it is an identity projection, the verifier can compare it directly to the canonicalized runtime arguments.

That keeps the request-attestation boundary tight: the SEP proves identity, target, intent, nonce, time, and an explicit argument commitment. Execution receipts and downstream outcome evidence can still stay deferred.

@vaaraio

vaaraio commented May 26, 2026


@Rul1an On the argument-commitment binding: the reference impl now wires this as Step 5 in v0.37.1, released a few minutes ago. verify_args_commitment covers the three commitment shapes against the spec text.

For args_ref: resolve via a caller-supplied resolver (the verifier does no network IO), hash the content, match both the stored digest and the canonicalized runtime arguments.

For args_projection: recompute the projection digest, then report projection_match as a tri-state. True for identity projections, False for redacted or summarized projections, where the verifier accepts the signed projection but makes no completeness claim, per your reading.

For args_digest (Vaara's commitment-only shape): recompute the JCS-canonical hash of the runtime arguments and compare to the bound commitment.

Returns ArgsCommitmentResult(ok, reason, projection_match) with reason set to args_commitment_mismatch on failure, matching the spec's error-reason enum.
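The binding step for the digest and ref shapes might look roughly like the sketch below. It is not Vaara's `verify_args_commitment`; `check_digest`, `check_ref`, and the resolver signature are hypothetical, the result type only mirrors the three fields named above, and sorted-key JSON stands in for JCS.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ArgsCommitmentResult:
    ok: bool
    reason: Optional[str]
    projection_match: Optional[bool]  # unused for the two shapes shown here


MISMATCH = ArgsCommitmentResult(False, "args_commitment_mismatch", None)
OK = ArgsCommitmentResult(True, None, None)


def canonical(obj) -> bytes:
    # Stand-in for RFC 8785 canonicalization.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def digest_of(data: bytes) -> str:
    return "sha256:" + hashlib.sha256(data).hexdigest()


def check_digest(runtime_args: dict, bound_digest: str) -> ArgsCommitmentResult:
    """Digest shape: hash the runtime arguments and compare to the commitment."""
    actual = digest_of(canonical(runtime_args))
    return OK if hmac.compare_digest(actual, bound_digest) else MISMATCH


def check_ref(
    runtime_args: dict,
    bound_digest: str,
    uri: str,
    resolver: Callable[[str], bytes],
) -> ArgsCommitmentResult:
    """Ref shape: the caller supplies the resolver, so no network IO happens here."""
    content = resolver(uri)
    if not hmac.compare_digest(digest_of(content), bound_digest):
        return MISMATCH  # referenced content is not what was signed
    if content != canonical(runtime_args):
        return MISMATCH  # the call is executing different arguments
    return OK
```

Both checks compare against the arguments actually present in `tools/call`, which is the binding the review comment asked the verification rules to require.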

Composed after the existing signature and TTL checks once the tools/call arguments are in hand. Files: src/vaara/attestation/_sep2787_verifier.py plus 11 tests in tests/test_attestation_sep2787.py.

vaaraio added a commit to vaaraio/vaara that referenced this pull request May 27, 2026
…#150)

The SEP-2787 draft envelope adopted MCP camelCase convention in
soup-oss/modelcontextprotocol@48c739b1. Vaara's proposed-shape
reference implementation now emits camelCase JSON keys on the
serialisation boundary while keeping Python dataclass attributes in
snake_case, so user code is unchanged.

`Attestation.to_dict()` and the JCS-canonical signing payload emit
`plannerDeclared`, `issuerAsserted`, `payloadDerived`, `toolCalls`,
`serverFingerprint`, `secretVersion`, `expSeconds`,
`requestedCapability`, `projectionDigest`. New `issuer_to_dict` helper
replaces the prior `asdict()` call so the issuer block sorts and
renames deterministically without leaking Python-internal names.
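A rename at the serialisation boundary can be sketched like this. It is not Vaara's `issuer_to_dict`; the `IssuerBlock` fields and both helper names are hypothetical and chosen only to mirror some of the keys listed above.

```python
from dataclasses import dataclass, fields


def snake_to_camel(name: str) -> str:
    """planner_declared -> plannerDeclared; single words are unchanged."""
    head, *tail = name.split("_")
    return head + "".join(part.capitalize() for part in tail)


@dataclass
class IssuerBlock:
    # Hypothetical field set, for illustration only.
    secret_version: str
    exp_seconds: int
    iat: int
    nonce: str


def block_to_wire(block) -> dict:
    """Rename only the block's own field names, then order keys for stable output.

    Values are copied as-is, so keys inside user-supplied payloads are untouched.
    """
    renamed = {snake_to_camel(f.name): getattr(block, f.name) for f in fields(block)}
    return dict(sorted(renamed.items()))


wire = block_to_wire(IssuerBlock(secret_version="v3", exp_seconds=60, iat=1000, nonce="n-1"))
```

Python callers keep using `block.exp_seconds`, while the bytes that get signed carry `expSeconds`.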

`docs/sep2787-overt-mapping.md` updated. CHANGELOG entry under 0.39.1.
pyproject.toml, src/vaara/__init__.py, and clients/ts/package.json all
bumped. 28 attestation tests pass; ruff clean.

The v0 test vector PR (vaaraio/modelcontextprotocol#2789, head
2a9360f, cited in modelcontextprotocol/modelcontextprotocol#2787) was
regenerated with the same renames separately on 2026-05-27.

Co-authored-by: vaaraio <267591518+vaaraio@users.noreply.github.com>
@vaaraio

vaaraio commented May 27, 2026


@soup-oss Trust-surface grouping landing in the envelope is the right shape. Vaara's reference impl will follow it through the remaining mechanical diffs in the next release: move toolCalls under payloadDerived, swap argsProjection to the JSON-stringified encoding, and drop Vaara's kind-discriminated argsDigest extension. Commitment-only audit composes cleanly on top of argsProjection as an identity projection of a hash-only object, no third kind needed in the spec. A v1-current sibling vector set against the merged shape is on offer when useful.

vaaraio added a commit to vaaraio/vaara that referenced this pull request May 27, 2026
…it-event schema 1.0, Qi survey mapping (#151)

The four mechanical alignments Vaara committed to in
modelcontextprotocol/modelcontextprotocol#2787 after the trust-surface
grouping was incorporated into the SEP draft on soup-oss commit
dd030d5b ship as the v2 envelope shape:

1. toolCalls lives under payloadDerived, not plannerDeclared. Tool
   bindings (name, server fingerprint, args commitment) are facts
   derived from the request payload, not planner declarations.
2. argsProjection serialises with a JSON-stringified projection field
   carrying the JCS-canonical encoding of the projection object. The
   digest is taken over those bytes.
3. The v1 kind-discriminated union is dropped. ArgsRef and
   ArgsProjection self-discriminate by which fields are present.
4. Commitment-only audit composes on ArgsProjection as a
   hash-only-identity projection of the form {"digest": "sha256:..."}.
   No separate ArgsDigest type ships in the spec.
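Items 2 and 4 above can be sketched together. The key names `projection` and `projectionDigest` follow the commit text, but the helper names are hypothetical and sorted-key JSON stands in for the JCS encoding.

```python
import hashlib
import json


def canonical(obj) -> bytes:
    # Stand-in for RFC 8785 (JCS); a real emitter would use a JCS library.
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def build_args_projection(projection: dict) -> dict:
    """Carry the projection as a JSON string and digest exactly those bytes."""
    encoded = canonical(projection)
    return {
        "projection": encoded.decode("utf-8"),
        "projectionDigest": "sha256:" + hashlib.sha256(encoded).hexdigest(),
    }


def hash_only_projection(args: dict) -> dict:
    """Commitment-only audit: the projection is just a digest of the real args."""
    args_digest = "sha256:" + hashlib.sha256(canonical(args)).hexdigest()
    return build_args_projection({"digest": args_digest})


redacted = build_args_projection({"path": "/tmp/report.csv"})
commitment_only = hash_only_projection({"path": "/tmp/report.csv", "token": "secret"})
```

Because the digest covers the string exactly as transmitted, a verifier can check it without re-canonicalizing, and the hash-only case needs no separate type.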

parse_attestation(d) is the new wire-decode entrypoint: inverse of
Attestation.to_dict(). 13 new tests cover emit -> JCS bytes -> parse
-> verify across HS256, ES256, RS256 for both ArgsRef and
ArgsProjection, plus parse rejection on missing-field and
unsupported-alg inputs and a byte-identical re-emit check.

Two doc artefacts ship in the same release:

- docs/audit_event_schema.md: AUDIT-EVENT-SCHEMA-1.0, versioned
  wire/storage contract for the audit events Vaara emits. Independent
  of code version so third-party consumers can pin without coupling
  to a Python runtime version.
- docs/qi_survey_mapping.md: Vaara surface coverage against the
  taxonomy in Qi et al., Towards Trustworthy Agentic AI
  (arXiv:2605.23989, 2026-05-17). Direct, partial, and out-of-scope
  rows by Perceive / Plan / Act / Reflect / Learn / Multi-agent /
  Long-horizon stage under both top-level dimensions.

SEP-2787 reference implementation tag sep2787-ref-v2 lands on this
release commit alongside v0.39.2 for cross-repo provenance. The v0.40
slot stays reserved for the deployment-shape scope (HTTP transport,
multi-tenancy schema, hot-reload extended, fan-out) per
project_v040_roadmap_opa_frame_20260527.md.

Co-authored-by: vaaraio <267591518+vaaraio@users.noreply.github.com>
@Rul1an

Rul1an commented May 27, 2026


This is converging nicely. One wording nit before this gets reviewed as the stable boundary: a few places still sound like the attestation proves execution, while the body later correctly defers execution acknowledgement/receipts.

In particular, the PR summary mentions "execution proof" / "whether it executed", and the Authorization section says attestation proves "that they called it." I think the tighter wording is that the attestation binds an observed tools/call request to issuer, subject, target, intent, nonce, time, and argument commitment/projection. Whether the tool executed, and what outcome occurred, stays in the deferred execution acknowledgement/receipt layer.

That keeps the current SEP crisp as pre-execution request attestation without weakening the future receipt story.

vaaraio added a commit to vaaraio/vaara that referenced this pull request May 27, 2026
One Vaara process now serves a fleet of upstream MCP servers over
Streamable HTTP, with multi-tenant policy, audit chain, and OVERT
attestation on the same substrate. v0.39 ran one Vaara process per
upstream; v0.40 collapses that into a single multi-tenant deployment.

Streamable HTTP transport on the proxy. `vaara-mcp-proxy --transport
http --http-host H --http-port P` runs POST /mcp via FastAPI / uvicorn.
The endpoint reads `X-Vaara-Tenant` and `X-Vaara-Upstream` per request,
pushes them into ContextVars, and dispatches into the existing
`_handle_request` path so policy, perimeter, OVERT, and
progress-notification handling all light up unchanged. Notifications
return 202. Bodies above 1 MiB return 413. Unknown upstream returns 404.
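
The status-code rules in this paragraph can be read as a small pre-dispatch check. The following is only a sketch under assumptions: `preflight_status` and `MAX_BODY_BYTES` are hypothetical names, not Vaara's API, and the 200 for ordinary requests is assumed rather than stated in the commit. It relies on the JSON-RPC 2.0 rule that a notification is a request object without an `id` member.

```python
import json

MAX_BODY_BYTES = 1 << 20  # 1 MiB limit taken from the commit message


def preflight_status(raw_body: bytes) -> int:
    """Pick the HTTP status before dispatch (hypothetical helper)."""
    if len(raw_body) > MAX_BODY_BYTES:
        return 413  # bodies above 1 MiB are rejected
    message = json.loads(raw_body)
    if "id" not in message:
        return 202  # JSON-RPC notification: accepted, no response body
    return 200  # assumed status for an ordinary request/response exchange
```

Checking the size before parsing keeps oversized payloads away from the JSON decoder.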

Fan-out via repeatable `--upstream NAME=CMD`. One Vaara process holds N
UpstreamMCPClient instances in a name -> client map. Bare `--upstream
CMD` keeps the v0.39 single-upstream contract (lands in the "default"
slot). When more than one upstream is configured, a request with no
`X-Vaara-Upstream` header returns 400 with the list of valid slots in
the error envelope. Single-upstream deployments keep the silent-default
contract.
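
A minimal sketch of the slot-resolution rule described above, assuming a plain dict stands in for the name -> client map. `resolve_upstream` and the error-payload keys are hypothetical and only illustrate the 400 / 404 / silent-default behaviour; they are not the proxy's actual code.

```python
from typing import Mapping, Optional, Tuple


def resolve_upstream(
    upstreams: Mapping[str, object], header: Optional[str]
) -> Tuple[int, object]:
    """Return (http_status, client_or_error_payload) for one request."""
    if header is None:
        if len(upstreams) == 1:
            # Single-upstream deployments keep the silent default.
            return 200, next(iter(upstreams.values()))
        # More than one upstream: the caller must name a slot.
        return 400, {
            "error": "missing X-Vaara-Upstream",  # hypothetical wording
            "valid_upstreams": sorted(upstreams),  # hypothetical key name
        }
    if header not in upstreams:
        return 404, {"error": f"unknown upstream {header!r}"}
    return 200, upstreams[header]
```

Returning the valid slot names in the 400 payload lets a misconfigured client correct itself without consulting the proxy configuration.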

tenant_id end-to-end. ScoreRequest, AuditEventRequest, PolicyReloadRequest
accept a `tenant_id` body field, with `X-Vaara-Tenant` as the HTTP-header
alternative (body wins over header). AuditRecord gains a `tenant_id`
field, excluded from `compute_hash()` so pre-v0.40 chains still
re-verify. AuditTrail keeps an `action_id -> tenant_id` map seeded by
`record_action_requested`, soft-capped at 50k entries.
SQLiteAuditBackend.write_record prefers per-record tenant. OVERT
envelopes carry `tenant_id` as a `non_content_metadata` claim.
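
Two of the rules above, body-over-header precedence and leaving `tenant_id` out of the hash, can be illustrated as follows. This is a sketch with assumed shapes: the `Record` dataclass, its fields other than `tenant_id`, and the JSON-plus-SHA-256 hashing are stand-ins, not the real `AuditRecord.compute_hash()`.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


def effective_tenant(body_tenant: Optional[str], header_tenant: Optional[str]) -> str:
    """Body field wins over the X-Vaara-Tenant header; "" means default here."""
    if body_tenant is not None:
        return body_tenant
    return header_tenant or ""


@dataclass
class Record:  # hypothetical stand-in for AuditRecord
    action_id: str
    payload: str
    prev_hash: str
    tenant_id: str = ""

    def compute_hash(self) -> str:
        fields = asdict(self)
        # tenant_id is excluded so records written before the field existed
        # still hash to the same value and old chains re-verify.
        fields.pop("tenant_id")
        blob = json.dumps(fields, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(blob.encode("utf-8")).hexdigest()
```

The trade-off is that the chain hash alone does not protect `tenant_id` against later edits.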

Per-tenant policy plane. `vaara.policy.registry.PolicyRegistry` holds
one PolicyController per tenant with the empty string slot reserved as
the default fallback. `vaara serve --policy-dir DIR` loads one YAML/JSON
policy per file (filename stem = tenant_id). `POST /v1/policy/reload`
routes per tenant via body field or header.
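
As a rough illustration of the filename-stem convention and the default fallback, the sketch below loads JSON policies only (YAML would need a third-party parser). `load_policy_dir`, `policy_for`, `demo`, and the policy keys are hypothetical; the real `PolicyRegistry` holds `PolicyController` objects rather than dicts.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Optional

DEFAULT_TENANT = ""  # the empty-string slot acts as the fallback


def load_policy_dir(policy_dir: Path, default_policy: dict) -> Dict[str, dict]:
    """One policy per file; the filename stem is the tenant_id."""
    registry: Dict[str, dict] = {DEFAULT_TENANT: default_policy}
    for path in sorted(policy_dir.glob("*.json")):
        registry[path.stem] = json.loads(path.read_text(encoding="utf-8"))
    return registry


def policy_for(registry: Dict[str, dict], tenant_id: Optional[str]) -> dict:
    """Unknown or missing tenants fall back to the default slot."""
    return registry.get(tenant_id or DEFAULT_TENANT, registry[DEFAULT_TENANT])


def demo() -> Dict[str, dict]:
    # Writes one tenant policy to a temporary directory and loads it back.
    with tempfile.TemporaryDirectory() as tmp:
        policy = json.dumps({"threshold_deny": 0.6})
        (Path(tmp) / "acme.json").write_text(policy, encoding="utf-8")
        return load_policy_dir(Path(tmp), {"threshold_deny": 0.9})
```

Here `policy_for(demo(), "acme")` yields the tenant file's policy, while any other tenant name yields the default.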

Installs `vaara-mcp-proxy` as a top-level console script so the proxy
CLI matches what every v0.39+ docs surface advertises. Earlier releases
only shipped the proxy as `python -m vaara.integrations.mcp_proxy`;
v0.40 closes that gap. v0.41 will fold the proxy into the main `vaara`
verb tree (`vaara mcp-proxy ...`) and keep `vaara-mcp-proxy` as a thin
alias for one release cycle.

Per-tenant threshold dispatch at evaluate-time. `AdaptiveScorer.evaluate`
consults the registry on every call. A new `policy_lookup` constructor
arg (and `set_policy_lookup` for late binding from ServerState) lets
the scorer ask which tenant policy applies right now and use its
allow/deny thresholds for THIS evaluation. Unknown tenant or no lookup
configured falls back to the scorer-bound defaults that the default-slot
listener keeps fresh on reload. The backend decision dict surfaces the
applied threshold_allow and threshold_deny so operators can confirm
which tenant's policy ran. MWU expert state, the conformal calibrator,
agent profiles, and sequence patterns stay shared across tenants; only
threshold application is per-tenant in v0.40.
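
The evaluate-time dispatch can be sketched as below. Only `policy_lookup`, `set_policy_lookup`, `threshold_allow`, and `threshold_deny` come from the commit message; the `Scorer` class, the numeric risk input, and the allow / escalate / deny banding are assumptions made for illustration, and the shared learning state is omitted.

```python
from typing import Callable, Dict, Optional, Tuple

Thresholds = Tuple[float, float]  # (threshold_allow, threshold_deny)
PolicyLookup = Callable[[str], Optional[Thresholds]]


class Scorer:  # hypothetical stand-in for AdaptiveScorer
    def __init__(self, default: Thresholds,
                 policy_lookup: Optional[PolicyLookup] = None) -> None:
        self._default = default
        self._policy_lookup = policy_lookup

    def set_policy_lookup(self, policy_lookup: PolicyLookup) -> None:
        # Late binding, e.g. once the server state owns a registry.
        self._policy_lookup = policy_lookup

    def evaluate(self, risk: float, tenant_id: str = "") -> Dict[str, object]:
        found = self._policy_lookup(tenant_id) if self._policy_lookup else None
        # Unknown tenant or no lookup configured: scorer-bound defaults.
        allow, deny = found if found is not None else self._default
        if risk < allow:
            decision = "allow"
        elif risk >= deny:
            decision = "deny"
        else:
            decision = "escalate"  # assumed middle band
        # Surface the applied thresholds so operators can see which policy ran.
        return {"decision": decision,
                "threshold_allow": allow,
                "threshold_deny": deny}
```

Because the lookup runs on every call, a policy reload takes effect on the next evaluation without rebuilding the scorer.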

Scope notes. HTTP transport is POST-only (GET-SSE is v0.41). Per-tenant
policy reload is hot; classifier hot-reload still restart-only.
Cancellation routing across fan-out is v0.41 hardening. Fan-out latency
bench is v0.40.1 measurement.

862 passed, 12 skipped. 45 new tests across tests/test_v040_tenant.py,
tests/test_v040_policy_registry.py, tests/test_v040_mcp_http_transport.py,
tests/test_v040_per_tenant_threshold.py.

References modelcontextprotocol/modelcontextprotocol#2787 for the
SEP-2787 envelope shape v0.40 builds on top of.

Co-authored-by: vaaraio <267591518+vaaraio@users.noreply.github.com>
@vaaraio

vaaraio commented May 29, 2026

@Rul1an the request/execution boundary you're drawing is right, and it's why Vaara v0.42.0 ships the complement alongside the attestation impl.

The execution receipt takes the attestation wire bytes as its backLink input (via attestationDigest over the full wire bytes + nonce) and records what actually executed. Envelope is three blocks: backLink (binds the receipt to the specific attestation it follows), receiptAsserted (issuer block, same signing surface as the attestation), outcomeDerived (status executed/refused/errored + completedAt + optional result commitment). No TTL: durable record, not a capability.

Reference implementation at vaaraio/vaara@4608c36, docs at docs/execution-receipts.md, v0 conformance vectors in tests/vectors/execution_receipt_v0/ with a stdlib-only independent verifier. Reuses RFC 8785 JCS + HS256/ES256/RS256 from SEP-2787 unchanged; a 2787 verifier needs no new crypto to verify a receipt.
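
For readers who want a feel for the three-block shape, here is a stdlib-only HS256 sketch. It is not the Vaara format: the `iss`, `alg`, and `signature` field names are hypothetical, the digest is assumed to be SHA-256 over the wire bytes concatenated with the nonce, and `canonical` only approximates RFC 8785 JCS for payloads made of strings and nested objects.

```python
import base64
import hashlib
import hmac
import json


def canonical(obj: dict) -> bytes:
    # Approximation of JCS: sorted keys, no whitespace. Not a full RFC 8785
    # implementation (number formatting and key ordering edge cases differ).
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def attestation_digest(wire: bytes, nonce: str) -> str:
    # Assumed construction: SHA-256 over wire bytes followed by the nonce.
    return hashlib.sha256(wire + nonce.encode("utf-8")).hexdigest()


def _sign(body: dict, key: bytes) -> str:
    mac = hmac.new(key, canonical(body), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(mac).decode("ascii")


def make_receipt(wire: bytes, nonce: str, issuer: str, status: str,
                 completed_at: str, key: bytes) -> dict:
    body = {
        "backLink": {"attestationDigest": attestation_digest(wire, nonce)},
        "receiptAsserted": {"iss": issuer, "alg": "HS256"},
        "outcomeDerived": {"status": status, "completedAt": completed_at},
    }
    return {**body, "signature": _sign(body, key)}


def verify_receipt(receipt: dict, wire: bytes, nonce: str, key: bytes) -> bool:
    body = {k: v for k, v in receipt.items() if k != "signature"}
    given = str(receipt.get("signature", "")).encode("utf-8")
    sig_ok = hmac.compare_digest(_sign(body, key).encode("ascii"), given)
    claimed = str(body.get("backLink", {}).get("attestationDigest", ""))
    expected = attestation_digest(wire, nonce)
    link_ok = hmac.compare_digest(claimed.encode("utf-8"),
                                  expected.encode("ascii"))
    return sig_ok and link_ok
```

Verification checks both the signature over the receipt body and the back-link against the attestation bytes the verifier holds, so a valid receipt cannot be re-attached to a different attestation.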

@Rul1an

Rul1an commented May 30, 2026

Helpful prior-art and benchmark context from both sides.

The main thing worth preserving is that SEP-2787 v1 stays narrowly about request attestation.

For me, that means:

  • field-level trust surface first: what is being attested, by whom, and with what canonicalization/binding guarantees
  • execution/outcome evidence stays out of scope for v1: no ack, no post-execution rejection, no runtime effects, no broader execution-context drift in the same shape

Bench data can help prioritize. Prior art can help expose tradeoffs. But neither should quietly widen the primitive itself.

A narrow v1 does not need to solve every threat class. It just needs to be explicit about what it proves, and equally explicit about what it does not.

@soup-oss

Author

We strongly align with @Rul1an's framing that v1 must stay thin and foundational. Rather than expanding the envelope scope now, we'd like to flag three design considerations that we believe the ACK extension, when it comes, should address:

  1. Pre-flight readiness: The client needs a way to confirm the ACK endpoint is reachable, trusted, and accepting receipts before the tool executes. The exact mechanism (capability ping, session-level negotiation, per-call probe) should be left to implementation; the important thing is that the protocol acknowledges this as a requirement, not that it prescribes the answer.

  2. Verifier key binding: The ACK payload should be verifiable by the entity controlling ackEndpoint without requiring the original signing key. How a deployment achieves this (shared key, derived encryption, asymmetric wrapping) is out of scope, but the ACK field should define a wrappedKey or encryptedPayload slot so that the receipt can be bound to a key the endpoint controls, keeping the attestation's intent and arguments opaque to the ACK infrastructure.

  3. Infrastructure discovery: If ackEndpoint is in the envelope plaintext, intermediaries learn the receipt topology. A nonce-derived capability URL or encrypted endpoint address prevents infrastructure discovery without adding per-call key exchange.

@vaaraio

vaaraio commented May 30, 2026

Agree v1 should stay thin and request-scoped. The current envelope already carries what a verifier needs, and widening it now would slow the part that's ready to land.

On the ACK considerations, there's a shipped reference worth pulling from when that work opens. Vaara emits an execution receipt as the post-execution sibling of the 2787 attestation, signed over the same RFC 8785 JCS canonical bytes (ES256/RS256/HS256), with an independent verifier and published test vectors in the repo. On point 2 specifically: asymmetric receipts are already verifiable by whoever holds the public key, without the original signing key, so a wrappedKey/encryptedPayload slot is only needed for the confidentiality case, not for verification itself. I can bring the concrete format and verifier when the ACK extension is taken up.

@vaaraio

vaaraio commented May 30, 2026

On the conformance-vector thread: the receipt-verification failure modes coming up here are mostly already normative cases, and they sit in the SEP's own test-vectors tree where a verifier can run them with nothing external in the loop.

In #2789 the SEP-2787 v0 vectors carry the negatives directly: 07-tampered-planner-declared and 08-tampered-issuer-asserted for signature and field tamper, and 09-ieee754-float-in-canonical-input for a non-canonical JCS payload, alongside the HS256/ES256/RS256 positives and a stdlib-only _check_independent.py walker. The receipt layer carries its own negatives in the reference vectors: a broken back-link and a result-commitment mismatch, checked by the same independent walker.

Two of the modes raised here aren't cased yet: an unknown alg identifier, and a replay that substitutes a payload field rather than the verifier time. Both are small. I'll add the alg case to the #2789 set and the replay-substitution case to the receipt vectors, so the negative coverage for this surface is complete and lives in one place implementers already pull from.

Keeping the conformance artifact in the test-vectors tree is the same instinct as keeping v1 thin: an implementer should be able to check an envelope against fixtures in the repo, offline, without taking a dependency on any running service.
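
The unknown-alg negative mentioned above is small enough to sketch offline. This is an assumption-laden illustration, not the #2789 walker: `check_alg`, `UnsupportedAlgorithm`, and the top-level `alg` key are hypothetical, and only the three algorithm names come from the thread.

```python
SUPPORTED_ALGS = frozenset({"HS256", "ES256", "RS256"})


class UnsupportedAlgorithm(ValueError):
    """Raised when an envelope names an alg outside the supported set."""


def check_alg(envelope: dict) -> str:
    alg = envelope.get("alg")  # assumed location of the identifier
    # Type check first: a non-string value must be rejected, not crash.
    if not isinstance(alg, str) or alg not in SUPPORTED_ALGS:
        raise UnsupportedAlgorithm(f"unsupported alg: {alg!r}")
    return alg
```

Matching is exact and case-sensitive, so `none`, `hs256`, or a missing field are all rejected before any key material is touched.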

Labels

proposal SEP proposal without a sponsor. SEP
