@SamMorrowDrums

SEP: Trust and Sensitivity Annotations

Summary

This SEP proposes trust and sensitivity annotations for MCP requests and responses, enabling clients and servers to track, propagate, and enforce trust boundaries on data as it flows through tool invocations.

Motivation

As MCP adoption grows, data flows across tool boundaries without standardized trust metadata. This creates security gaps:

  1. Indirect Prompt Injection: Data from untrusted sources enters context without markers
  2. Data Exfiltration: Sensitive information can be passed to external destinations without policy enforcement
  3. Cross-Organization Boundaries: No way to mark internal vs. external data

Key Features

Annotations

  • sensitiveHint: Granular sensitivity levels (low, medium, high)
  • privateHint: Marks internal/private data
  • openWorldHint: Indicates untrusted/external data sources
  • maliciousActivityHint: Signals detected suspicious patterns
  • attribution: Provenance tracking for audit trails
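As a rough illustration of how these fields might be modelled, here is a minimal Python sketch. The field names come from the list above; the `TrustAnnotations` and `Sensitivity` names, the types, the defaults, and the `to_wire` helper are assumptions made for illustration, not the schema defined by this SEP.

```python
from dataclasses import dataclass, field, asdict
from enum import IntEnum
from typing import List, Optional


class Sensitivity(IntEnum):
    """Ordered levels, so comparisons such as HIGH > LOW are meaningful."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class TrustAnnotations:
    # Hypothetical container: field names follow the list above,
    # but the types and defaults are assumptions.
    sensitiveHint: Optional[Sensitivity] = None  # None means "not classified"
    privateHint: bool = False
    openWorldHint: bool = False
    maliciousActivityHint: bool = False
    attribution: List[str] = field(default_factory=list)

    def to_wire(self) -> dict:
        """Serialize to a flat dict, writing the level as a lowercase name."""
        data = asdict(self)
        if self.sensitiveHint is not None:
            data["sensitiveHint"] = self.sensitiveHint.name.lower()
        return data
```

Using an ordered enum for the sensitivity level is one possible design choice; it makes "higher than" comparisons explicit instead of relying on string comparison.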

Propagation Rules

  • Sensitivity escalates (never decreases) within an agent session
  • Boolean hints use union semantics (once true, stays true)
  • Attribution accumulates across context boundaries
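The three rules above can be sketched as a single merge step that a client might run each time a tool result enters the session. This is a minimal sketch under stated assumptions: the `propagate` function, the plain-dict representation, the `low < medium < high` ordering, and the de-duplication of attribution entries are all hypothetical choices, not behavior specified by this SEP.

```python
from typing import Any, Dict, List

# Assumed ordering of the sensitivity levels named in this SEP.
_LEVELS = ["low", "medium", "high"]
_BOOLEAN_HINTS = ("privateHint", "openWorldHint", "maliciousActivityHint")


def propagate(session: Dict[str, Any], result: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a result's annotations into the session's, returning a new dict."""
    merged: Dict[str, Any] = {}

    # Sensitivity escalates: keep the higher of the two levels, if any is set.
    levels = [
        level
        for level in (session.get("sensitiveHint"), result.get("sensitiveHint"))
        if level is not None
    ]
    if levels:
        merged["sensitiveHint"] = max(levels, key=_LEVELS.index)

    # Boolean hints use union semantics: once true, stays true.
    for hint in _BOOLEAN_HINTS:
        merged[hint] = bool(session.get(hint)) or bool(result.get(hint))

    # Attribution accumulates in order; dropping duplicates is an assumption.
    sources: List[str] = []
    for source in list(session.get("attribution", [])) + list(result.get("attribution", [])):
        if source not in sources:
            sources.append(source)
    merged["attribution"] = sources
    return merged
```

For example, merging a session annotated `{"sensitiveHint": "high", "attribution": ["files"]}` with a result annotated `{"sensitiveHint": "low", "openWorldHint": True, "attribution": ["web"]}` keeps `sensitiveHint` at `"high"`, sets `openWorldHint` to `True`, and yields the attribution list `["files", "web"]`.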

Integration Points

Related Work

Open Questions

  1. Label namespaces for organization-specific classifications
  2. Declassification mechanisms
  3. Cross-server annotation sharing

Closes #711

/cc @dend (sponsor)

SamMorrowDrums and others added 5 commits November 21, 2025 14:29
Introduces trust and sensitivity annotations for MCP requests and responses,
enabling clients and servers to track, propagate, and enforce trust boundaries
on data as it flows through tool invocations.

Key features:
- Result annotations: sensitiveHint, privateHint, openWorldHint, maliciousActivityHint, attribution
- Request annotations for propagating trust context
- Propagation rules ensuring sensitivity markers persist across agent sessions
- Integration with Tool Resolution (modelcontextprotocol#1862) for pre-execution annotations
- Per-item annotations for mixed results (e.g., search results)
- Defense-in-depth approach complementing tool-level annotations

Closes modelcontextprotocol#711
… type

- Extend existing ToolAnnotations with trust fields (privateHint, sensitiveHint, etc.)
- Leverage existing openWorldHint with refined meaning per context
- Remove per-item annotations (response-level aggregation only)
- Remove _meta nesting - trust annotations live in flat annotations field
- Add Alternative 1 explaining why separate type was rejected
- Update Tool Resolution integration to use flat annotations
@dsp-ant changed the title from "SEP: Trust and Sensitivity Annotations" to "SEP-1913: Trust and Sensitivity Annotations" on Dec 3, 2025
@localden self-assigned this on Dec 4, 2025
Note over Web MCP: Detects prompt injection<br/>in page content
Web MCP-->>Client: Result (maliciousActivityHint: true,<br/>openWorldHint: true)

Client->>User: ⚠️ Warning: Potential malicious content detected

This comment was marked as resolved.

Nit: Should we call them MCP Server (FILE) and MCP Server (HTTP)?

Although it is kind of implied, hence the nit.

User->>Client: "Summarize this webpage"
Client->>Web MCP: tools/call (fetch URL)

Note over Web MCP: Detects prompt injection<br/>in page content
@realArcherL Dec 11, 2025

Should we also highlight that this is the best opportunity for servers to apply preventative measures against indirect prompt injection (e.g., Spotlighting, Prompt Sandwich)?

For example, the server applies Spotlighting and marks the data, along with an additional instruction (reference).

Or do we want clients to deal with it, since the real prompt injection attack begins with the LLM?

Development

Successfully merging this pull request may close these issues.

[SPEC] Annotations for MCP Requests and Responses (security/privacy)
