@nbarbettini
Contributor

@nbarbettini nbarbettini commented May 6, 2025

Introduces a new client capability that servers can use to trigger an interaction with the end-user.

There are some important use cases that require the MCP server to interact with the end-user in a secure way:

  • Authorization and step-up auth
  • Gathering sensitive data
  • Payments

These interactions are highly sensitive in nature, and we can take inspiration from how OAuth/OIDC solved these problems on the web. The interaction type proposed here is type="ua", which requires the MCP client to obtain consent from the user and navigate to a URL in a user-agent (aka browser), where the sensitive operation can occur securely.

Motivation and Context

One of the hot topics discussed but not addressed in #284 was the idea of fine-grained authorization for specific tools. @wdawson and I previously discussed this idea in #234 as well.

For example, the community identified scenarios involving "downstream" tools and resources that need authorization, but MCP authorization is about MCP client->MCP server authorization, not how "downstream" authorization would be handled in an MCP server that talks to a third-party API or resource server.

Kudos to @siwachabhi for the idea to model this as a client capability. This proposal is distinct from and complementary to the elicitation proposal (#382): it describes an interaction that takes place outside of the MCP client, whereas elicitation describes an interaction that takes place inside the MCP client.

Examples

Downstream authorization for tools

A big question in #284 was downstream authorization, or "tool authorization", meaning this scenario:

  • As an MCP server developer, I am building an MCP server that interacts with a third-party API.
  • Let's say I'm building the Nate's Awesome Google Tools server which exposes tools that interact with Google's REST API. Let's also say the server has these tools: search_email, send_email, trash_email
  • MCP authorization secures the connection between MCP clients and my MCP server. But since the API is a third-party API, getting a token for Nate's Awesome Google Tools doesn't mean that the user has also authorized scopes for Google's API.
    • My server could use the client's initial authorization flow to send the user on a "side quest" through Google's authorization flow. But doing that would require the user to approve all scopes up front; there is no way to approve just the scope to read my mail and later decide if I want to approve sending mail.
  • When an agent decides to execute a tool like search_email, it would be desirable to be able to ask the user "just in time" to approve a scope from Google (https://www.googleapis.com/auth/gmail.readonly)

Here is how this scenario would work with this proposal:

sequenceDiagram
    participant U as End-User
    participant B as User-Agent (Browser)
    participant AS as 3rd-party Authorization Server
    participant C as MCP Client
    participant S as MCP Server
    

    C->>S: Call tool send_email
    Note over S: Server determines user is not yet authorized
    S->>C: interaction/create type=ua<br>with OAuth 2.1 authorize URL<br>scope=https://www.googleapis...
    C-->>U: Present consent to open URL
    U-->>C: Provide consent
    C-->>B: Open URL
    B-->>AS: Navigate to URL
    C->>S: Send response ("Ack")
    Note over U,AS: Perform authorization<br>(out of band)
    AS-->>S: OAuth callback
    S->>AS: OAuth 2.1 token exchange

    Note over S: Server now has a valid 3rd-party token
    B-->>S: Perform interaction
    Note over S: Continue operations

By adding a path for the server to instruct the MCP client to send the end-user to a server-defined URI, we gain the ability to perform this "downstream" authorization. If we zoom out a bit, this doesn't just apply to servers acting as OAuth clients - directing the end-user to a URL also unlocks the ability to gather any kind of sensitive data that shouldn't pass through the MCP client (or an intermediary).
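As a rough illustration of the first server-to-client message in the diagram, the sketch below builds an `interaction/create` request carrying a `ua` interaction. Only the method name, the `ua` type, and a `url` field are mentioned in this thread; the `params.interaction` nesting, the optional `message` field, and the helper name `buildUaInteractionRequest` are assumptions made for the example, not the schema from the PR.

```typescript
// Hypothetical shapes: only "interaction/create", type "ua", and `url`
// come from the discussion. Everything else is assumed for illustration.
interface UAInteraction {
  type: "ua";
  url: string; // URL the client opens in a user-agent after obtaining consent
  message?: string; // assumed: human-readable reason shown in the consent prompt
}

interface InteractionCreateRequest {
  jsonrpc: "2.0";
  id: number;
  method: "interaction/create";
  params: { interaction: UAInteraction };
}

function buildUaInteractionRequest(
  id: number,
  url: string,
  message?: string,
): InteractionCreateRequest {
  // Parsing up front rejects malformed URLs before anything reaches the client.
  const parsed = new URL(url);
  return {
    jsonrpc: "2.0",
    id,
    method: "interaction/create",
    params: {
      interaction: {
        type: "ua",
        url: parsed.toString(),
        // Only include `message` when the caller supplied one.
        ...(message !== undefined ? { message } : {}),
      },
    },
  };
}
```

In this sketch the URL itself carries nothing secret, since the MCP client can see it; anything sensitive happens only after the user-agent has loaded the page.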

FAQ

Redirecting to a URL is great for doing OAuth, but what about servers that need the user to provide API keys, connection strings, etc?
These should not be exposed to the MCP client either! Redirecting to a URL (the type="ua" pattern proposed here) also works for gathering any sensitive information. For example, the MCP server itself can host a public page containing an HTML form that asks the user to enter a required API key -- all under the control of the MCP server, and without that sensitive information passing through the client.

Why is it such a big deal if sensitive information passes through the client?
For the same reason that you shouldn't give Yelp your Gmail password (to use a classic example). The MCP client's job is to communicate with the MCP server only, even though the MCP server might also be communicating with other APIs, resource servers, etc. on behalf of the user. There are important security boundaries at play:

  1. The MCP client: Responsible for interacting with the end-user and the MCP server. In OAuth terminology, it is a public client (not a confidential client) because its code cannot be trusted -- after all, it is running in a browser or as a desktop app and anyone can view or modify its source code.
  2. The MCP server: An OAuth resource server, responsible for validating authorization. Its code can be trusted (because it runs on a server!) and therefore it can store sensitive information.
  3. Sometimes, a "downstream" API or resource server: In some scenarios, the MCP server acts as a resource server to the MCP client and also an OAuth client to a "downstream" resource server. That downstream server may have its own credential needs (API keys, OAuth tokens). These must stay within boundaries (2) and (3). Otherwise, the MCP client can become a confused deputy that is able to perform unintended actions in the "downstream" resource server.

How Has This Been Tested?

TODO: Sample app showing both the client and server side of user interactions.
Coming soon -- wanted to get the doc up to start the discussion, and will follow up shortly with code.

Breaking Changes

None

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

TODO:

  • Spec proposal draft
  • Finish updating schema.ts
  • Code sample (client and server)

@aaronpk
Contributor

aaronpk commented May 7, 2025

I'm not sure I understand how this works. Could you go through a step by step example of the requests and responses that we would see with this?

@patwhite
Contributor

patwhite commented May 7, 2025

Doesn't the new authorization work that was proposed handle this with headers? I might have misunderstood the original proposal, but I thought you could 401 with a header that pointed you to an auth server to auth with, but I might be mis-remembering

@wdawson
Contributor

wdawson commented May 7, 2025

Doesn't the new authorization work that was proposed handle this with headers? I might have misunderstood the original proposal, but I thought you could 401 with a header that pointed you to an auth server to auth with, but I might be mis-remembering

@patwhite that is for when authorization is needed for the client to communicate (at all) to the server. This proposal is for any situation when a server would want to interact with the user. For example:

  • A productivity MCP server might request a user to authorize a third-party service to
    access their documents. <-- This is different than the authorization required for the MCP client to talk to the MCP server
  • A news MCP server might request a user to upgrade their subscription to access more features.
  • A banking MCP server might request a user to verify their account to access a new feature.
  • A media MCP server might request a user's favorite music genre to personalize the user
    experience.
  • A social media MCP server might request an image to use as a profile picture.

@wdawson
Contributor

wdawson commented May 7, 2025

I'm not sure I understand how this works. Could you go through a step by step example of the requests and responses that we would see with this?

@aaronpk the Flow Diagram sections of the spec show the MCP protocol level requests at a high level. You'll notice the "Human interaction" note, which could be anything, similar to the "user authenticates" part of the OAuth authorization code flow. In this case, the "human interaction" might be an OAuth flow that redirects to the MCP server, granting it a token to call some OAuth-protected API downstream. Or it could present a form asking for an API key. Or it could present a Stripe payment portal requiring a subscription upgrade. The main point here is that the MCP spec doesn't need to care what the interaction is.

Does that help? Are you looking for a flow diagram for a specific use-case to ground understanding with an example?

@patwhite
Contributor

patwhite commented May 7, 2025

Doesn't the new authorization work that was proposed handle this with headers? I might have misunderstood the original proposal, but I thought you could 401 with a header that pointed you to an auth server to auth with, but I might be mis-remembering

@patwhite that is for when authorization is needed for the client to communicate (at all) to the server. This proposal is for any situation when a server would want to interact with the user. For example:

OK, got it - I think this is a cool idea, and it solves some issues in the protocol in general. But I do worry about conflating generalized interactions with security-specific interactions. For generalized interactions, I think it would be great to actually flesh this out even further, and add some additional details around multi-turn interactions, etc.

But, for a security demand in particular, this PR could literally just be a single new error type (401 equivalent) that includes enough information to craft an oauth redirect. That could be a lot easier to generally get through the review process here, and I think it would also be a lot easier to get clients to implement it, cause it would be using mechanisms which already exist, and you could basically make it more or less required for the vNext spec.

@nbarbettini
Contributor Author

But, for a security demand in particular, this PR could literally just be a single new error type (401 equivalent) that includes enough information to craft an oauth redirect. That could be a lot easier to generally get through the review process here, and I think it would also be a lot easier to get clients to implement it, cause it would be using mechanisms which already exist, and you could basically make it more or less required for the vNext spec.

This is a good point @patwhite. That is exactly what we've defined under "Requiring Interaction as an Error Response".
I'd be happy to split just that part out into a small, focused PR. But, it depends on many of the ideas in the rest of the proposal, so it'd need to bring at least the ua interaction type along with it.

@nbarbettini nbarbettini force-pushed the feature/user-interaction branch 2 times, most recently from 849c523 to 8cdd61a Compare May 8, 2025 19:10
@nbarbettini nbarbettini force-pushed the feature/user-interaction branch from 8cdd61a to a099db4 Compare May 8, 2025 19:31
@nbarbettini nbarbettini changed the title [feat] Add User Interaction as a client capability feat: Add User Interaction as a client capability May 8, 2025
@nbarbettini
Contributor Author

FYI, I slimmed down this PR (thanks for the nudge from @patwhite) to focus just on the necessary interaction to unblock authorization. The language is written with an eye to extending the user interaction concept in the future, if desired.

I also added more flow diagram examples, let me know if it feels clearer now @aaronpk!

@adranwit

adranwit commented May 9, 2025

Thanks for drafting this proposal. I have two follow-up questions:

  1. Interaction monitoring & callback mechanics
    Could you clarify the expected completion patterns (server-hosted callback) and what data, if any, the client must relay back to the server?

  2. Error handling & corner cases
    Will the spec define standard result / status values (e.g., completed, cancelled, timeout, error)?
    How should user-initiated aborts (for example, an OAuth error=access_denied) be surfaced?

@nbarbettini
Contributor Author

@adranwit thanks for taking a look! Answers inline:

Could you clarify the expected completion patterns (server-hosted callback) and what data, if any, the client must relay back to the server?

The MCP client doesn't need to relay anything back to the server. The client's only responsibility is defined in "Interaction Types > User Agent Interaction":

The MCP client MUST facilitate the opening of the URL in a User Agent.

That's it! The rest of the interaction is controlled by the MCP server, and purposefully kept out of view of the MCP client. That maintains the correct security boundary around the MCP server.

Of course, in an interaction like a downstream OAuth authorization, the MCP server will indeed host a callback to complete the authorization flow. But again, that's not in view of the MCP client.

Will the spec define standard result / status values (e.g., completed, cancelled, timeout, error)?

I think a progress mechanism that gives the client updates on the status of the interaction would be a good addition. I've left it out of this proposal (for now) to keep this focused specifically on the absolute must-haves to unblock use cases like downstream tool authorization. Progress could be added as a fast-followup, or now if the community feels it is a must-have.

How should user-initiated aborts (for example, an OAuth error=access_denied) be surfaced?

The end-user cancelling or denying the interaction at the outset is addressed in "Client-Initiated Cancellation".

But you raise a different, interesting scenario: the user proceeds partway through an authorization flow (for example), and then rejects it in the downstream system. This is still out of view of the MCP client by definition, unless the MCP server chooses to indicate to the client that the entire interaction should be considered canceled. In that case, the cancellation notification described in "Server-Initiated Cancellation" is appropriate.

@nbarbettini
Contributor Author

Added a concrete example (plus flow diagram) above of the tool authorization scenario, by far the most requested version of this idea!

@adranwit

Thanks for adding the concrete example and flow diagram @nbarbettini —super helpful!

The sequence reads like a classic variation of Backend-for-Frontend (BFF) OAuth flow:

  1. The MCP server generates the authorization URI.
  2. The MCP client launches the user-agent with the authorization URI.
  3. The third-party authorization server redirects back to the MCP server.
  4. The server redeems the code/token.
    Correlation is already handled by the OAuth state parameter.
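To make step 1 and the `state` correlation concrete, here is a minimal sketch of how an MCP server might build the authorization URI using Node's built-in `crypto` module. The function name `startDownstreamAuthorization`, the option names, and the endpoint values in the usage note are hypothetical; the query parameters are the standard OAuth authorization code and PKCE (S256) parameters.

```typescript
import { createHash, randomBytes } from "node:crypto";

// What the server would keep (keyed by `state`) until the OAuth callback arrives.
interface PendingAuthorization {
  state: string;
  codeVerifier: string;
  authorizationUrl: string;
}

// Hypothetical helper: builds the URL the MCP client is asked to open.
function startDownstreamAuthorization(opts: {
  authorizationEndpoint: string;
  clientId: string;
  redirectUri: string;
  scope: string;
}): PendingAuthorization {
  // Unguessable value that ties the later callback to this request.
  const state = randomBytes(16).toString("base64url");
  // PKCE: 32 random bytes encode to a 43-character verifier.
  const codeVerifier = randomBytes(32).toString("base64url");
  const codeChallenge = createHash("sha256").update(codeVerifier).digest("base64url");

  const url = new URL(opts.authorizationEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", opts.clientId);
  url.searchParams.set("redirect_uri", opts.redirectUri);
  url.searchParams.set("scope", opts.scope);
  url.searchParams.set("state", state);
  url.searchParams.set("code_challenge", codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");

  return { state, codeVerifier, authorizationUrl: url.toString() };
}
```

On the callback, the server would look up the pending entry by the returned `state` and send `codeVerifier` with the token exchange; the MCP client never sees the verifier, the code, or the resulting token.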

All that said, applicability is far beyond OAuth flows, especially when the MCP client is effectively out-of-band for the secure exchange.

Idea: instead of layering another “correlationId”-style value on top of state or another per-interaction key, should the MCP spec allow an explicit interaction-completed callback (POST) on the MCP server?

  • The server already knows the redirect URI, session ID, and interaction request ID; it could include the session ID/interaction request ID, plus the payload it needs (e.g., auth_code), in that callback.
  • The MCP server would just wait for the callback (or a timeout), avoiding the need to pass the auth code back in the URL via a regular GET redirect.
    I’ve always been wary of auth codes or other secrets leaking into logs (not a real issue with PKCE, though); for these cases, I've experimented with a localhost-redirect BFF variant (discussion #483).

Would love the team’s thoughts on whether a POST-back (or similar out-of-band signal) callback could simplify the flow.

Thanks again—this example really clarifies the proposal!

@chhamilton

chhamilton commented Jun 5, 2025

@wdawson

Regardless of whether or not you want to specifically build in-band E2EE primitives, I think there are use cases (anti-abuse, access policy, monitoring, etc) that are going to want the chain of actors to be auditable and provable. So you're going to want cryptographic identity as part of the MCP spec (and any other related specs, like A2A), so that a service E can know it's being called from User A > Agent B > Agent C > Service D.

Gotcha. I want to decouple this PR from any details about chaining MCP servers and how that needs to work securely. I was entertaining the discussion earlier so as to illustrate that it's possible and how I could see things working in the current state. But I agree there are many details to work out as far as verifiable identity beyond each server being a different OAuth RS (and/or client in the chaining case). Are you good with pulling that discussion out of this PR? Or is there a critical piece I'm missing as to why it's a prerequisite to user interaction?

Absolutely fine to move that conversation elsewhere, and happy to refocus this conversation on the User Interaction primitives. I only brought it up because we started talking about in-band E2EE, impersonation, MITM, etc. Can you suggest a good venue for secure chained agent/service identity mechanisms? I see it best fitting in with MCP Initialize, I suppose?

@wdawson
Contributor

wdawson commented Jun 5, 2025

@chhamilton

Absolutely fine to move that conversation elsewhere, and happy to refocus this conversation on the User Interaction primitives. I only brought it up because we started talking about in-band E2EE, impersonation, MITM, etc. Can you suggest a good venue for secure chained agent/service identity mechanisms? I see it best fitting in with MCP Initialize, I suppose?

I did a quick search in the Discussions in this repository and found this one that might be a better fit from @ggoodman , but you could also start a new Discussion if you'd like.

@wdawson
Contributor

wdawson commented Jun 5, 2025

@LucaButBoring

Given that impersonation is discussed in the spec, I think it is important to discuss seriously here, or at least to amend the proposed spec to clearly outline what is currently in- and out-of-scope regarding impersonation for this proposal, and/or to go into more detail on the recommended mitigations for it in the security considerations section of the spec.

Definitely fair. I think we'll need to disambiguate "impersonation" and whether that's the user level or MCP client level. Thanks for your comments here. We already had on our list to clean up that section of the spec and this will help I think!

I'm not sure what you mean here. An MCP server has no reason to return a pre-signed URL as the interaction URL. The interaction URL is meant, effectively, to be a challenge to a user, not a credential. We can clarify that in the spec.
But if you mean that an MCP server might issue an interaction request and then ask for a pre-signed URL to get access to some data (in an S3 bucket for example), that's fine. But there are MITM issues with receiving data in this way, as I think you indicated in your 3rd point. OAuth has mitigated many of the problems that arise out of essentially sharing your password (or one time token) with another service.

I was actually referring to the former case - user interactions come across as general enough that sending a pre-signed interaction URL seems like a conceivable use case (maybe exposing a non-anonymous, non-OAuth-guarded form or something that I need temporary access to but don't want intermediaries to intercept). If that's explicitly not desired and this isn't intended to be general-purpose, secure out-of-band communication, that's fine but should be specified, or the assumptions under which it is secure should be stated explicitly in a separate section.

Definitely will make this clearer! The intention for interaction URLs is to provide a mechanism by which the MCP server can communicate directly with the user. That URL will be visible to the MCP client and therefore MUST NOT contain any sensitive or secure information directly. Again, we don't want to be prescriptive about the interaction itself here, but you could imagine that once the user agent loads that URL, the server could reveal the pre-signed URL to that user. There might be some level of authentication (e.g., checking a session) before that happens if desired as well.

The problem goes both ways, at any rate - if the interaction URL itself is delivered through a proxy, intermediaries can override it; and if the end-user fills data into that proxied interaction URL, that's vulnerable to being exfiltrated by an intermediary that has overridden the proxied interaction URL. In the auth challenge scenario where GT issues a challenge to request authorization to do more than the token from PA->GT allows, that is vulnerable to being intercepted by PA to instead grant PA permission to perform that same action.

I think this is really just going back to the unaddressed comment from @localden about what exactly it means to verify server authenticity to avoid phishing attacks (in hindsight that is effectively what I was describing).

Yeah, we'll need to detail this a bit more as well. Thanks for the nudge here!

@localden localden requested review from dsp-ant and pcarleton June 6, 2025 04:12
@shlomiken

shlomiken commented Jun 10, 2025

@nbarbettini - thank you for putting this proposal together. I have no role in the review process, but I would like to share my thoughts as someone who has been handling auth for the past 15 years.

  1. Resource servers (in our case, MCP servers) need to be kept simple, just like a simple API server that only knows how to validate the auth headers sent along with an API request. This means the server does NOT have callback endpoints, etc. It's the responsibility of the caller to obtain the proper access token.

  2. In step 2 of your flow diagram, "Server determines user is not yet authorized": how exactly? Does the server need to manage sessions now? (Resource servers are usually stateless.) What does the client need to send so the server knows this user is already authorized? This puts a lot of burden on the resource server (especially if it's a cluster of nodes).

  3. You say the client is not confidential. Why? Maybe Claude Desktop is not, but many applications and clients are written as part of a web application, which is confidential.
    The client does not run in browser code; it is usually part of the message loop an agent manages in some backend process.

I admit that I'm quite a beginner in this MCP world and probably haven't thought through all the use cases (although downstream auth is definitely interesting to me),
but according to the OAuth 2.1 spec with Dynamic Client Registration, it feels like the MCP client is the client
(hosting callbacks, client IDs, secrets, exchanging codes for tokens, etc.) and its sole purpose is to route messages between LLM requests and MCP server tools, taking care of authorization while doing so.

This way, downstream auth is simple in the eyes of the MCP server: you either send me a token, or you get a 401 asking you to obtain one and then come back.

@aaronpk - I would love to hear what you think as well.

@nbarbettini
Contributor Author

nbarbettini commented Jun 10, 2025

@shlomiken Thanks for taking a look!

  1. Resource servers (in our case, MCP servers) need to be kept simple, just like a simple API server that only knows how to validate the auth headers sent along with an API request. This means the server does NOT have callback endpoints, etc. It's the responsibility of the caller to obtain the proper access token.

Resource server is one of the roles that an MCP server plays, but not the only role. This proposal enables the MCP server to take the role of OAuth Client for any downstream resource servers it needs to access.

  2. In step 2 of your flow diagram, "Server determines user is not yet authorized": how exactly? Does the server need to manage sessions now? (Resource servers are usually stateless.) What does the client need to send so the server knows this user is already authorized? This puts a lot of burden on the resource server (especially if it's a cluster of nodes).

MCP servers that act as OAuth clients to downstream services need to be stateful. It's implied but not stated explicitly in the text of this proposal - I can call that out more clearly.

  3. You say the client is not confidential. Why? You don't trust Claude Desktop (as an example) not to steal your credentials? If so, don't install it, just like you trust Google not to steal your passwords when you install Chrome.
    The client does not run in browser code; it is usually part of the message loop an agent manages in some backend process.

In the language of OAuth 2.1, Claude Desktop is a public (not confidential) client. SPA code that runs in a browser is public as well. If the MCP client code does indeed run entirely on a server-side/backend process, then it qualifies as confidential.

The distinction between public and confidential is not about whether I trust Claude Desktop or not. It's whether I can decompile or right-click->View Source. If I can, then a malicious actor can too, which means we need to treat the client differently from a security perspective. For example, we can't keep OAuth client secrets safe anywhere in a public client.

it feels like the MCP client is the client (hosting callbacks, client IDs, secrets, exchanging codes for tokens, etc.) and its sole purpose is to route messages between LLM requests and MCP server tools, taking care of authorization while doing so.

This way, downstream auth is simple in the eyes of the MCP server: you either send me a token, or you get a 401 asking you to obtain one and then come back.

This is tempting, and it's where I started first. I landed on the current proposal after conversations with @wdawson, @aaronpk, @mcguinness and others.

The reasons the MCP client can't also be the client to the downstream OAuth service are:

  1. It risks token exfiltration. If the MCP client (which in many cases is a public client) ends up holding tokens for many downstream services, the tokens can easily be exfiltrated. On the other hand, if the tokens are kept secret on the MCP server and not shared with the MCP client, they cannot be exfiltrated.
  2. It risks token misuse. When an MCP server talks to a downstream API, it may need to impose additional security constraints (or rate limits, validation, etc) on the request. If the MCP client holds a real token for the downstream API, it can bypass any constraints imposed by the server and talk to the downstream API directly.
  3. It breaks the security boundary clearly defined by MCP authorization. This is also explicitly discussed in the Security Best Practices page under Token Passthrough.
  4. It burdens the client with a lot more complexity. Besides requiring callbacks, token exchanges, etc. it requires the MCP client to store client secrets (which makes all public clients, e.g. Claude Desktop incompatible with this feature), or requires the client to perform DCR to every OAuth service the MCP server wants to access (which is not possible for many popular services today).

@RichardoC

Thanks for this proposal, how would the following work with this solution?

I have an MCP server that can perform various levels of sensitive actions. Let's use the Github MCP server as an example. It can make commits on a personal and corporate repo.

The MCP server has permission to carry out all of these actions because the user has that level of permission. I want the user to be prompted by default above a certain level of sensitive action - but to give them a way to say "next time I do this, I don't want to be prompted"

For example, I want them to be prompted any time they're performing mutation actions against the corporate repo - unless they have decided that it's fine for repo X. However, I want them to always be prompted if they're changing security settings for any repo
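The rules described in this comment can be written down as a small decision function. This is only a sketch of the policy as stated here, not anything defined by the proposal; the type names, the `corporate` flag, and the per-repo opt-out set are hypothetical, and the choice not to prompt for anything outside the two stated rules is an assumption.

```typescript
// Hypothetical classification of a tool call made by the MCP server.
type ActionKind = "read" | "mutation" | "security_settings";

interface ToolAction {
  repo: string;
  corporate: boolean; // true when the repo belongs to the corporate org
  kind: ActionKind;
}

// Returns true when the server should ask for a user interaction first.
function needsUserInteraction(
  action: ToolAction,
  skipPromptRepos: ReadonlySet<string>, // repos where the user said "don't ask again"
): boolean {
  // Security settings always prompt; the opt-out never applies here.
  if (action.kind === "security_settings") return true;
  // Corporate mutations prompt unless the user opted out for that repo.
  if (action.kind === "mutation" && action.corporate) {
    return !skipPromptRepos.has(action.repo);
  }
  // Assumption: everything else stays below the prompting threshold.
  return false;
}
```

The opt-out set would have to live on the server side (per user), since the point of the interaction is that the client cannot grant itself an exemption.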

@ggoodman
Contributor

For example, I want them to be prompted any time they're performing mutation actions against the corporate repo - unless they have decided that it's fine for repo X. However, I want them to always be prompted if they're changing security settings for any repo

@RichardoC, I think there are tools emerging that help address your use case. For a naive consent prompt, perhaps Elicitation will suffice. However, if you want stronger signals and better 'receipts', you might reach for this proposal.

Your sensitive tool calls could respond with errors and requests for interaction. The user would need to navigate to the linked page and confirm their consent there. It would be up to you to correlate the interaction request from the MCP server with the user confirmation on the web page. A JWT query param could be a good way to correlate these two interactions in a tamper-proof and stateless way.
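A minimal sketch of that idea, assuming an HS256-style token built by hand with Node's `crypto` module: the server signs a few claims, appends the token to the interaction URL as a query parameter, and verifies it when the user lands on the page. The claim names (`interactionId` in particular) and the function names are hypothetical, and a maintained JWT library would be the more sensible choice outside of an example.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claims; signed but NOT encrypted, so nothing secret goes in here.
interface InteractionClaims {
  sub: string; // user the interaction was issued for
  interactionId: string; // ties the page visit back to the tool call
  exp: number; // expiry, in seconds since the epoch
}

const b64url = (input: string): string =>
  Buffer.from(input, "utf8").toString("base64url");

function signInteractionToken(claims: InteractionClaims, secret: string): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(JSON.stringify(claims));
  const signature = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  return `${header}.${payload}.${signature}`;
}

function verifyInteractionToken(
  token: string,
  secret: string,
  nowSeconds: number,
): InteractionClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts as [string, string, string];
  // Recompute the MAC over exactly what was signed and compare in constant time.
  const expected = createHmac("sha256", secret).update(`${header}.${payload}`).digest();
  const actual = Buffer.from(signature, "base64url");
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as InteractionClaims;
  // Reject tokens that have expired.
  return claims.exp > nowSeconds ? claims : null;
}
```

Because the signature covers the payload, the MCP client (or any intermediary that sees the URL) can read the claims but cannot change which user or interaction they refer to without the server noticing.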

@shlomiken

shlomiken commented Jun 11, 2025

@shlomiken Thanks for taking a look!

Resource server is one of the roles that an MCP server plays, but not the only role. This proposal enables the MCP server to take the role of OAuth Client for any downstream resource servers it needs to access.

Yeah, after thinking about it more, it makes sense,

although the spec says the MCP client acts as an OAuth 2.1 client, which is confusing, I think.
image

Another thing I don't fully understand: if the auth loop is closed at the MCP server, how is the client notified of that so it can re-issue its tool request? Also, what in the protocol can be used as a session (since we don't have cookies here)?
These are requests 6 and 7 in the quick diagram I put together.
image

One more question for the OAuth guys: doesn't this approach make DCR redundant? An MCP server only knows a finite number of resource servers it will talk to, so why do we need dynamic registration?

Regarding confidential clients - my bad, I updated my comment; you probably saw it later. I then read that even if you use OS-native secure storage, IdPs cannot rely on an app to do so, so all installed apps are public.

@ggoodman
Contributor

Another thing I don't fully understand: if the auth loop is closed at the MCP server, how is the client notified

This is the "Transactional" aspect I've been pushing for. I believe that this User Interaction capability would really benefit from a robust Server -> Client tracking / notification upon landing and not as a follow-on.

@wdawson
Contributor

wdawson commented Jun 11, 2025

@shlomiken

although the spec says the MCP client acts as an OAuth 2.1 client, which is confusing, I think.

This is correct for the authorization between the MCP client and MCP server. The type of authorization we're talking about here is when an MCP server needs authorization to a downstream API. For example, a Personal Assistant MCP server needing authorization to Google's APIs for a particular user. It is a bit confusing, but the MCP server can play both the OAuth resource server and client roles. Remember, these are better considered roles that can be played instead of fixed capabilities.

Another thing I don't fully understand: if the auth loop is closed at the MCP server, how is the client notified of that so it can re-issue its tool request?

We have added a section back into the spec for how the MCP client can monitor the progress of the interaction from the MCP server. We did this to leverage existing capabilities in the MCP spec. This can accomplish the arrow number 6 in your diagram.

As @ggoodman points out, there is appetite in the community for a better pattern when dealing with asynchronous operations. We don't want to introduce a new pattern for that in this PR, but are open to changes in the future.

Also, what in the protocol can be used as a session (since we don't have cookies here)?

The MCP server receives an access token from the client. Depending on the implementation, the server can leverage the subject identifier as a way to track which user to link the downstream credentials to. This can be used to deliver arrow number 7 in your diagram. Ultimately, this will be up to the exact implementation of the MCP server and its authorization server for how to do this.
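As a rough illustration of the subject-identifier idea above, here is a minimal sketch. It assumes the access token has already been validated and decoded into a `claims` dict containing a `sub` value; `DownstreamCredentialStore`, `handle_tool_call`, and the `interaction_required` status are hypothetical names for this example, not part of the MCP spec.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class DownstreamCredentialStore:
    """Hypothetical in-memory store: (subject, provider) -> downstream token."""

    _tokens: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def link(self, subject: str, provider: str, token: str) -> None:
        # Called once the user completes the downstream authorization flow.
        self._tokens[(subject, provider)] = token

    def lookup(self, subject: str, provider: str) -> Optional[str]:
        return self._tokens.get((subject, provider))


def handle_tool_call(store: DownstreamCredentialStore, claims: dict, provider: str) -> dict:
    """Decide whether a tool call can proceed for the user identified by `sub`.

    Assumption: `claims` comes from an access token that was already validated.
    """
    subject = claims["sub"]
    if store.lookup(subject, provider) is None:
        # No downstream credentials linked yet: the user must authorize first.
        return {"status": "interaction_required", "provider": provider}
    return {"status": "ok", "provider": provider}
```

When the client re-issues the tool request with the same access token, the lookup succeeds and the call can proceed. A real server would persist the mapping and handle token expiry.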

One more question for the OAuth folks: doesn't this approach make DCR redundant? An MCP server only knows a finite number of resource servers it will talk to, so why do we need dynamic registration?

I think this question is confusing the roles again. DCR is required for MCP servers that will not necessarily know their clients ahead of time. Think of my particular installed instance of Claude Desktop wanting to access your MCP server. The MCP server, however, may act as an OAuth client to the Google API. In that case, it's a different OAuth flow entirely and DCR may not be required.

@dave1010

On the UAInteraction object, I see the url field is specified with format: "uri". This means we have the option of supporting non-browser interactions. For example, a server could send a tel: URI to initiate a call, or perhaps a custom scheme like apple-pay:// to trigger a native payment. Is that intentional and OK? URL and URI are often used interchangeably, but it might be useful to be explicit in the spec.
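To make the concern concrete, here is a minimal sketch of how a client might restrict which URIs it is willing to open. The `ALLOWED_SCHEMES` policy and the `is_openable` helper are assumptions for this example; the proposal itself does not define them.

```python
from urllib.parse import urlsplit

# Assumption: a conservative client policy that only opens https URLs in a browser.
ALLOWED_SCHEMES = frozenset({"https"})


def is_openable(uri: str, allowed=ALLOWED_SCHEMES) -> bool:
    """Return True only for URIs with an allowed scheme and a non-empty host."""
    parts = urlsplit(uri)
    if parts.scheme not in allowed:  # urlsplit lowercases the scheme
        return False
    return bool(parts.netloc)
```

With this policy, `https://example.com/authorize` is accepted while `tel:` and `apple-pay://` URIs are rejected. A client that wanted to support such schemes would have to opt in explicitly.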

@adranwit

Hi @nbarbettini, given that Elicitation is now part of the MCP spec, I could imagine the following elicitation request schema:

"requestedSchema": {
  "type": "object",
  "properties": {
    "secureFlow": {
      "type": "string",
      "title": "Open Link To Provide Secrets",
      "description": "Start Secure ....",
      "default": "https://somehost/secure/flow",
      "format": "uri"
    }
  },
  "required": ["secureFlow"]
}

Where do we stand with User Interaction?

@dmiyamasu

Hi @nbarbettini, given that Elicitation is now part of the MCP spec, I could imagine the following elicitation request schema:

"requestedSchema": {
  "type": "object",
  "properties": {
    "secureFlow": {
      "type": "string",
      "title": "Open Link To Provide Secrets",
      "description": "Start Secure ....",
      "default": "https://somehost/secure/flow",
      "format": "uri"
    }
  },
  "required": ["secureFlow"]
}

Where do we stand with User Interaction?

@adranwit
The MCP Elicitation draft spec states

Servers MUST NOT use elicitation to request sensitive information.

In your example, does "required": ["secureFlow"] qualify as sensitive information?

@dmiyamasu

@nbarbettini instead of adding this functionality as a client capability, why not as an MCP Proxy Server capability?

Why change the target? Because the MCP Proxy Server definition

An MCP server that connects MCP clients to third-party APIs, offering MCP features while delegating operations and acting as a single OAuth client to the third-party API server.

seems to more accurately describe the role of "client" you are describing.

@btiernay

Is there still interest in pursuing this proposal / addition? It seems to have stalled, curious why that is. Perhaps it is being discussed elsewhere or there is another proposal?

@adranwit

@dmiyamasu thanks for the feedback. In this scenario, elicitation could simply result in a URL being rendered or opened by the MCP host. At this stage, there's nothing inherently secure about that interaction: the MCP client is just triggering a link. Whatever happens after that (i.e., within the URL target) is entirely out of scope for the MCP protocol and the elicitation process itself.

So, even though "secureFlow" is a required field, it's just a URI. No sensitive data is being directly requested or transmitted via elicitation. The actual handling of secrets or sensitive input, if any, would happen within the hosted flow pointed to by the URL, which is beyond the control or visibility of the MCP client. In addition, I'd assume any secure flow would generate a URL that can be redeemed only once.

Therefore, in my opinion this doesn't violate the draft spec’s guidance that “Servers MUST NOT use elicitation to request sensitive information.”
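The single-use URL assumption above could look roughly like the following sketch. `OneTimeLinks` is a hypothetical helper, and the base URL is the placeholder from the schema example; a real flow would also need expiry and persistent storage.

```python
import secrets
from typing import Dict, Optional


class OneTimeLinks:
    """Hypothetical issuer of secure-flow URLs that can be redeemed only once."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")
        self._pending: Dict[str, str] = {}  # token -> subject

    def issue(self, subject: str) -> str:
        # An unguessable token ties the URL to one pending interaction.
        token = secrets.token_urlsafe(32)
        self._pending[token] = subject
        return f"{self._base_url}/{token}"

    def redeem(self, token: str) -> Optional[str]:
        # pop() removes the token, so a second redemption returns None.
        return self._pending.pop(token, None)
```

Only the URL travels through the MCP client; the sensitive input itself is collected by the hosted flow after the token is redeemed.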

@wdawson
Contributor

wdawson commented Jun 26, 2025

Hey folks! Not stalled. We're working through all the things in order to rework this on top of the most recently released version that includes elicitation. I hope to have something ready this week!

Edit: I, unfortunately, did not quite make it over the line this week. Will definitely be early next week! Thanks for your patience, folks!

@Xerenz

Xerenz commented Jun 28, 2025

Thanks for putting this proposal together, @nbarbettini.
On reading through the discussion, one of the problems it immediately solved for us was "Authorization" on tool calls.

After going through the draft and all the discussions, I realised it would be much cleaner if authorization were handled between the third-party API and its AS. I would like your review on whether this is a good design or has any holes in it.

For example: a user calls a tool "approve_pull_request" to approve a PR on a GitHub MCP server. This requires checking whether the user is authorized to approve PRs in that particular Git repo. Our initial thought was to handle these checks when the MCP tool is called, either on the MCP server or with a custom AS. But with this proposal I feel it's simplified, as an MCP server can only do what the user has permission to do on GitHub. It removes any scope for privilege escalation or access misconfiguration (in the case where we were handling checks at tool calls).

@nbarbettini
Contributor Author

@Xerenz Yes, you understood the proposal correctly! In cases like yours where an MCP server is calling a downstream API (such as GitHub) on behalf of the end-user, the best place to check the user's authorization is in the downstream API itself (or its authorization server). Otherwise, MCP servers must re-build a ton of authorization logic.

However, the MCP server may need to pass an authorization challenge back to the end-user ("click here to authorize access to GitHub"). Some MCP servers are nested within other MCP servers, so the "pass back" mechanism needs to be robust and something that can be programmatically forwarded to upstream servers. That's the foundational principle of this proposal.
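The "pass back" idea can be sketched as follows. This is only an illustration under stated assumptions: the `kind` field, the `leaf_tool` and `forwarding_tool` functions, and the example URL are hypothetical, while `type: "ua"` and `url` follow the interaction described in this proposal.

```python
from typing import Callable, Dict

Result = Dict[str, str]


def leaf_tool(downstream_status: int) -> Result:
    """Leaf MCP server: turn a downstream 401 into an interaction request."""
    if downstream_status == 401:
        return {
            "kind": "interaction_required",  # hypothetical marker field
            "type": "ua",
            "url": "https://example.com/authorize",  # placeholder URL
        }
    return {"kind": "result", "content": "done"}


def forwarding_tool(inner: Callable[[], Result]) -> Result:
    """Intermediate MCP server: forward interaction requests upstream unchanged."""
    result = inner()
    if result["kind"] == "interaction_required":
        # Do not interpret or rewrite the challenge; just pass it back.
        return result
    return {"kind": "result", "content": f"wrapped: {result['content']}"}
```

Because the challenge is plain structured data, each server in the chain can forward it without knowing anything about the downstream authorization flow.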

@nbarbettini
Contributor Author

Hey everyone! Thanks for all the great feedback here. To keep things clean we've created a new PR for the updated version of this proposal: #887
Comments and reviews are quite welcome. See ya there!
