feat: add MCP server configuration backend for chats by kylecarbs · Pull Request #23227 · coder/coder

kylecarbs · 2026-03-18T14:46:32Z

Summary

Adds the database schema, API endpoints, SDK types, and encryption wrappers for admin-managed MCP (Model Context Protocol) server configurations that chatd can consume. This is the backend foundation for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be used during AI chat sessions.

Database

Two new tables:

mcp_server_configs: Admin-managed server definitions with URL, transport (Streamable HTTP / SSE), auth config (none / OAuth2 / API key / custom headers), tool allow/deny lists, and an availability policy (force_on / default_on / default_off). Includes CHECK constraints on transport, auth_type, and availability values.
mcp_server_user_tokens: Per-user OAuth2 tokens for servers requiring individual authentication. Cascades on user/config deletion.

New column on chats table:

mcp_server_ids UUID[]: Per-chat MCP server selection, following the same pattern as model_config_id — passed at chat creation, changeable per-message with nil-means-no-change semantics.

API Endpoints

All routes are under /api/experimental/mcp/servers/ and gated behind the agents experiment.

Admin endpoints (ResourceDeploymentConfig auth):

POST / — Create MCP server config
PATCH /{id} — Update MCP server config (full-replace)
DELETE /{id} — Delete MCP server config

Authenticated endpoints (all users, enabled servers only for non-admins):

GET / — List configs (admins see all, members see enabled-only with admin fields redacted)
GET /{id} — Get config by ID (with auth_connected populated per-user)

OAuth2 per-user auth flow:

GET /{id}/oauth2/connect — Initiate OAuth2 flow (state cookie CSRF protection)
GET /{id}/oauth2/callback — Handle OAuth2 callback, store tokens
DELETE /{id}/oauth2/disconnect — Remove stored OAuth2 tokens

Security

Secrets never returned: OAuth2ClientSecret, APIKeyValue, and CustomHeaders are never in API responses — only boolean indicators (has_oauth2_secret, has_api_key, has_custom_headers).
Field redaction for non-admins: convertMCPServerConfigRedacted strips OAuth2ClientID, auth URLs, scopes, and APIKeyHeader from non-admin responses.
dbcrypt encryption at rest: All 5 secret fields use dbcrypt_keys encryption with full encrypt-on-write / decrypt-on-read wrappers (11 dbcrypt method overrides + 2 helpers), following the same pattern as chat_providers.api_key.
OAuth2 CSRF protection: State parameter stored in HttpOnly cookie with HTTPCookies.Apply() for correct Secure/SameSite behind TLS-terminating proxies.
dbauthz authorization: All 18 querier methods have authorization wrappers. Read operations use ActionRead, write operations use ActionUpdate on ResourceDeploymentConfig.

Governance Model

Control	Implementation
Global kill switch	`enabled` defaults to `false`
Availability policy	`force_on` (always injected), `default_on` (pre-selected), `default_off` (opt-in)
Per-chat selection	`mcp_server_ids` on `CreateChatRequest` / `CreateChatMessageRequest`
Auth gate	OAuth2 servers require per-user auth before tools are injected
Tool-level allow/deny	Arrays on `mcp_server_configs` for granular tool filtering
Secrets encrypted at rest	Uses `dbcrypt_keys` (same pattern as `chat_providers.api_key`)

Tests

8 test functions covering:

Full CRUD lifecycle (create, list, update, delete)
Non-admin visibility filtering (enabled-only, field redaction)
auth_connected population for OAuth2 vs non-OAuth2 servers
Availability policy validation (valid values + invalid rejection)
Unique slug enforcement (409 Conflict)
OAuth2 disconnect idempotency
Chat creation with mcp_server_ids persistence

Known Limitations (Deferred)

These are documented and intentional for an experimental feature:

Audit logging not yet wired — will add when feature stabilizes
Cross-field validation (e.g., OAuth2 fields required when auth_type=oauth2) — admin-only endpoint, will add when stabilizing
force_on auto-injection — query exists but not yet wired into chatd tool injection (follow-up)
Additional test coverage — 403 auth tests, GET-by-ID tests, callback CSRF tests planned for follow-up

What's NOT in this PR

Frontend UI (admin panel + chat picker)
Actual MCP client connections (chatd/chatmcp/ manager)
Tool injection into chatloop/

coder-tasks · 2026-03-18T14:48:58Z

Documentation Check

Updates Needed

docs/ai-coder/agents/chats-api.md - Document the new mcp_server_ids field on CreateChatRequest ("Create a chat" endpoint table) and CreateChatMessageRequest ("Send a follow-up" endpoint table). The field accepts an array of MCP server config UUIDs to include for the chat session.

New Documentation Needed

docs/ai-coder/agents/mcp-server-configs.md (or similar) - Document the new admin-managed MCP server configuration API (/api/experimental/mcp/servers/), covering: admin CRUD endpoints, availability policy (force_on / default_on / default_off), auth types (none / OAuth2 / API key / custom headers), field redaction for non-admins, and the per-user OAuth2 connect/disconnect flow. This parallels chats-api.md in scope for an experimental feature.

Automated review via Coder Tasks

Adds the database schema, API endpoints, and SDK types for admin-managed MCP server configurations that chatd can consume. This is the backend foundation for allowing external MCP tools (Sentry, Linear, GitHub, etc.) to be used during AI chat sessions. ## Database Three new tables: - mcp_server_configs: Admin-managed server definitions with URL, transport (Streamable HTTP / SSE), auth config, tool allow/deny lists, and an availability policy (force_on / default_on / default_off). - mcp_server_user_tokens: Per-user OAuth2 tokens for servers requiring individual authentication. - mcp_server_tool_snapshots: Admin-approved frozen tool lists, preventing silent server-side tool changes (inspired by ChatGPT's frozen snapshot pattern). New column on chats table: - mcp_server_ids UUID[]: Per-chat MCP server selection, following the same pattern as model_config_id (passed at chat creation, changeable per message). ## API Endpoints Admin endpoints (ResourceDeploymentConfig auth): - GET/POST /api/experimental/chats/mcp-servers/ - PATCH/DELETE /api/experimental/chats/mcp-servers/{id} - GET /api/experimental/chats/mcp-servers/{id}/tools - POST /api/experimental/chats/mcp-servers/{id}/refresh-tools (501 placeholder) ## Governance Model - Servers default to disabled (enabled=false) and default_off availability. - force_on: Always injected, user cannot deselect. - default_on: Pre-selected in new chats, user can deselect. - default_off: Available in picker but not pre-selected. - OAuth2 servers require per-user authentication before tools are injected. - Tool-level allow/deny lists for granular control. - Frozen tool snapshots require admin re-approval on server changes.

…points - Rename mcp_server_configs.go → mcp.go - Move routes from /chats/mcp-servers to /mcp/servers (adjacent to existing /mcp/http transport endpoint) - Update SDK client paths to match new route prefix - Add user-facing endpoints: - GET /mcp/servers/user — list enabled servers with per-user auth status - GET /mcp/servers/{id}/oauth2/connect — initiate OAuth2 flow (redirect) - GET /mcp/servers/{id}/oauth2/callback — exchange code for token, store per-user, close popup - DELETE /mcp/servers/{id}/oauth2/disconnect — revoke user's token Route structure: /api/experimental/mcp/ /servers/ — admin CRUD (GET, POST) /servers/user — user list with auth status /servers/{id}/ — admin CRUD (PATCH, DELETE) /servers/{id}/tools — tool snapshot /servers/{id}/oauth2/connect — start OAuth2 /servers/{id}/oauth2/callback — complete OAuth2 /servers/{id}/oauth2/disconnect — revoke token /http — MCP HTTP transport (existing)

The /servers/user endpoint was redundant — the main GET /servers/ endpoint already branched on admin vs non-admin. Merged the user token lookup into the main handler so a single endpoint returns configs + auth_connected status for the calling user.

- Add 8 test functions covering CRUD, non-admin visibility, auth_connected field, availability validation, slug uniqueness, tool snapshots, OAuth2 disconnect, and per-chat MCP server ID persistence. - Rename codersdk/mcpserverconfigs.go → codersdk/mcp.go to match coderd/mcp.go. - Fix nil slice bug where omitted tool_allow_list/tool_deny_list in create requests would send NULL to NOT NULL columns.

Add rename entries in sqlc.yaml so generated Go types use idiomatic casing (MCPServerConfig, MCPServerIDs, etc.) instead of sqlc's default CamelCase (McpServerConfig, McpServerIds).

…eanup - Add dbcrypt encrypt/decrypt wrappers for all MCP server methods: - MCPServerConfig: OAuth2ClientSecret, APIKeyValue, CustomHeaders - MCPServerUserToken: AccessToken, RefreshToken - 8 decrypt-only wrappers for Get* methods - 3 encrypt+decrypt wrappers for Insert/Update/Upsert methods - Fix revive lint: split convertMCPServerConfig into two functions (convertMCPServerConfig + convertMCPServerConfigRedacted) instead of a single function with bool control flag - Implement CleanupDeletedMCPServerIDsFromChats in dbauthz (was panic stub) - Remove orphaned code from incomplete sed replacement - Remove non-existent convertMCPServerConfigForRole references

…iciency - Fix getMCPServerTools 500 for non-admin users: use AsSystemRestricted for both GetMCPServerConfigByID and GetActiveMCPServerToolSnapshot, and enforce Enabled check for non-admin callers. - Fix OAuth2 state cookie Secure flag: use HTTPCookies.Apply() from deployment config instead of r.TLS != nil, matching the pattern in userauth.go. Fixes CSRF cookie not working behind TLS-terminating reverse proxies. - Fix strings.Split on empty OAuth2Scopes producing [""] instead of []: guard with non-empty check before splitting. - Add AuthConnected population to GET-by-ID handler, matching the list handler's token lookup logic. - Remove omitempty from AuthConnected bool field so false is always serialized in JSON responses. - Narrow CleanupDeletedMCPServerIDsFromChats WHERE clause to only update chats that actually reference deleted config IDs, avoiding unnecessary WAL writes.

The frozen tool snapshot table is premature without the refresh-tools flow wired up. Tool filtering is handled by tool_allow_list and tool_deny_list on mcp_server_configs, which is sufficient for the initial implementation. Removes: - mcp_server_tool_snapshots table and its partial unique index - GetActiveMCPServerToolSnapshot, InsertMCPServerToolSnapshot, DeactivateMCPServerToolSnapshots queries - getMCPServerTools and refreshMCPServerTools handlers + routes - MCPServerTool, MCPServerToolSnapshot SDK types + client method - dbauthz/dbmetrics wrappers for removed queries - TestMCPServerConfigsToolSnapshot test

- Fix revive if-return lint in dbcrypt helpers: return decryptField directly instead of redundant if-err-return-nil pattern. - Fix timeout context used after t.Parallel: create per-subtest contexts in TestMCPServerConfigsAvailability.

johnstcn

Not done reviewing yet, but there's one blocker I'm seeing immediately.

johnstcn · 2026-03-19T11:34:14Z

coderd/mcp.go

+		configs, err = api.Database.GetMCPServerConfigs(ctx)
+	} else {
+		//nolint:gocritic // All authenticated users need to read enabled MCP server configs to use the chat feature.
+		configs, err = api.Database.GetEnabledMCPServerConfigs(dbauthz.AsSystemRestricted(ctx))


I'm extremely wary of this as we store encrypted fields in this table. I know that we elide them in convertMCPServerConfig below but this is one over-eager prompt away from exposing secrets.

It would be better to store the "secret" stuff in a separate table. It's a bit more work but then you can just tell dbauthz to allow everyone to read that, and lean on the owner role to protect the secrets.

Valid concern. For now the converter functions (convertMCPServerConfig / convertMCPServerConfigRedacted) ensure secrets never reach API responses, and non-admin code paths use the redacted variant. I'll look into splitting the secret fields into a separate table as a follow-up — that would let dbauthz enforce the boundary at the DB layer rather than relying on application-level converters.

For the moment, let's add a load-bearing unit test (or an assertion in an existing test) so that if this ever does happen, it won't do so silently.

Added TestMCPServerConfigsSecretsNeverLeaked — creates a config with all secret fields populated (OAuth2 client secret, API key value, custom headers with sentinel values), then JSON-marshals the responses from every endpoint (admin create, admin list, admin get-by-ID, member list, member get-by-ID) and asserts none of the sentinel strings appear. Also verifies the Has* booleans are true and that non-admin responses have admin-only fields stripped (OAuth2ClientID, auth URLs, API key header, URL, Transport).

If anyone accidentally wires a secret field into the SDK type or converter, this test will catch it.

mafredri

🤖 mafredri sent me to read all 37 files and 3,500 new lines. My weights are exhausted but my findings are not.

Review: MCP server config backend with validation gaps, dead cleanup code, and thin test coverage for security-critical OAuth2 paths

Validation

go test ./coderd/ -run TestMCPServer -count=1 -v
Verify CleanupDeletedMCPServerIDsFromChats is called somewhere, or remove it
Test that a non-admin cannot call create/update/delete endpoints
Test the OAuth2 connect → callback round-trip

Cross-cutting findings (not tied to a single line)

P1 CleanupDeletedMCPServerIDsFromChats is defined through all generated layers but never called anywhere. Stale UUIDs accumulate in chats.mcp_server_ids after MCP server config deletion. The FK cascade handles mcp_server_user_tokens, but chats.mcp_server_ids is a UUID array with no FK constraint and no code invokes the cleanup. (Gopher, DBA, Tester, Breaker, Mutex)

P1 The entire OAuth2 connect/callback flow (~200 lines across mcpServerOAuth2Connect and mcpServerOAuth2Callback) has zero test coverage. State cookie CSRF, code exchange, token storage, HTML popup response, and error branches are all untested. (Tester)

P1 The 195 lines of dbcrypt encrypt/decrypt wrappers in enterprise/dbcrypt/dbcrypt.go for MCP server configs and user tokens have no test coverage. An encrypt/decrypt field ordering bug would ship undetected. (Tester)

P1 Non-admin write operations (create, update, delete) are never tested for authorization rejection. TestMCPServerConfigsNonAdmin creates a memberClient but only calls the list endpoint. If the RBAC check on any write endpoint were removed, no test would catch it. (Tester)

P2 All user-token dbauthz wrappers (DeleteMCPServerUserToken, UpsertMCPServerUserToken, GetMCPServerUserTokensByUserID) gate on ResourceDeploymentConfig, but every call site bypasses with AsSystemRestricted. The authz layer provides zero protection for per-user token operations. If a future caller forgets AsSystemRestricted, a normal user is blocked from managing their own token. These should either use a per-user resource or be explicitly documented as system-only. (Gopher, Sentinel)

P2 AuthConnected=true after token storage is never tested; GET /servers/{id} endpoint is entirely untested (admin vs non-admin behavior, disabled-config filtering); partial update test asserts only 4 of 15+ fields. Non-admin list response is never checked for field redaction. (Tester)

P2 The icon_url: IconURL sqlc rename is globally scoped, causing unrelated changes to coderd/workspaceproxies.go and enterprise/coderd/workspaceproxy.go. Correct Go naming fix but out of scope for this PR and creates merge conflict risk. (Gopher)

P3 Phantom sqlc renames for mcp_server_tool_snapshot, mcp_server_tool_snapshots, tools_json reference types not introduced by this migration. Dead config for a future PR. (Gopher)

🤖 Seven reviewers, one JSON payload, zero regrets. Hejdå!

coderd/chats.go

codersdk/mcp.go

coderd/mcp.go

codersdk/mcp.go

coderd/mcp.go

Fixes for P1 issues: - Validate mcp_server_ids against actual configs in postChats and postChatMessages, returning 400 if any ID doesn't exist. - Add cross-field validation on create: oauth2 requires client_id, auth_url, token_url; api_key requires header and value; custom_headers requires at least one header. - Clear stale auth fields when auth_type changes on update (e.g., switching from oauth2 to none clears all oauth2 secrets). - Check OAuth2 error query parameter in callback before looking for authorization code. - Return 500 on token lookup error in getMCPServerConfig instead of silently swallowing it. Fixes for P2 issues: - Verify auth_type==oauth2 in callback handler (not just connect). - Replace leaked err.Error() in token exchange with generic message. - Add validate:"omitempty,url" tags on OAuth2AuthURL/TokenURL. - Redact URL and Transport for non-admin callers. Fixes for P3 issues: - Add Content-Security-Policy header to OAuth2 callback HTML.

…ransaction - Use api.HTTPClient via context.WithValue(ctx, oauth2.HTTPClient, ...) for the OAuth2 token exchange instead of http.DefaultClient. This respects proxy settings and avoids SSRF with unrestricted client. - Wrap the update handler's read-modify-write in api.Database.InTx so concurrent PATCHes don't silently clobber each other. Request parsing and custom header validation stay outside the transaction.

TestMCPServerConfigsSecretsNeverLeaked creates a config with all secret fields populated (OAuth2 client secret, API key value, custom headers), then verifies via JSON marshaling that none of the sentinel secret values appear in responses from: - Admin create response - Admin list endpoint - Admin get-by-ID endpoint - Non-admin list endpoint (also checks admin fields are redacted) - Non-admin get-by-ID endpoint If a code change accidentally starts exposing secrets in API responses, this test will catch it.

coderd/database/queries/mcpserverconfigs.sql

coderd/chatd/chatd.go

coderd/database/dbauthz/dbauthz_test.go

load-bearing test added

mafredri

🤖 The neural network squinted at three commits and found closure. Almost.

All code findings from the initial review have been addressed. Verified the fixes for auth_type validation, stale field cleanup on type change, callback error handling, token error propagation, mcp_server_ids validation, SSRF/error leak, InTx on update, and redaction.

Once @johnstcn's three open items are resolved (nil MCPServerIDs crash on InsertChat, CleanupDeletedMCPServerIDsFromChats NULL bug + no caller, 15 missing dbauthz_test entries), this is good on our side.

Test coverage for the OAuth2 connect/callback flow, dbcrypt round-trip, and non-admin write rejection would be nice to have but doesn't block.

🤖 Scanning complete. Shutting down until the next git push. — mafredri's mass of weights

coderd/mcp_test.go

…tests - Coalesce nil MCPServerIDs to empty slice in chatd.CreateChat and postChats handler, preventing SQL NULL on the NOT NULL column. This protects all callers including sub-agent chat creation paths. - Fix NULL propagation in CleanupDeletedMCPServerIDsFromChats: wrap array_agg subquery in COALESCE so cleanup works when all configs have been deleted (array_agg returns NULL on empty input). - Add 15 missing dbauthz MethodTestSuite entries for all MCP server querier methods, preventing CI failures from TearDownSuite.

- Remove outdated comment claiming GET-by-ID endpoint doesn't exist. - Make member get-by-ID assertions consistent with member list loop: assert all 7 redacted fields (OAuth2ClientID, OAuth2AuthURL, OAuth2TokenURL, OAuth2Scopes, APIKeyHeader, URL, Transport). - Add missing OAuth2Scopes assertion to member list loop.

mafredri

One last suggestion, otherwise LGTM on my end. But I'll defer to Cian for the dbcrypt stuff.

coderd/database/migrations/000447_mcp_server_configs.up.sql

johnstcn

Just missing the dbcrypt internal test stuff. The code paths do already seem to be tested, but this is a good guard against future breakage. I don't need to review again.

coderd/chatd/chatd.go

coderd/database/dbauthz/dbauthz.go

coderd/chats.go

codersdk/mcp.go

enterprise/dbcrypt/dbcrypt.go

Use COALESCE(@mcp_server_ids::uuid[], '{}'::uuid[]) in the INSERT so callers that pass nil (Go zero value for []uuid.UUID) get an empty array instead of SQL NULL. This fixes NOT NULL constraint violations in existing tests that construct InsertChatParams without the new mcp_server_ids field.

- Add comprehensive dbcrypt tests for all 11 MCP server methods: 8 subtests for config encrypt/decrypt (Insert, Get*, Update) and 3 subtests for user token encrypt/decrypt (Upsert, Get*). Each verifies plaintext on read and ciphertext in raw DB. - Improve invalid MCP server ID error message to list which specific IDs are invalid, in both postChats and postChatMessages. - Fix CleanupDeletedMCPServerIDsFromChats RBAC assertion to use ResourceChat (it modifies chats) instead of ResourceDeploymentConfig.

- Add //nolint:gosec directives for test credential constants. - Fix context.Context parameter ordering (must be first) in requireMCPServerConfigRawEncrypted helper. - Fix mangled const block and closing paren.

Add fixture file 000447_mcp_server_configs.up.sql with sample rows for both mcp_server_configs and mcp_server_user_tokens tables. This fixes the TestMigrateUpWithFixtures failure that requires all new tables to have fixture data.

github-actions bot assigned kylecarbs Mar 18, 2026

kylecarbs marked this pull request as draft March 18, 2026 15:05

kylecarbs changed the title ~~feat(chatd): add MCP server configuration backend~~ feat: add MCP server configuration backend for chats Mar 18, 2026

kylecarbs requested a review from johnstcn March 19, 2026 11:10

kylecarbs added 8 commits March 19, 2026 11:11

fix: add sqlc rename rules for MCP types

9e77ae5

Add rename entries in sqlc.yaml so generated Go types use idiomatic casing (MCPServerConfig, MCPServerIDs, etc.) instead of sqlc's default CamelCase (McpServerConfig, McpServerIds).

kylecarbs force-pushed the mcp-server-configs branch from aac8787 to cc74088 Compare March 19, 2026 11:13

kylecarbs added 2 commits March 19, 2026 11:22

chore: renumber migration 000444 → 000447 after rebase on main

974269a

fix: resolve fmt and lint issues

c22e304

- Fix revive if-return lint in dbcrypt helpers: return decryptField directly instead of redundant if-err-return-nil pattern. - Fix timeout context used after t.Parallel: create per-subtest contexts in TestMCPServerConfigsAvailability.

kylecarbs force-pushed the mcp-server-configs branch from cc74088 to c22e304 Compare March 19, 2026 11:26

johnstcn previously requested changes Mar 19, 2026

View reviewed changes

mafredri reviewed Mar 19, 2026

View reviewed changes

kylecarbs added 2 commits March 19, 2026 12:04

kylecarbs requested review from johnstcn and mafredri March 19, 2026 12:23