A federated knowledge system served through an MCP (Model Context Protocol) server. Multiple GitHub repos contribute structured markdown knowledge files. A central MCP server indexes all sources into PostgreSQL + pgvector, and exposes the merged knowledge to any LLM tool that supports MCP.
Beyond search and retrieval, the Knowledge Network includes a knowledge lifecycle system — tools for onboarding, capturing knowledge from any source, automated quality review, and self-improvement loops that let knowledge grow organically from daily work.
```
MCP Clients (claude.ai, Claude Desktop, Claude Code, Cursor, etc.)
        |
        | Streamable HTTP + OAuth 2.1 / PAT auth
        v
MCP Server (Node.js / TypeScript)
├── get_knowledge(query)          — hybrid semantic + keyword search
├── get_alignment_context(text)   — find relevant company values/stances
├── get_preferences(context)      — user preferences from GitHub
├── list_sources()                — available knowledge repos
├── onboard()                     — first-time setup + system instructions (planned)
└── capture_knowledge(content)    — format and route knowledge contributions (planned)
        |                                     |
PostgreSQL + pgvector                 GitHub API + OAuth
(chunks, embeddings,                  (identity, file fetch,
 sessions, analytics)                 access control)
```
- A PR merges to `main` in a knowledge repo
- A GitHub Action detects changed `.md` files
- The Action calls `POST /api/index` with the file list
- The server fetches each file, extracts frontmatter, and splits it by `##` sections
- Each section is embedded via Voyage AI (1024-dim vectors)
- Chunks are upserted into pgvector with deterministic IDs
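The chunk-and-ID step above can be sketched as follows. This is an illustration only: the function name `chunkBySections` and the exact ID recipe are hypothetical, not the actual API of `src/indexing/chunker.ts`.

```typescript
import { createHash } from "node:crypto";

interface Chunk {
  id: string;      // deterministic: same repo + path + heading => same ID
  heading: string;
  content: string;
}

// Split a markdown body into one chunk per "## " section (hypothetical sketch).
function chunkBySections(repo: string, path: string, body: string): Chunk[] {
  const sections = body.split(/^(?=## )/m).filter((s) => s.trim().length > 0);
  return sections.map((section) => {
    const heading = section.split("\n", 1)[0].replace(/^## /, "").trim();
    // A deterministic ID lets re-indexing upsert in place instead of duplicating rows.
    const id = createHash("sha256")
      .update(`${repo}:${path}:${heading}`)
      .digest("hex")
      .slice(0, 16);
    return { id, heading, content: section.trim() };
  });
}

const chunks = chunkBySections(
  "company",
  "compliance/ferpa.md",
  "## Overview\nFERPA basics.\n\n## Disclosure Rules\nWhen records may be shared."
);
```

Because the ID is derived from stable inputs rather than generated randomly, re-running the pipeline on an unchanged file touches the same rows.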
- An MCP client calls `get_knowledge("FERPA compliance")`
- The query is embedded via Voyage AI
- Two searches run in parallel: pgvector cosine similarity + PostgreSQL full-text search
- Results are merged using Reciprocal Rank Fusion (RRF)
- Top results are returned with full content and metadata
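The RRF merge step can be illustrated in a few lines of TypeScript. This is a standalone sketch of the general RRF formula, not the code in `src/search/hybrid-search.ts`; the constant `k = 60` is the value commonly used in the literature and is assumed here.

```typescript
// Reciprocal Rank Fusion: score(doc) = sum over result lists of 1 / (k + rank),
// with rank starting at 1. Documents that rank well in BOTH lists float to the top.
function rrfMerge(vectorIds: string[], keywordIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of [vectorIds, keywordIds]) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" appears near the top of both lists, so it beats "a", which tops only one:
const merged = rrfMerge(["a", "b", "c"], ["b", "c", "d"]);
```

RRF needs only ranks, not raw scores, which is why it works well for combining cosine similarity with full-text relevance: the two score scales never have to be reconciled.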
| Layer | Technology |
|---|---|
| Language | TypeScript (strict mode, ESM) |
| Runtime | Node.js 22 |
| Package manager | pnpm |
| MCP SDK | @modelcontextprotocol/sdk |
| HTTP framework | Express 5 |
| Database | PostgreSQL 17 + pgvector |
| Embeddings | Voyage AI voyage-3 (1024 dimensions) |
| DB client | pg (no ORM) |
| Caching | lru-cache (response + preferences) |
| Testing | Vitest |
| Validation | Zod |
| Markdown parsing | gray-matter + remark |
- Docker Desktop or OrbStack
- pnpm (`corepack enable && corepack prepare pnpm@latest --activate`)
- Caddy reverse proxy running on the host (see Caddy setup below)
```sh
git clone <repo-url>
cd knowledge-network
pnpm install
cp .env.example .env
```

Edit `.env` and fill in your API keys:
| Variable | Where to get it |
|---|---|
| `VOYAGE_API_KEY` | dash.voyageai.com — sign up and create an API key |
| `GITHUB_TOKEN` | github.com/settings/tokens — classic token with `repo` (read) and `read:org` scopes |
| `WEBHOOK_SECRET` | Any random string (secures the indexing webhook) |
| `GITHUB_CLIENT_ID` | GitHub OAuth App — see docs/guides/oauth-setup.md |
| `GITHUB_CLIENT_SECRET` | GitHub OAuth App — see docs/guides/oauth-setup.md |
| `TOKEN_ENCRYPTION_KEY` | Run: `node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"` |
The other variables (DATABASE_URL, MANIFEST_PATH, PORT, NODE_ENV, OAUTH_ISSUER, etc.) have working defaults for local development. See .env.example for the full list.
```sh
docker compose up -d
```

This starts two containers:
- knowledge-network-dev — Node.js 22 app server (auto-installs dependencies, runs migrations on startup)
- knowledge-network-db — PostgreSQL 17 with pgvector
The app is available at https://knowledge-network.localhost (via Caddy).
First startup takes a minute or two while it installs dependencies inside the container. Check progress with:
```sh
docker compose logs app --tail 20
```

You should see `Knowledge Network MCP server listening on port 4000` when it's ready.
```sh
docker compose exec app pnpm seed
```

This indexes the 6 seed documents in `seed/company/` into the database. Requires a valid `VOYAGE_API_KEY`.
```sh
# Health check
curl -sk https://knowledge-network.localhost/health

# Run unit tests
docker compose exec app pnpm test
```

src/
├── index.ts # MCP server + Express entry point
├── types.ts # Shared TypeScript types (Chunk, Frontmatter, AuthContext, etc.)
├── manifest.ts # Repo manifest loader (repos.yaml)
├── seed.ts # Seeds the database from seed/ files
├── db/
│ ├── schema.sql # PostgreSQL schema (pgvector, chunks, repos, sessions, etc.)
│ ├── connection.ts # Database connection pool
│ ├── migrate.ts # Schema migration runner
│ └── migrations/ # Incremental SQL migrations
├── indexing/
│ ├── parser.ts # Markdown frontmatter extraction (gray-matter + Zod)
│ ├── chunker.ts # Splits documents by ## sections
│ ├── embedder.ts # Voyage AI embedding client
│ └── indexer.ts # Orchestrator: parse -> chunk -> embed -> upsert
├── search/
│ ├── hybrid-search.ts # pgvector + tsvector with RRF merging
│ └── cache.ts # LRU response cache (1hr TTL)
├── tools/
│ ├── get-knowledge.ts # Search the knowledge base
│ ├── get-alignment-context.ts # Find relevant company values
│ ├── get-preferences.ts # Fetch user preferences from GitHub
│ └── list-sources.ts # List available knowledge repos
├── auth/
│ ├── github.ts # GitHub token verification + access control
│ ├── middleware.ts # Unified auth middleware (OAuth + PAT)
│ └── crypto.ts # Token hashing + encryption helpers
├── oauth/
│ ├── metadata.ts # .well-known discovery endpoints
│ ├── register.ts # Dynamic client registration
│ ├── authorize.ts # OAuth authorize (redirects to GitHub)
│ ├── callback.ts # GitHub OAuth callback handler
│ ├── token.ts # Token exchange and refresh
│ └── revoke.ts # Token revocation
├── admin/
│ ├── repos.ts # Repo listing and re-indexing
│ ├── cache.ts # Cache management
│ ├── sessions.ts # Session listing and revocation
│ └── analytics.ts # Usage analytics and query log
├── webhook/
│ └── index-handler.ts # POST /api/index — HMAC-secured webhook
├── telemetry/
│ ├── logger.ts # Query telemetry logging
│ └── structured-logger.ts # Structured JSON logging (pino)
├── validation/
│ └── validate-frontmatter.ts # Frontmatter validation script
└── __tests__/ # Unit tests (~89 tests)
├── parser.test.ts
├── chunker.test.ts
├── embedder.test.ts
├── manifest.test.ts
├── cache.test.ts
├── webhook.test.ts
├── tool-formatting.test.ts
├── validation.test.ts
├── middleware.test.ts
├── crypto.test.ts
├── oauth-register.test.ts
├── oauth-token.test.ts
├── oauth-revoke.test.ts
├── admin.test.ts
├── analytics.test.ts
├── query-preprocessing.test.ts
├── recursive-chunking.test.ts
└── fixtures/ # Test fixtures (sample markdown files)
docs/
├── strategy/
│ ├── vision-and-problem-statement.md
│ ├── user-research-and-personas.md
│ ├── jobs-to-be-done.md
│ ├── business-case.md
│ └── product-strategy-and-roadmap.md
├── requirements/
│ ├── PRD.md
│ └── technical-review-2026-03-31.md
├── guides/
│ ├── oauth-setup.md
│ ├── admin-guide.md
│ └── content-schema.md
└── implementation-plan/
seed/ # Seed knowledge documents
├── repos.yaml # Repo manifest
└── company/ # Company knowledge files
├── mission-values.md
├── positioning.md
├── ethical-stances.md
├── higher-ed-landscape.md
└── compliance/
├── ferpa.md
└── accessibility.md
All commands run inside Docker unless noted otherwise.
| Command | What it does |
|---|---|
| `docker compose up -d` | Start the dev environment |
| `docker compose down` | Stop all containers |
| `docker compose logs app --tail 30` | View app logs |
| `docker compose exec app pnpm test` | Run unit tests (~89 tests) |
| `docker compose exec app pnpm seed` | Index seed documents into the database |
| `docker compose exec app pnpm migrate` | Run database migrations |
| `docker compose exec app pnpm validate seed` | Validate frontmatter in seed files |
| `docker compose exec app pnpm typecheck` | Run TypeScript type checking |
Do not run `pnpm dev` directly on the host. The app requires Node 22 and a PostgreSQL database, both of which are provided by Docker.
The Knowledge Network isn't just a place to look things up — it's a living system where knowledge flows in from many sources, passes through quality gates, and becomes shared organizational intelligence.
People on the team build their own creative systems for discovering useful knowledge — monitoring CI/CD pipelines, scanning meeting transcripts, comparing AI-drafted emails to what they actually send, running retrospectives on development decisions. The Knowledge Network doesn't standardize how people discover insights. It standardizes the last mile: once you have something worth capturing, the capture_knowledge MCP tool formats it, applies company terminology, and routes it to the right place.
Captured knowledge passes through two review layers:
- Automated critic — Checks length (optimized for token efficiency when served via MCP), terminology consistency (e.g., "customers" not "clients", "faculty members" not "teachers"), format compliance, and contradiction detection against existing knowledge.
- Human review — Domain experts review and approve contributions before they merge into the shared knowledge base.
The onboard MCP tool introduces new users to the Knowledge Network and writes system instructions to their AI assistant's config file (CLAUDE.md, .cursorrules, etc.) so that Digication-related questions are automatically routed to the MCP instead of relying on the LLM's general training data.
Knowledge is organized into multiple repos by function — company-wide values, sales, engineering, HR, etc. — each maintained by the team that knows the domain best. Personal preference repos give individuals a private space for communication style and working preferences, with clear boundaries between personal content and company IP.
The system is designed to support self-improvement loops where AI agents learn from their own output and from human corrections. Examples include comparing draft emails to sent versions, extracting patterns from meeting transcripts, and capturing engineering insights from CI/CD results. A patterns library documents reusable setups that people can adapt.
For the full lifecycle design, see docs/implementation-plan/knowledge-network-lifecycle/.
The MCP server supports two authentication methods:
- OAuth 2.1 (primary) — Used by claude.ai, Claude Desktop, CoWork, and other browser-based MCP clients. Users log in via GitHub and are connected automatically. No tokens to copy.
- GitHub PAT (backward compatible) — Used by Claude Code, Cursor, and CLI-based tools. Requires a personal access token passed in the `Authorization` header.
Both methods verify the user's identity through GitHub and determine which knowledge repos they can access based on their GitHub permissions.
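The shared shape of that check can be sketched as follows. This is illustrative only: the function names are hypothetical, and the real middleware in `src/auth/middleware.ts` also handles OAuth sessions, caching, and repo-level access control.

```typescript
// Extract a bearer token from an Authorization header value.
// OAuth access tokens and GitHub PATs both arrive in this form.
function parseBearerToken(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match ? match[1] : null;
}

// Verify the token by asking GitHub who it belongs to (requires network access).
// Returns the GitHub login on success, or null for an invalid or expired token.
async function lookupGitHubLogin(token: string): Promise<string | null> {
  const res = await fetch("https://api.github.com/user", {
    headers: {
      Authorization: `Bearer ${token}`,
      Accept: "application/vnd.github+json",
    },
  });
  if (!res.ok) return null; // 401 => bad token
  const user = (await res.json()) as { login: string };
  return user.login;
}

const token = parseBearerToken("Bearer ghp_example123");
```

Delegating identity to GitHub means the server never stores passwords; it only maps a verified GitHub login to the repos that user may read.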
Claude.ai and CoWork use the OAuth flow. Users don't need to manage tokens -- they authorize once through GitHub and they're connected.
- Go to claude.ai > Settings > Connectors > Add custom connector
- Enter the server URL: `https://knowledge-network.up.railway.app/mcp`
- Claude discovers the OAuth endpoints automatically and prompts you to authorize
- Click "Authorize" and log in with your GitHub account
- You're connected! Try asking: "What is Digication's mission?"
Claude Desktop and CoWork work the same way -- add the server URL in MCP settings, and OAuth handles the rest.
For detailed setup instructions (creating OAuth Apps, environment variables, troubleshooting), see docs/guides/oauth-setup.md.
Claude Code uses a GitHub PAT (personal access token) for authentication:
- Create a token at github.com/settings/tokens with `repo` (read) and `read:org` scopes
- Add the MCP server:

```sh
claude mcp add knowledge-network https://knowledge-network.up.railway.app/mcp --transport http --header "Authorization: Bearer YOUR_GITHUB_TOKEN"
```

Or add it manually to `~/.claude.json`:
```json
{
  "mcpServers": {
    "knowledge-network": {
      "type": "http",
      "url": "https://knowledge-network.up.railway.app/mcp",
      "headers": {
        "Authorization": "Bearer ${GITHUB_TOKEN}"
      }
    }
  }
}
```

If using the `${GITHUB_TOKEN}` environment variable approach, add this to your `~/.zshrc` (or `~/.bashrc`):
```sh
export GITHUB_TOKEN=your-token-here
```

After setup, restart Claude Code. You can test by asking: "What is Digication's FERPA compliance policy?"
Any MCP client that supports Streamable HTTP can connect. Point it at:
- URL: `https://knowledge-network.up.railway.app/mcp`
- Auth header: `Authorization: Bearer <your-github-pat>`
Admin endpoints (analytics, session management, re-indexing) require membership in a GitHub team. See docs/guides/admin-guide.md for setup and usage.
| Endpoint | Method | Auth | Description |
|---|---|---|---|
| `/health` | GET | None | Health check — returns status, version, repo count |
| `/.well-known/oauth-protected-resource` | GET | None | OAuth resource metadata (discovery) |
| `/.well-known/oauth-authorization-server` | GET | None | OAuth server metadata (discovery) |
| `/oauth/register` | POST | None | Dynamic client registration |
| `/oauth/authorize` | GET | None | Start OAuth flow (redirects to GitHub) |
| `/oauth/callback` | GET | None | GitHub OAuth callback |
| `/oauth/token` | POST | None | Token exchange and refresh |
| `/oauth/revoke` | POST | Bearer | Revoke a token |
| `/mcp` | POST | Bearer | MCP protocol endpoint (OAuth or PAT) |
| `/api/index` | POST | HMAC | Webhook for GitHub Actions to trigger re-indexing |
| `/admin/repos` | GET | Admin | List repos with indexing status |
| `/admin/repos/:name/reindex` | POST | Admin | Trigger repo re-index |
| `/admin/cache/clear` | POST | Admin | Clear search response cache |
| `/admin/analytics` | GET | Admin | Usage analytics summary |
| `/admin/analytics/queries` | GET | Admin | Recent query log |
| `/admin/sessions` | GET | Admin | List active OAuth sessions |
| `/admin/sessions/:id` | DELETE | Admin | Revoke an OAuth session |
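The HMAC protection on `/api/index` can be sketched like this. It follows the common GitHub-webhook convention of a `sha256=<hex>` signature over the raw request body; the exact header name and signature format the server expects are assumptions, not taken from the source.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the expected signature for a raw request body (GitHub-style convention).
function sign(secret: string, body: string): string {
  return "sha256=" + createHmac("sha256", secret).update(body).digest("hex");
}

// Constant-time comparison avoids leaking signature bytes through timing differences.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected = Buffer.from(sign(secret, body));
  const received = Buffer.from(signature);
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const body = JSON.stringify({ repo: "company", files: ["compliance/ferpa.md"] });
const ok = verifySignature("my-secret", body, sign("my-secret", body));
```

Signing the raw body (rather than a parsed form) matters: any re-serialization on either side would change the bytes and break verification.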
Knowledge files are markdown with YAML frontmatter. Only `domain` and `owner` are required:
```markdown
---
domain: compliance       # Required: knowledge area
owner: legal-team        # Required: who maintains this
tags: [ferpa, privacy]   # Optional: searchable tags
audience: [all]          # Optional: who this is for (default: all)
classification: internal # Optional: public | internal | restricted (default: internal)
confidence: current      # Optional: established | current | evolving | exploratory
---

# Document Title

## Section One

Content here becomes a searchable chunk.

## Section Two

Each ## section is indexed separately with its own embedding.
```

See docs/guides/content-schema.md for the full specification.
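A minimal check of the two required fields can be sketched by hand. The real parser uses gray-matter + Zod (see `src/indexing/parser.ts`); this hand-rolled version only handles simple `key: value` lines and exists purely to show the contract.

```typescript
// Extract frontmatter fields and enforce the required ones (illustrative sketch;
// not the gray-matter + Zod implementation the project actually uses).
function parseFrontmatter(doc: string): Record<string, string> | null {
  const match = /^---\n([\s\S]*?)\n---/.exec(doc);
  if (!match) return null;
  const fields: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    // "key: value" with an optional trailing "# comment"
    const m = /^(\w+):\s*(.+?)\s*(?:#.*)?$/.exec(line);
    if (m) fields[m[1]] = m[2];
  }
  // Only `domain` and `owner` are required by the schema.
  if (!fields.domain || !fields.owner) return null;
  return fields;
}

const meta = parseFrontmatter("---\ndomain: compliance\nowner: legal-team\n---\n# Doc");
```

A file missing either required field is rejected before indexing, so broken metadata never reaches the search index.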
The dev environment uses a shared Caddy reverse proxy for HTTPS routing at *.localhost domains. If Caddy is not already set up on your machine:
- Create `~/caddy/docker-compose.yml`:
```yaml
services:
  caddy:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    container_name: caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - caddy_data:/data
    networks:
      - web

networks:
  web:
    external: true

volumes:
  caddy_data:
```

- Create the shared network and start Caddy:
```sh
docker network create web 2>/dev/null || true
cd ~/caddy && docker compose up -d
```

After this, any Docker service with the right Caddy labels will be automatically accessible at `https://<name>.localhost`.
The app is deployed to Railway at https://knowledge-network.up.railway.app.
The Railway project ("knowledge-network") has two services:
- knowledge-network — the Node.js app, built with Nixpacks from this repo
- Postgres — managed PostgreSQL with pgvector extension
| Variable | Value | Notes |
|---|---|---|
| `DATABASE_URL` | (auto-injected by Railway) | Points to the managed Postgres instance |
| `PORT` | (auto-injected by Railway) | Do not set manually |
| `VOYAGE_API_KEY` | (secret) | Voyage AI API key |
| `GITHUB_TOKEN` | (secret) | GitHub PAT with repo read access |
| `WEBHOOK_SECRET` | (secret) | Shared secret for webhook HMAC verification |
| `MANIFEST_PATH` | `seed/repos.yaml` | Path to the repo manifest |
| `NODE_ENV` | `production` | Enables SSL for database connections |
| `GITHUB_CLIENT_ID` | (secret) | GitHub OAuth App client ID (Phase 2) |
| `GITHUB_CLIENT_SECRET` | (secret) | GitHub OAuth App client secret (Phase 2) |
| `TOKEN_ENCRYPTION_KEY` | (secret) | 64-char hex key for token encryption (Phase 2) |
| `OAUTH_ISSUER` | `https://knowledge-network.up.railway.app` | Must match the public URL (Phase 2) |
| `GITHUB_ORG` | `Digication` | GitHub org for admin team checks (Phase 2) |
| `ADMIN_GITHUB_TEAM` | `knowledge-network-admins` | GitHub team slug for admin access (Phase 2) |
| `LOG_LEVEL` | `info` | Options: debug, info, warn, error (Phase 2) |
Railway is configured to deploy from the main branch. To deploy manually:
```sh
railway up
```

Or push to main if auto-deploy is connected via the Railway dashboard.
Deployment is configured in railway.json:
- Builder: Nixpacks (auto-detects Node.js/pnpm)
- Build: runs `pnpm build` (TypeScript compile + copy schema.sql)
- Start: `node dist/index.js`
- Health check: polls `/health`
- Restart policy: restarts on failure (up to 3 times)
- Install the Railway CLI: `npm install -g @railway/cli`
- Log in: `railway login`
- Initialize: `railway init` (from the project directory)
- Add PostgreSQL: `railway add --database postgres`
- Set environment variables:

```sh
railway variables set VOYAGE_API_KEY=<key>
railway variables set GITHUB_TOKEN=<token>
railway variables set WEBHOOK_SECRET=<secret>
railway variables set MANIFEST_PATH=seed/repos.yaml
railway variables set NODE_ENV=production
railway variables set GITHUB_CLIENT_ID=<oauth-client-id>
railway variables set GITHUB_CLIENT_SECRET=<oauth-client-secret>
railway variables set TOKEN_ENCRYPTION_KEY=<64-char-hex>
railway variables set OAUTH_ISSUER=https://your-app.up.railway.app
railway variables set GITHUB_ORG=<your-github-org>
railway variables set ADMIN_GITHUB_TEAM=<team-slug>
railway variables set LOG_LEVEL=info
```

- Set `DATABASE_URL` to the PostgreSQL connection string from Railway
- Deploy: `railway up`
- Generate a public domain: `railway domain`
Two GitHub Actions workflows are included:
| Workflow | Trigger | What it does |
|---|---|---|
| `.github/workflows/validate-content.yml` | PRs touching `.md` files | Validates frontmatter against the schema |
| `.github/workflows/index-on-merge.yml` | Called by knowledge repos on merge | Sends changed files to the server for re-indexing |
In each knowledge repo that should be indexed:
- Add a workflow that calls `index-on-merge.yml`
- Set these GitHub Actions secrets:
  - `KN_WEBHOOK_SECRET` — same value as the server's `WEBHOOK_SECRET`
  - `KN_MCP_SERVER_URL` — the server URL (e.g., `https://knowledge-network.up.railway.app`)