Knowledge Network

A federated knowledge system served through an MCP (Model Context Protocol) server. Multiple GitHub repos contribute structured markdown knowledge files. A central MCP server indexes all sources into PostgreSQL + pgvector, and exposes the merged knowledge to any LLM tool that supports MCP.

Beyond search and retrieval, the Knowledge Network includes a knowledge lifecycle system — tools for onboarding, capturing knowledge from any source, automated quality review, and self-improvement loops that let knowledge grow organically from daily work.

Architecture

MCP Clients (claude.ai, Claude Desktop, Claude Code, Cursor, etc.)
        |
        | Streamable HTTP + OAuth 2.1 / PAT auth
        v
  MCP Server (Node.js / TypeScript)
  ├── get_knowledge(query)        — hybrid semantic + keyword search
  ├── get_alignment_context(text) — find relevant company values/stances
  ├── get_preferences(context)    — user preferences from GitHub
  ├── list_sources()              — available knowledge repos
  ├── onboard()                   — first-time setup + system instructions (planned)
  └── capture_knowledge(content)  — format and route knowledge contributions (planned)
        |                    |
  PostgreSQL + pgvector    GitHub API + OAuth
  (chunks, embeddings,     (identity, file fetch,
   sessions, analytics)     access control)

How indexing works

  1. A PR merges to main in a knowledge repo
  2. A GitHub Action detects changed .md files
  3. The Action calls POST /api/index with the file list
  4. The server fetches each file, extracts frontmatter, splits by ## sections
  5. Each section is embedded via Voyage AI (1024-dim vectors)
  6. Chunks are upserted into pgvector with deterministic IDs
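Step 6's deterministic IDs are what make re-indexing idempotent: the same section always maps to the same row, so a re-index replaces chunks instead of duplicating them. A minimal sketch of one possible ID scheme (the server's actual recipe is not specified in this README — this hash of repo, path, and heading is an assumption):

```typescript
import { createHash } from "node:crypto";

// Hypothetical ID scheme: sha256 over repo, file path, and section heading.
// Re-running the indexer on unchanged input yields the same ID, so the
// pgvector upsert overwrites the existing chunk rather than inserting a copy.
function chunkId(repo: string, path: string, heading: string): string {
  return createHash("sha256")
    .update(`${repo}:${path}:${heading}`)
    .digest("hex");
}

const a = chunkId("company", "compliance/ferpa.md", "Directory Information");
const b = chunkId("company", "compliance/ferpa.md", "Directory Information");
console.log(a === b); // same inputs, same ID
```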

How search works

  1. An MCP client calls get_knowledge("FERPA compliance")
  2. The query is embedded via Voyage AI
  3. Two searches run in parallel: pgvector cosine similarity + PostgreSQL full-text search
  4. Results are merged using Reciprocal Rank Fusion (RRF)
  5. Top results are returned with full content and metadata
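The RRF merge in step 4 can be sketched as follows. Each input list is an array of chunk IDs ordered by rank; k = 60 is the conventional smoothing constant (the server's actual constant and tie-breaking behavior are assumptions):

```typescript
// Minimal Reciprocal Rank Fusion: each document scores 1 / (k + rank + 1)
// per list it appears in, and the fused ranking sorts by total score.
function rrfMerge(resultLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of resultLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// A chunk ranked highly by both searches rises to the top of the fused list:
const merged = rrfMerge([
  ["ferpa-overview", "privacy-policy", "accessibility"], // vector results
  ["ferpa-overview", "mission", "privacy-policy"],       // full-text results
]);
console.log(merged[0]); // "ferpa-overview"
```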

Tech stack

| Layer | Technology |
| --- | --- |
| Language | TypeScript (strict mode, ESM) |
| Runtime | Node.js 22 |
| Package manager | pnpm |
| MCP SDK | @modelcontextprotocol/sdk |
| HTTP framework | Express 5 |
| Database | PostgreSQL 17 + pgvector |
| Embeddings | Voyage AI voyage-3 (1024 dimensions) |
| DB client | pg (no ORM) |
| Caching | lru-cache (response + preferences) |
| Testing | Vitest |
| Validation | Zod |
| Markdown parsing | gray-matter + remark |

Prerequisites

  • Docker Desktop or OrbStack
  • pnpm (corepack enable && corepack prepare pnpm@latest --activate)
  • Caddy reverse proxy running on the host (see Caddy setup below)

Getting started

1. Clone and install

git clone <repo-url>
cd knowledge-network
pnpm install

2. Configure environment

cp .env.example .env

Edit .env and fill in your API keys:

| Variable | Where to get it |
| --- | --- |
| VOYAGE_API_KEY | dash.voyageai.com — sign up and create an API key |
| GITHUB_TOKEN | github.com/settings/tokens — classic token with repo (read) and read:org scopes |
| WEBHOOK_SECRET | Any random string (secures the indexing webhook) |
| GITHUB_CLIENT_ID | GitHub OAuth App — see docs/guides/oauth-setup.md |
| GITHUB_CLIENT_SECRET | GitHub OAuth App — see docs/guides/oauth-setup.md |
| TOKEN_ENCRYPTION_KEY | Run: node -e "console.log(require('crypto').randomBytes(32).toString('hex'))" |

The other variables (DATABASE_URL, MANIFEST_PATH, PORT, NODE_ENV, OAUTH_ISSUER, etc.) have working defaults for local development. See .env.example for the full list.

3. Start the dev environment

docker compose up -d

This starts two containers:

  • knowledge-network-dev — Node.js 22 app server (auto-installs dependencies, runs migrations on startup)
  • knowledge-network-db — PostgreSQL 17 with pgvector

The app is available at https://knowledge-network.localhost (via Caddy).

First startup takes a minute or two while it installs dependencies inside the container. Check progress with:

docker compose logs app --tail 20

You should see Knowledge Network MCP server listening on port 4000 when it's ready.

4. Seed the knowledge base

docker compose exec app pnpm seed

This indexes the 6 seed documents in seed/company/ into the database. Requires a valid VOYAGE_API_KEY.

5. Verify everything works

# Health check
curl -sk https://knowledge-network.localhost/health

# Run unit tests
docker compose exec app pnpm test

Project structure

src/
├── index.ts                  # MCP server + Express entry point
├── types.ts                  # Shared TypeScript types (Chunk, Frontmatter, AuthContext, etc.)
├── manifest.ts               # Repo manifest loader (repos.yaml)
├── seed.ts                   # Seeds the database from seed/ files
├── db/
│   ├── schema.sql            # PostgreSQL schema (pgvector, chunks, repos, sessions, etc.)
│   ├── connection.ts         # Database connection pool
│   ├── migrate.ts            # Schema migration runner
│   └── migrations/           # Incremental SQL migrations
├── indexing/
│   ├── parser.ts             # Markdown frontmatter extraction (gray-matter + Zod)
│   ├── chunker.ts            # Splits documents by ## sections
│   ├── embedder.ts           # Voyage AI embedding client
│   └── indexer.ts            # Orchestrator: parse -> chunk -> embed -> upsert
├── search/
│   ├── hybrid-search.ts      # pgvector + tsvector with RRF merging
│   └── cache.ts              # LRU response cache (1hr TTL)
├── tools/
│   ├── get-knowledge.ts      # Search the knowledge base
│   ├── get-alignment-context.ts  # Find relevant company values
│   ├── get-preferences.ts    # Fetch user preferences from GitHub
│   └── list-sources.ts       # List available knowledge repos
├── auth/
│   ├── github.ts             # GitHub token verification + access control
│   ├── middleware.ts          # Unified auth middleware (OAuth + PAT)
│   └── crypto.ts             # Token hashing + encryption helpers
├── oauth/
│   ├── metadata.ts           # .well-known discovery endpoints
│   ├── register.ts           # Dynamic client registration
│   ├── authorize.ts          # OAuth authorize (redirects to GitHub)
│   ├── callback.ts           # GitHub OAuth callback handler
│   ├── token.ts              # Token exchange and refresh
│   └── revoke.ts             # Token revocation
├── admin/
│   ├── repos.ts              # Repo listing and re-indexing
│   ├── cache.ts              # Cache management
│   ├── sessions.ts           # Session listing and revocation
│   └── analytics.ts          # Usage analytics and query log
├── webhook/
│   └── index-handler.ts      # POST /api/index — HMAC-secured webhook
├── telemetry/
│   ├── logger.ts             # Query telemetry logging
│   └── structured-logger.ts  # Structured JSON logging (pino)
├── validation/
│   └── validate-frontmatter.ts   # Frontmatter validation script
└── __tests__/                # Unit tests (~89 tests)
    ├── parser.test.ts
    ├── chunker.test.ts
    ├── embedder.test.ts
    ├── manifest.test.ts
    ├── cache.test.ts
    ├── webhook.test.ts
    ├── tool-formatting.test.ts
    ├── validation.test.ts
    ├── middleware.test.ts
    ├── crypto.test.ts
    ├── oauth-register.test.ts
    ├── oauth-token.test.ts
    ├── oauth-revoke.test.ts
    ├── admin.test.ts
    ├── analytics.test.ts
    ├── query-preprocessing.test.ts
    ├── recursive-chunking.test.ts
    └── fixtures/             # Test fixtures (sample markdown files)

docs/
├── strategy/
│   ├── vision-and-problem-statement.md
│   ├── user-research-and-personas.md
│   ├── jobs-to-be-done.md
│   ├── business-case.md
│   └── product-strategy-and-roadmap.md
├── requirements/
│   ├── PRD.md
│   └── technical-review-2026-03-31.md
├── guides/
│   ├── oauth-setup.md
│   ├── admin-guide.md
│   └── content-schema.md
└── implementation-plan/

seed/                         # Seed knowledge documents
├── repos.yaml                # Repo manifest
└── company/                  # Company knowledge files
    ├── mission-values.md
    ├── positioning.md
    ├── ethical-stances.md
    ├── higher-ed-landscape.md
    └── compliance/
        ├── ferpa.md
        └── accessibility.md

Common commands

All commands run inside Docker unless noted otherwise.

| Command | What it does |
| --- | --- |
| docker compose up -d | Start the dev environment |
| docker compose down | Stop all containers |
| docker compose logs app --tail 30 | View app logs |
| docker compose exec app pnpm test | Run unit tests (~89 tests) |
| docker compose exec app pnpm seed | Index seed documents into the database |
| docker compose exec app pnpm migrate | Run database migrations |
| docker compose exec app pnpm validate seed | Validate frontmatter in seed files |
| docker compose exec app pnpm typecheck | Run TypeScript type checking |

Do not run pnpm dev directly on the host. The app requires Node 22 and a PostgreSQL database, both of which are provided by Docker.

Knowledge lifecycle

The Knowledge Network isn't just a place to look things up — it's a living system where knowledge flows in from many sources, passes through quality gates, and becomes shared organizational intelligence.

How knowledge flows in

People on the team build their own creative systems for discovering useful knowledge — monitoring CI/CD pipelines, scanning meeting transcripts, comparing AI-drafted emails to what they actually send, running retrospectives on development decisions. The Knowledge Network doesn't standardize how people discover insights. It standardizes the last mile: once you have something worth capturing, the capture_knowledge MCP tool formats it, applies company terminology, and routes it to the right place.

Quality gates

Captured knowledge passes through two review layers:

  1. Automated critic — Checks length (optimized for token efficiency when served via MCP), terminology consistency (e.g., "customers" not "clients", "faculty members" not "teachers"), format compliance, and contradiction detection against existing knowledge.
  2. Human review — Domain experts review and approve contributions before they merge into the shared knowledge base.

Onboarding

The onboard MCP tool introduces new users to the Knowledge Network and writes system instructions to their AI assistant's config file (CLAUDE.md, .cursorrules, etc.) so that Digication-related questions are automatically routed to the MCP instead of relying on the LLM's general training data.

Multi-repo architecture

Knowledge is organized into multiple repos by function — company-wide values, sales, engineering, HR, etc. — each maintained by the team that knows the domain best. Personal preference repos give individuals a private space for communication style and working preferences, with clear boundaries between personal content and company IP.

Self-improvement loops

The system is designed to support self-improvement loops where AI agents learn from their own output and from human corrections. Examples include comparing draft emails to sent versions, extracting patterns from meeting transcripts, and capturing engineering insights from CI/CD results. A patterns library documents reusable setups that people can adapt.

For the full lifecycle design, see docs/implementation-plan/knowledge-network-lifecycle/.

Authentication

The MCP server supports two authentication methods:

  1. OAuth 2.1 (primary) — Used by claude.ai, Claude Desktop, CoWork, and other browser-based MCP clients. Users log in via GitHub and are connected automatically. No tokens to copy.
  2. GitHub PAT (backward compatible) — Used by Claude Code, Cursor, and CLI-based tools. Requires a personal access token passed in the Authorization header.

Both methods verify the user's identity through GitHub and determine which knowledge repos they can access based on their GitHub permissions.
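As a rough illustration of how the unified middleware in src/auth/middleware.ts might tell the two methods apart, a classifier can use GitHub's documented token prefixes (ghp_ for classic PATs, github_pat_ for fine-grained PATs, gho_ for OAuth user tokens). The server's actual dispatch logic is not shown in this README — this heuristic is an assumption:

```typescript
// Sketch: classify an incoming Authorization header by token prefix.
// Anything that is a well-formed bearer token but not a GitHub-prefixed
// token is treated here as a server-issued OAuth token.
type TokenKind = "github-pat" | "github-oauth" | "server-oauth" | "invalid";

function classifyAuthHeader(header: string | undefined): TokenKind {
  const match = header?.match(/^Bearer\s+(\S+)$/);
  if (!match) return "invalid";
  const token = match[1];
  if (token.startsWith("ghp_") || token.startsWith("github_pat_")) {
    return "github-pat";
  }
  if (token.startsWith("gho_")) return "github-oauth";
  return "server-oauth";
}

console.log(classifyAuthHeader("Bearer ghp_abc123")); // "github-pat"
console.log(classifyAuthHeader(undefined));           // "invalid"
```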

Connecting from claude.ai and CoWork

Claude.ai and CoWork use the OAuth flow. Users don't need to manage tokens: they authorize once through GitHub and stay connected.

  1. Go to claude.ai > Settings > Connectors > Add custom connector
  2. Enter the server URL: https://knowledge-network.up.railway.app/mcp
  3. Claude discovers the OAuth endpoints automatically and prompts you to authorize
  4. Click "Authorize" and log in with your GitHub account
  5. You're connected! Try asking: "What is Digication's mission?"

Claude Desktop and CoWork work the same way: add the server URL in MCP settings, and OAuth handles the rest.

For detailed setup instructions (creating OAuth Apps, environment variables, troubleshooting), see docs/guides/oauth-setup.md.

Connecting Claude Code

Claude Code uses a GitHub PAT (personal access token) for authentication:

  1. Create a token at github.com/settings/tokens with repo (read) and read:org scopes
  2. Add the MCP server:
claude mcp add knowledge-network https://knowledge-network.up.railway.app/mcp --transport http --header "Authorization: Bearer YOUR_GITHUB_TOKEN"

Or add it manually to ~/.claude.json:

{
  "mcpServers": {
    "knowledge-network": {
      "type": "http",
      "url": "https://knowledge-network.up.railway.app/mcp",
      "headers": {
        "Authorization": "Bearer ${GITHUB_TOKEN}"
      }
    }
  }
}

If using the ${GITHUB_TOKEN} environment variable approach, add this to your ~/.zshrc (or ~/.bashrc):

export GITHUB_TOKEN=your-token-here

After setup, restart Claude Code. You can test by asking: "What is Digication's FERPA compliance policy?"

Connecting other MCP clients (Cursor, etc.)

Any MCP client that supports Streamable HTTP can connect. Point it at:

  • URL: https://knowledge-network.up.railway.app/mcp
  • Auth header: Authorization: Bearer <your-github-pat>

Admin access

Admin endpoints (analytics, session management, re-indexing) require membership in a GitHub team. See docs/guides/admin-guide.md for setup and usage.

Endpoints

| Endpoint | Method | Auth | Description |
| --- | --- | --- | --- |
| /health | GET | None | Health check — returns status, version, repo count |
| /.well-known/oauth-protected-resource | GET | None | OAuth resource metadata (discovery) |
| /.well-known/oauth-authorization-server | GET | None | OAuth server metadata (discovery) |
| /oauth/register | POST | None | Dynamic client registration |
| /oauth/authorize | GET | None | Start OAuth flow (redirects to GitHub) |
| /oauth/callback | GET | None | GitHub OAuth callback |
| /oauth/token | POST | None | Token exchange and refresh |
| /oauth/revoke | POST | Bearer | Revoke a token |
| /mcp | POST | Bearer | MCP protocol endpoint (OAuth or PAT) |
| /api/index | POST | HMAC | Webhook for GitHub Actions to trigger re-indexing |
| /admin/repos | GET | Admin | List repos with indexing status |
| /admin/repos/:name/reindex | POST | Admin | Trigger repo re-index |
| /admin/cache/clear | POST | Admin | Clear search response cache |
| /admin/analytics | GET | Admin | Usage analytics summary |
| /admin/analytics/queries | GET | Admin | Recent query log |
| /admin/sessions | GET | Admin | List active OAuth sessions |
| /admin/sessions/:id | DELETE | Admin | Revoke an OAuth session |
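The HMAC check on /api/index can be sketched as follows, in the style of GitHub's X-Hub-Signature-256 scheme. The exact header name and payload shape the server expects are assumptions here; the constant-time comparison is the important part:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify a webhook body against a "sha256=<hex>" signature, comparing in
// constant time to avoid leaking how many leading bytes matched.
function verifySignature(secret: string, body: string, signature: string): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(body).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}

const secret = "example-webhook-secret";
const body = JSON.stringify({ files: ["company/mission-values.md"] });
const sig = "sha256=" + createHmac("sha256", secret).update(body).digest("hex");
console.log(verifySignature(secret, body, sig));         // true
console.log(verifySignature(secret, body, "sha256=00")); // false
```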

Knowledge file format

Knowledge files are markdown with YAML frontmatter. Only domain and owner are required:

---
domain: compliance          # Required: knowledge area
owner: legal-team           # Required: who maintains this
tags: [ferpa, privacy]      # Optional: searchable tags
audience: [all]             # Optional: who this is for (default: all)
classification: internal    # Optional: public | internal | restricted (default: internal)
confidence: current         # Optional: established | current | evolving | exploratory
---

# Document Title

## Section One

Content here becomes a searchable chunk.

## Section Two

Each ## section is indexed separately with its own embedding.

See docs/guides/content-schema.md for the full specification.
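The section-based chunking described above can be sketched as follows. This is a simplified model: the real chunker in src/indexing/chunker.ts also handles frontmatter and (per the test suite) recursive splitting of oversized sections, which this sketch omits:

```typescript
// Split a markdown body on level-2 headings so each "## Section" becomes
// one chunk; anything before the first ## (title, preamble) is dropped.
interface Chunk {
  heading: string;
  content: string;
}

function chunkBySections(markdown: string): Chunk[] {
  const parts = markdown.split(/^## /m).slice(1);
  return parts.map((part) => {
    const [headingLine, ...rest] = part.split("\n");
    return { heading: headingLine.trim(), content: rest.join("\n").trim() };
  });
}

const doc = `# Document Title

## Section One

Content here becomes a searchable chunk.

## Section Two

Each section is indexed separately.`;

console.log(chunkBySections(doc).map((c) => c.heading)); // ["Section One", "Section Two"]
```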

Caddy setup

The dev environment uses a shared Caddy reverse proxy for HTTPS routing at *.localhost domains. If Caddy is not already set up on your machine:

  1. Create ~/caddy/docker-compose.yml:
services:
  caddy:
    image: lucaslorentz/caddy-docker-proxy:ci-alpine
    container_name: caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - caddy_data:/data
    networks:
      - web

networks:
  web:
    external: true

volumes:
  caddy_data:
  2. Create the shared network and start Caddy:
docker network create web 2>/dev/null || true
cd ~/caddy && docker compose up -d

After this, any Docker service with the right Caddy labels will be automatically accessible at https://<name>.localhost.
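For reference, a service opts in by joining the web network and carrying caddy-docker-proxy labels. The fragment below is illustrative only; the authoritative labels live in this repo's docker-compose.yml:

```yaml
services:
  app:
    # ...
    networks:
      - web
    labels:
      caddy: knowledge-network.localhost
      caddy.reverse_proxy: "{{upstreams 4000}}"

networks:
  web:
    external: true
```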

Production deployment (Railway)

The app is deployed to Railway at https://knowledge-network.up.railway.app.

Services

The Railway project ("knowledge-network") has two services:

  • knowledge-network — the Node.js app, built with Nixpacks from this repo
  • Postgres — managed PostgreSQL with pgvector extension

Environment variables on Railway

| Variable | Value | Notes |
| --- | --- | --- |
| DATABASE_URL | (auto-injected by Railway) | Points to the managed Postgres instance |
| PORT | (auto-injected by Railway) | Do not set manually |
| VOYAGE_API_KEY | (secret) | Voyage AI API key |
| GITHUB_TOKEN | (secret) | GitHub PAT with repo read access |
| WEBHOOK_SECRET | (secret) | Shared secret for webhook HMAC verification |
| MANIFEST_PATH | seed/repos.yaml | Path to the repo manifest |
| NODE_ENV | production | Enables SSL for database connections |
| GITHUB_CLIENT_ID | (secret) | GitHub OAuth App client ID (Phase 2) |
| GITHUB_CLIENT_SECRET | (secret) | GitHub OAuth App client secret (Phase 2) |
| TOKEN_ENCRYPTION_KEY | (secret) | 64-char hex key for token encryption (Phase 2) |
| OAUTH_ISSUER | https://knowledge-network.up.railway.app | Must match the public URL (Phase 2) |
| GITHUB_ORG | Digication | GitHub org for admin team checks (Phase 2) |
| ADMIN_GITHUB_TEAM | knowledge-network-admins | GitHub team slug for admin access (Phase 2) |
| LOG_LEVEL | info | Options: debug, info, warn, error (Phase 2) |

Deploying

Railway is configured to deploy from the main branch. To deploy manually:

railway up

Or push to main if auto-deploy is connected via the Railway dashboard.

Railway configuration

Deployment is configured in railway.json:

  • Builder: Nixpacks (auto-detects Node.js/pnpm)
  • Build: runs pnpm build (TypeScript compile + copy schema.sql)
  • Start: node dist/index.js
  • Health check: polls /health
  • Restart policy: restarts on failure (up to 3 times)

Setting up a new Railway environment

  1. Install the Railway CLI: npm install -g @railway/cli
  2. Log in: railway login
  3. Initialize: railway init (from the project directory)
  4. Add PostgreSQL: railway add --database postgres
  5. Set environment variables:
    railway variables set VOYAGE_API_KEY=<key>
    railway variables set GITHUB_TOKEN=<token>
    railway variables set WEBHOOK_SECRET=<secret>
    railway variables set MANIFEST_PATH=seed/repos.yaml
    railway variables set NODE_ENV=production
    railway variables set GITHUB_CLIENT_ID=<oauth-client-id>
    railway variables set GITHUB_CLIENT_SECRET=<oauth-client-secret>
    railway variables set TOKEN_ENCRYPTION_KEY=<64-char-hex>
    railway variables set OAUTH_ISSUER=https://your-app.up.railway.app
    railway variables set GITHUB_ORG=<your-github-org>
    railway variables set ADMIN_GITHUB_TEAM=<team-slug>
    railway variables set LOG_LEVEL=info
  6. Set DATABASE_URL to the PostgreSQL connection string from Railway
  7. Deploy: railway up
  8. Generate a public domain: railway domain

CI/CD

Two GitHub Actions workflows are included:

| Workflow | Trigger | What it does |
| --- | --- | --- |
| .github/workflows/validate-content.yml | PRs touching .md files | Validates frontmatter against the schema |
| .github/workflows/index-on-merge.yml | Called by knowledge repos on merge | Sends changed files to the server for re-indexing |

Setting up a knowledge repo for auto-indexing

In each knowledge repo that should be indexed:

  1. Add a workflow that calls index-on-merge.yml
  2. Set these GitHub Actions secrets:
    • KN_WEBHOOK_SECRET — same value as the server's WEBHOOK_SECRET
    • KN_MCP_SERVER_URL — the server URL (e.g., https://knowledge-network.up.railway.app)
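A caller workflow in a knowledge repo might look like the sketch below. Check index-on-merge.yml for the inputs and secrets it actually expects; the secret pass-through shown here is an assumption:

```yaml
# .github/workflows/index.yml in a knowledge repo (illustrative)
name: Index on merge
on:
  push:
    branches: [main]

jobs:
  index:
    uses: Digication/knowledge-network/.github/workflows/index-on-merge.yml@main
    secrets:
      KN_WEBHOOK_SECRET: ${{ secrets.KN_WEBHOOK_SECRET }}
      KN_MCP_SERVER_URL: ${{ secrets.KN_MCP_SERVER_URL }}
```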
