| Version | Supported |
|---|---|
| 0.6.x | Yes (active) |
| 0.5.x | Critical fixes only |
| <= 0.4 | No |

Patches land on the latest minor release; older minors receive critical-only patches on a best-effort basis.

If you discover a security vulnerability, please report it responsibly:
- Do not open a public GitHub issue
- Email security concerns to xavier@xmpuspus.dev
- Include a description of the vulnerability, steps to reproduce, and potential impact
- You will receive a response within 48 hours
- Every endpoint that triggers an LLM call is gated by `Depends(require_auth)`: `/chat`, `/chat/stream`, `/api/arena/match`, `/api/arena/vote`, `/api/tools/*`, `/api/graph/build`, `/api/debug/explain`.
- When `KB_ARENA_API_TOKEN` is set, requests must include `Authorization: Bearer <token>`. The comparison is constant-time (`hmac.compare_digest`).
- When `KB_ARENA_DEMO_MODE=true`, every LLM-triggering endpoint returns 503 regardless of authentication. This is the default state when no API keys are configured, so a freshly installed instance cannot drain credits even if exposed.
- Read-only endpoints (`/api/leaderboard`, `/api/benchmark/results`, `/api/corpora`, `/api/retriever-lab/*`, `/health`, `/ready`) are intentionally unauthenticated: they read JSON from disk and never invoke the LLM.
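
The constant-time bearer-token comparison described above can be sketched as follows. This is a minimal illustration, not the project's `require_auth` dependency: the function name `is_authorized` and its header-parsing details are assumptions, and it only covers the case where a token is configured.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only if the header carries the expected bearer token.

    Hypothetical helper; assumes `expected_token` is the configured,
    non-empty value of KB_ARENA_API_TOKEN.
    """
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest runs in time that does not depend on where the two
    # values first differ, which avoids leaking the token via timing.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

A plain `==` comparison would return as soon as the first mismatching character is found, which is why the constant-time variant is preferred for secrets.
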
- `ChatRequest.query`, `ArenaMatchRequest.question` are capped at 4 000 chars.
- History list capped at 20 turns; corpus and strategy names are alphanumeric-only.
- Arena and tools endpoints use Pydantic models; no raw `request.json()` paths.
- All YAML loads use `yaml.safe_load`; no `pickle`, no `eval`, no `exec` on user input.
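
To make the limits above concrete, here is a rough plain-Python sketch of the same checks. The project enforces them through Pydantic models; this stand-in uses only the standard library, and the function name `validate_chat_request` and its argument list are hypothetical.

```python
import re

MAX_QUERY_CHARS = 4000      # cap on query / question length
MAX_HISTORY_TURNS = 20      # cap on history list length
_NAME_RE = re.compile(r"[A-Za-z0-9]+")  # alphanumeric-only names


def validate_chat_request(query, history, corpus, strategy):
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    if len(query) > MAX_QUERY_CHARS:
        errors.append("query exceeds 4000 characters")
    if len(history) > MAX_HISTORY_TURNS:
        errors.append("history exceeds 20 turns")
    for label, value in (("corpus", corpus), ("strategy", strategy)):
        # fullmatch rejects empty names and anything with punctuation,
        # which rules out path-traversal style values such as "../etc".
        if not _NAME_RE.fullmatch(value):
            errors.append(f"{label} must be alphanumeric")
    return errors
```
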
- LLM-generated Cypher is rejected if it matches `_WRITE_CYPHER_RE`, which includes APOC write paths (`apoc.create`, `apoc.merge`, `apoc.refactor`, `apoc.cypher.runWrite`, `apoc.periodic.iterate`, `apoc.export`, etc.) and `LOAD CSV`.
- Defense in depth: every query path opens the Neo4j session with `default_access_mode=neo4j.READ_ACCESS`. The driver enforces read-only at the Bolt protocol level; the regex is the second line, not the only line.
- Production extraction (`build-graph`) uses parameterized Cypher only.
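
A deny-list check of the kind described above might look like the sketch below. The pattern is illustrative only and is not the project's actual `_WRITE_CYPHER_RE`; the names `WRITE_CYPHER_SKETCH` and `is_read_only` are hypothetical, and the list of keywords is an assumption.

```python
import re

# Illustrative deny-list: core write clauses, LOAD CSV, and the APOC write
# procedures named in this document. Not the project's real pattern.
WRITE_CYPHER_SKETCH = re.compile(
    r"\b(?:CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP)\b"
    r"|\bLOAD\s+CSV\b"
    r"|\bapoc\.(?:create|merge|refactor|export|periodic\.iterate|cypher\.runWrite)\b",
    re.IGNORECASE,
)


def is_read_only(cypher):
    """Return True if the query contains none of the deny-listed tokens."""
    return WRITE_CYPHER_SKETCH.search(cypher) is None
```

A regex like this errs toward rejecting (for example, it also flags a write keyword that appears inside a string literal), which is acceptable for a secondary check because the read-only session mode is what actually prevents writes.
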
- `WebParser` validates every URL before fetching. Schemes other than `http(s)` are rejected. Hosts are DNS-resolved and the resolved IP is checked against private, loopback, link-local, multicast, and reserved ranges.
- AWS instance metadata, GCE metadata, and EC2 instance-data hosts are blocked by name regardless of DNS.
- `follow_redirects` is disabled at the httpx client; redirects are validated per hop with a hard cap of 5.
- GitHub clones use `--depth 1 --single-branch` and a 120 s timeout.
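
The URL checks above can be approximated with the standard library alone. The sketch below is not `WebParser` itself: `check_url`, `is_public_ip`, and the `BLOCKED_HOSTNAMES` set are hypothetical, and DNS resolution is left to the caller so the logic stays easy to test.

```python
import ipaddress
from urllib.parse import urlsplit

# Illustrative block-list of metadata hostnames; the project's list may differ.
BLOCKED_HOSTNAMES = {"metadata.google.internal", "instance-data"}


def is_public_ip(ip_text):
    """Return True if the address is outside the ranges named above."""
    ip = ipaddress.ip_address(ip_text)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
    )


def check_url(url, resolved_ips):
    """Validate scheme, hostname, and every address the host resolved to.

    `resolved_ips` is the list of addresses the caller got from DNS.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    if not host or host in BLOCKED_HOSTNAMES:
        return False
    # Require at least one address, and reject if any of them is internal.
    return bool(resolved_ips) and all(is_public_ip(ip) for ip in resolved_ips)
```

Because redirects are validated per hop, a check like this would need to run again on every redirect target rather than only on the initial URL.
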
- `KB_ARENA_BENCHMARK_COST_CAP_USD` defaults to 10.0 (was 0/unlimited). Benchmarks halt as soon as cumulative spend exceeds the cap.
- Demo mode (auto-enabled when no API key is configured) returns 503 from every LLM-triggering endpoint.
- The benchmark runner distinguishes retryable transients (rate limit, 5xx, network) from permanent errors (auth, validation) — bad keys fail fast instead of burning 7 minutes of retries per run.
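
The cost cap described above amounts to a running total that aborts the run once it passes the limit. A minimal sketch, assuming hypothetical names (`SpendTracker`, `CostCapExceeded`, `cap_from_env`) that are not taken from the project:

```python
import os


class CostCapExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured cap."""


def cap_from_env(environ=os.environ):
    # Falls back to the documented default of 10.0 USD when unset.
    return float(environ.get("KB_ARENA_BENCHMARK_COST_CAP_USD", "10.0"))


class SpendTracker:
    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, cost_usd):
        """Add the cost of one call and halt if the total exceeds the cap."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.cap_usd:
            raise CostCapExceeded(
                f"spent {self.spent_usd:.2f} USD, cap is {self.cap_usd:.2f} USD"
            )
```

Checking after each recorded call means the run can overshoot the cap by at most the cost of one call.
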
- 60 req/min per client, bounded-memory deque per IP, with eviction at 10 000 cold keys to prevent memory growth.
- `KB_ARENA_TRUSTED_PROXY_HEADER` may be set to honour the first hop of `X-Forwarded-For` when running behind nginx / Cloudflare.
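
A per-client sliding-window limiter with bounded memory, as described above, could be sketched like this. The class name `SlidingWindowLimiter` is hypothetical, and evicting the least recently seen client once the key count passes the limit is an assumption about how "cold keys" are chosen.

```python
import time
from collections import OrderedDict, deque


class SlidingWindowLimiter:
    def __init__(self, limit=60, window_s=60.0, max_keys=10_000,
                 clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.max_keys = max_keys
        self.clock = clock
        # client key -> deque of request timestamps, coldest client first
        self._hits = OrderedDict()

    def allow(self, key):
        """Return True if this client may make another request now."""
        now = self.clock()
        hits = self._hits.get(key)
        if hits is None:
            hits = self._hits[key] = deque()
        self._hits.move_to_end(key)  # mark this client as most recently seen
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        allowed = len(hits) < self.limit
        if allowed:
            hits.append(now)  # deque never grows beyond `limit` entries
        # Evict the coldest clients so the table cannot grow without bound.
        while len(self._hits) > self.max_keys:
            self._hits.popitem(last=False)
        return allowed
```

Passing the clock in as a parameter keeps the limiter testable without sleeping; a caller would use the client IP (or the trusted proxy header's first hop) as `key`.
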
- CORS is configured via `KB_ARENA_CORS_ORIGINS`; the default localhost list never expands to `*`.
- The bundled `docker-compose.yml` binds Neo4j to `127.0.0.1` and refuses to start without `KB_ARENA_NEO4J_PASSWORD`.
- `Dockerfile` runs as a non-root `kbarena` user (UID 1000).
- `HEALTHCHECK` polls `/health` every 15 s.
- `KB_ARENA_DEMO_MODE=true` is set by default in the image; public deploys cannot accidentally enable chat without explicitly overriding it AND setting an API token.
- All direct dependencies pinned to exact `==` versions in `pyproject.toml`.
- `uv.lock` is checked in; CI is being migrated to `uv sync --frozen` so transitive deps are reproducible. (Tracked in our roadmap.)
- All API request bodies are validated by Pydantic v2 with strict type checking
- Strategy names are validated against the registry — unknown strategies return a structured error
- Question YAML files are validated against the `Question` Pydantic model at load time
- In-memory rate limiter resets on process restart. For production behind a load balancer use a Redis-backed limiter (open issue).
- LLM responses are escaped by React's built-in rendering, but custom integrations should sanitize before rendering as HTML.
- `/api/debug/explain` is gated behind `KB_ARENA_DEBUG=true`. Don't enable debug mode in production.