Overview
Distributed KV, Built As One System
NoKV
NoKV starts as a serious standalone engine and grows into a multi-Raft distributed KV database without swapping out its storage core. That is the hook: WAL, LSM, MVCC, migration, replication, and control-plane behavior are treated as one system, not a pile of loosely connected features.
Run a local cluster first, then follow the standalone → seeded → cluster path.
Use NoKV in three different ways
- Embed it locally through NoKV.Open.
- Start a multi-node cluster with scripts/dev/cluster.sh.
- Take an existing standalone workdir and migrate it into a replicated region.
What makes this project worth reading
- One storage layer instead of separate standalone and distributed engines.
- Formal lifecycle and migration protocol instead of dump/import glue.
- System-level verification under restart, degraded Coordinator, chaos, and failpoints.
What Matters
Why NoKV
NoKV is not trying to be a feature checklist.
It is trying to answer a narrower and harder question well: can one storage core grow from an embedded engine into a distributed multi-Raft KV without turning migration, metadata, and recovery into glue code?
- One storage layer across standalone and distributed modes.
- Explicit lifecycle and migration semantics instead of hidden bootstrap magic.
- Verification aimed at restart, degraded control plane, and publish-boundary correctness.
One data plane, two deployment shapes
NoKV does not fork into separate standalone and distributed engines. The distributed layer grows on top of the same underlying DB workdir.
That is why migration can be a protocol instead of a dump/import afterthought.
Replication with clear ownership
Store owns the node runtime, Peer owns a region replica runtime, RaftAdmin is the execution plane, and Coordinator stays in the control plane.
Logical region snapshots
Raft durable snapshot metadata is split from logical region state snapshots, which keeps migration, add-peer install, and recovery semantics clean.
This is a correctness-first design, not a one-shot performance shortcut.
System-level validation
The project is tested beyond unit semantics: migration flow, restart safety, degraded Coordinator behavior, transport chaos, and context propagation are all exercised.
The goal is to verify lifecycle and recovery behavior, not just happy-path RPCs.
Benchmarks are documented separately in ../benchmark/README.md. The docs site keeps architecture and operating guidance separate from benchmark storytelling.
Fastest Path
Try NoKV In Five Minutes
Boot a cluster, front it with Redis, inspect the runtime.
If you only want one practical path, this is the shortest route from clone to “I can see the system running”. It uses the local cluster helper, the Redis-compatible gateway, and the built-in runtime inspection commands.
Start the cluster
Use the shared topology file and bring up the local Coordinator + store layout.
Expose a familiar interface
Run the Redis-compatible gateway so you can talk to NoKV with an off-the-shelf client.
Inspect the system
Query stats and region ownership so the demo ends with visibility instead of blind writes.
# 1. Start a local cluster from the shared topology file
./scripts/dev/cluster.sh --config ./raft_config.example.json
# 2. In another terminal, front it with the Redis-compatible gateway
go run ./cmd/nokv-redis \
--addr 127.0.0.1:6380 \
--raft-config ./raft_config.example.json
# 3. Talk to NoKV with any Redis client
redis-cli -p 6380 set hello world
redis-cli -p 6380 get hello
# 4. Inspect the running cluster
go run ./cmd/nokv stats --expvar http://127.0.0.1:9100
go run ./cmd/nokv regions --workdir ./artifacts/cluster/store-1
Read This Next
Documentation Guide
If you only read three pages, read these first:
- Getting Started for the shortest path to a running cluster.
- Raftstore for runtime ownership and distributed boundaries.
- Migration for the standalone → cluster bridge that makes NoKV distinct.
Getting Started
Run NoKV locally, understand the topology file, and boot your first store or local cluster.
Raftstore
Read the distributed runtime layout: server wiring, store ownership, peer lifecycle, snapshots, and recovery surfaces.
Migration
Follow the standalone → seeded → cluster path, including SST snapshot install and membership rollout.
Testing
See how deterministic integration, failpoints, restart recovery, and distributed fault matrix coverage are organized.
Choose Your Route
Read By Interest
Storage Engine
Read this route if you care about WAL discipline, MemTable/flush, manifest semantics, and ValueLog GC.
Architecture · WAL · Flush · Value Log
Distributed Runtime
Read this route if Store/Peer ownership, transport, snapshots, and Coordinator are the parts you want to reason about.
Migration & Operations
Read this route if the bridge from standalone workdir to replicated region is the part you want to demo or operate.
Common Paths
Jump Points
Layer View
Architecture Sketch
%%{init: {
"themeVariables": { "fontSize": "18px" },
"flowchart": { "nodeSpacing": 45, "rankSpacing": 62, "curve": "basis" }
}}%%
graph TD
Client["Client / App / Redis"] -->|RPC / RESP| Server["Node Server"]
Client -->|Route / TSO / control queries| Coordinator["Coordinator"]
subgraph "Distributed Runtime"
Server --> Store["Store runtime root"]
Store --> Peer["Peer runtime"]
Store --> Admin["RaftAdmin"]
Store --> Meta["Local recovery metadata"]
Peer --> Raft["Raft durable state"]
Peer --> Snap["Logical region snapshot"]
end
subgraph "Shared Data Plane"
Peer --> DB["NoKV DB"]
Snap --> DB
DB --> LSM["LSM / WAL / VLog / MVCC"]
end