MedBeads is an Immutable, Agent-Native Data Infrastructure designed to address the "Context Mismatch" in medical AI. By restructuring medical records from mutable relational databases into a Merkle Directed Acyclic Graph (DAG), MedBeads provides explicit causal linking, tamper-evidence, and deterministic context retrieval for autonomous agents.
The Context Mismatch Problem: Current Electronic Medical Records (EMRs) and FHIR standards are designed for human review, relying on implicit context and probabilistic search (like Vector RAG) which can lead to AI hallucinations. MedBeads shifts this paradigm:
- From Probabilistic to Deterministic: Instead of guessing context, AI agents traverse explicit cryptographic links.
- From Mutable to Immutable: Every record ("Bead") is content-addressed and unchangeable, guaranteeing auditability.
- From Verbose to Token-Efficient: The structured graph serves as a compressed "AI-native language."
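To make the three shifts above concrete, here is a minimal sketch of content addressing and deterministic link traversal. It is an illustration under stated assumptions, not the MedBeads implementation: the `BeadStore` class, the `bead_id` function, the use of SHA-256, and the canonical JSON encoding are all hypothetical choices made for this example.

```python
from __future__ import annotations

import hashlib
import json


def bead_id(payload: dict, parents: list[str]) -> str:
    """Content address: SHA-256 over a canonical JSON encoding (assumed scheme)."""
    canonical = json.dumps(
        {"payload": payload, "parents": sorted(parents)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BeadStore:
    """In-memory stand-in for the object store (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._beads: dict[str, dict] = {}

    def put(self, payload: dict, parents: list[str] | None = None) -> str:
        parents = parents or []
        # Parents must already exist, so links can only point backwards (no cycles).
        missing = [p for p in parents if p not in self._beads]
        if missing:
            raise KeyError(f"unknown parent beads: {missing}")
        bid = bead_id(payload, parents)
        self._beads[bid] = {"payload": payload, "parents": sorted(parents)}
        return bid

    def verify(self, bid: str) -> bool:
        """Tamper-evidence: recompute the hash and compare it with the address."""
        bead = self._beads[bid]
        return bead_id(bead["payload"], bead["parents"]) == bid

    def context(self, bid: str) -> list[str]:
        """Deterministic ancestor walk: depth-first over sorted parent links."""
        seen: list[str] = []
        stack = [bid]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.append(current)
            stack.extend(reversed(self._beads[current]["parents"]))
        return seen
```

Because a Bead's address covers both its payload and its parent addresses, changing any ancestor would change every descendant's address, which is what makes the links usable as an audit trail.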
graph TD
User((User))
subgraph Frontend
UI["React UI (Vite)<br/>Port: 5174"]
end
subgraph Backend
Core["Go Core Server<br/>Port: 8080"]
API["Python AI API<br/>Port: 8000"]
end
subgraph Storage
Objects["Object Storage<br/>(CAS)"]
SQL["Metadata DB<br/>(SQLite)"]
end
subgraph External
Gemini[Gemini AI API]
end
User -->|Browser| UI
UI -->|Data/Search| Core
UI -->|AI Analysis| API
API -->|Get Context| Core
API -->|Generate| Gemini
Core -->|Read/Write| Objects
Core -->|Read/Write| SQL
medbeads/
├── core/ # Go Backend Server
│ ├── main.go # Entry Point
│ ├── medbeads_data/ # Data Storage (Mounted in Docker)
│ └── Dockerfile # Core Dockerfile
│
├── api/ # Python AI API Server
│ ├── main.py # FastAPI Entry Point
│ ├── ai.py # Gemini AI Logic
│ └── Dockerfile # API Dockerfile
│
├── ui/ # React Frontend
│ ├── src/ # Source Code
│ └── Dockerfile # UI Dockerfile
│
├── FHIR_sample/ # Sample Data (Synthea)
├── docker-compose.yml # Docker Composition
└── start.sh # Local Helper Script
To use AI features, you need to configure your Google Gemini API Key.
- Copy the example environment file:
cp api/.env.example api/.env
- Edit api/.env and set your API key:
GEMINI_API_KEY=your_actual_api_key_here
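A fail-fast check for the key can save debugging time later. The sketch below assumes the variable has already been exported into the process environment (for example by Docker Compose or a dotenv loader); the README does not show how api/ai.py actually reads it, and `load_gemini_api_key` is a hypothetical helper name.

```python
import os


def load_gemini_api_key(env=os.environ) -> str:
    """Return GEMINI_API_KEY, rejecting a missing value or the untouched placeholder."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key or key == "your_actual_api_key_here":
        raise RuntimeError(
            "GEMINI_API_KEY is not set; edit api/.env and provide a real key"
        )
    return key
```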
The easiest way to run MedBeads is using Docker. This will start the Core, API, and UI services.
- Docker Engine
- Docker Compose
- Build and start the containers:
docker-compose up --build
- Open your browser and access the UI:
http://localhost:5174
- All services:
  - UI (Visualizer): http://localhost:5174
  - AI API: http://localhost:8000
  - Core Engine: http://localhost:8080
- Stop the application:
Ctrl+C
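If a page does not load, it can help to check whether each service port is accepting connections. The following is a small sketch using only the standard library; the port numbers come from the list above, while the `is_listening` helper is a hypothetical name and only tests TCP reachability, not application health.

```python
import socket

# Ports as listed in this README.
SERVICES = {"UI": 5174, "AI API": 8000, "Core Engine": 8080}


def is_listening(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in SERVICES.items():
        status = "up" if is_listening(port) else "down"
        print(f"{name:12} localhost:{port}  {status}")
```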
The repository includes 3 sample patients for immediate demo. To add more patients from the FHIR samples, run the following command while Docker is running:
# Add more patients (e.g., 5 additional)
uv run --with requests scripts/mass_ingest.py FHIR_sample --limit 5
If you prefer to run services individually without Docker, follow these steps. Since the repository does not contain pre-generated data, Step 2 (Data Ingestion) is required for the first run.
- Go 1.21+
- Python 3.12+ (managed via uv)
- Node.js 20+
You can use the helper script to verify the environment, ingest sample data, and start all servers at once:
./start.sh
1. Start Core Engine (Go): This service manages the data storage and index.
cd core
go run main.go
# Server runs on localhost:8080
2. Ingest Initial Data (Python): (Required if the database is empty) Convert FHIR sample data into Beads and send them to the Core Engine. Run this in a new terminal while Core is running:
# Ingest 5 sample patients
uv run --with requests scripts/mass_ingest.py medbeads/FHIR_sample --limit 5
3. Start AI API (Python): This service provides AI analysis features.
cd api
uv run uvicorn main:app --host 0.0.0.0 --port 8000
4. Start UI (React): The frontend visualization interface.
cd ui
npm install
npm run dev
# Access at http://localhost:5174
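The ingestion step above (read FHIR JSON, convert it, send it to the Core Engine) can be sketched roughly as follows. This is not the contents of scripts/mass_ingest.py: the `/beads` route, the record shape, and the one-file-per-patient reading of `--limit` are assumptions made for illustration, so check the Core server source for the real endpoint before adapting it.

```python
import json
import urllib.request
from pathlib import Path

CORE_URL = "http://localhost:8080"
INGEST_PATH = "/beads"  # hypothetical route; the real one is defined by the Core server


def bundle_to_records(bundle: dict) -> list:
    """Flatten a FHIR Bundle into one record per contained resource."""
    records = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource")
        if not resource:
            continue  # skip entries that carry no resource body
        records.append(
            {
                "resource_type": resource.get("resourceType"),
                "fhir_id": resource.get("id"),
                "payload": resource,
            }
        )
    return records


def post_record(record: dict, base_url: str = CORE_URL) -> int:
    """POST one record as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        base_url + INGEST_PATH,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def ingest_directory(directory: str, limit: int) -> int:
    """Send records from the first `limit` JSON files (assumed: one file per patient)."""
    sent = 0
    for path in sorted(Path(directory).glob("*.json"))[:limit]:
        bundle = json.loads(path.read_text(encoding="utf-8"))
        for record in bundle_to_records(bundle):
            post_record(record)
            sent += 1
    return sent
```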
- FHIR Source Data
  - Located in medbeads/FHIR_sample/ (general samples) or sample_data/fhir/ (security clearance test data).
  - Contains raw FHIR JSON files.
- Ingestion Process (Python)
  - Run python scripts/mass_ingest.py (or via uv run).
  - The script reads JSON files, converts them into Beads (Merkle Graph Nodes), and sends them to the Core Server via API.
  - Important: Beads must be ingested through the API to be indexed in SQLite. Simply copying object files will not register them in the database.
- Storage (Core Engine)
  - Content Addressable Storage (CAS): Raw data is stored as immutable files in medbeads/core/medbeads_data/objects/.
  - Metadata Index (SQLite): Searchable index is stored in medbeads/core/medbeads_data/metadata.db.
- Docker Startup Ingestion
  - When running via Docker (deploy/hf/Dockerfile), the startup script automatically:
    - Starts the Core server temporarily
    - Ingests FHIR data from sample_data/fhir/ using mass_ingest.py
    - Sets up security clearance rules
    - Restarts services via supervisord
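The split between the object store and the metadata index explains why copying files into objects/ is not enough: a Bead becomes searchable only when the write path also records it in SQLite. The sketch below illustrates that idea in Python, although the Core Engine itself is written in Go. The `CoreStore` class, the table schema, and the use of SHA-256 file names are assumptions for this example only.

```python
import hashlib
import sqlite3
from pathlib import Path


class CoreStore:
    """Toy CAS plus SQLite index (hypothetical layout, not the Core Engine's schema)."""

    def __init__(self, root: str) -> None:
        self.objects = Path(root) / "objects"
        self.objects.mkdir(parents=True, exist_ok=True)
        self.db = sqlite3.connect(str(Path(root) / "metadata.db"))
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS beads (hash TEXT PRIMARY KEY, patient_id TEXT)"
        )

    def put(self, data: bytes, patient_id: str) -> str:
        """Write the object under its hash, then register it in the index."""
        digest = hashlib.sha256(data).hexdigest()
        path = self.objects / digest
        if not path.exists():  # same content always maps to the same file
            path.write_bytes(data)
        self.db.execute(
            "INSERT OR IGNORE INTO beads (hash, patient_id) VALUES (?, ?)",
            (digest, patient_id),
        )
        self.db.commit()
        return digest

    def find(self, patient_id: str) -> list:
        """Look up hashes through the index only; unindexed files are invisible."""
        rows = self.db.execute(
            "SELECT hash FROM beads WHERE patient_id = ? ORDER BY hash",
            (patient_id,),
        )
        return [row[0] for row in rows]
```

In this sketch, a file dropped directly into the objects directory never appears in `find`, which mirrors the behavior the note above warns about.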
MedBeads supports Security Clearance to control who can view specific medical records. This uses a Blacklist model (default: all can view, explicit deny for specific roles).
| Role | Label (Japanese) | Description |
|---|---|---|
| patient | 患者本人 | The patient themselves |
| family | 家族 | Family members |
| primary_care | 主治医 | Primary care physician |
| specialist | 専門医 | Consulting specialists |
| nurse | 看護師 | Nursing staff |
| insurance | 保険会社 | Insurance companies |
| researcher | 研究者 | Research access |
| emergency | 緊急時 | Emergency override (bypasses all restrictions) |
| system | システム | System/AI processes (full access) |
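The blacklist model and the roles above can be expressed as a short check: access is allowed unless the viewer's role is explicitly denied, with emergency and system bypassing restrictions. This is a sketch of the described policy, not the Core Engine's code; `can_view` and `BYPASS_ROLES` are hypothetical names, and rejecting unknown roles is an added assumption.

```python
from __future__ import annotations

ROLES = {
    "patient", "family", "primary_care", "specialist", "nurse",
    "insurance", "researcher", "emergency", "system",
}
# Roles described above as bypassing restrictions or having full access.
BYPASS_ROLES = {"emergency", "system"}


def can_view(role: str, denied_roles: set[str]) -> bool:
    """Blacklist check: default allow, explicit deny per role."""
    if role not in ROLES:
        # Assumption: with a default-allow model, an unrecognized role
        # should be rejected rather than silently granted access.
        raise ValueError(f"unknown role: {role}")
    if role in BYPASS_ROLES:
        return True
    return role not in denied_roles
```

For example, a record with `denied_roles={"family"}` stays visible to `nurse` and `emergency` but not to `family`.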
The sample_data/fhir/ directory contains 5 test patients with various clearance scenarios:
| Patient | Scenario | Clearance |
|---|---|---|
| Patient A (30s F) | Gynecology | Hide from family |
| Patient B (50s M) | Cancer suspicion | Temporarily hide from patient/family (2 weeks) |
| Patient C (40s M) | Psychiatry | Hide from insurance |
| Patient D (60s F) | General internal medicine | No restrictions |
| Patient E (20s M) | Complex/Emergency | Multiple restrictions (drug screen, alcohol) |
Use the Viewer Role Selector in the UI header to switch between roles and observe how restricted records are displayed or hidden.
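Patient B's scenario implies that a restriction can expire. One way to model that is a deny rule with an optional expiry time, as sketched below. The `DenyRule` type and its fields are hypothetical; the README does not describe how MedBeads stores or evaluates time-limited rules.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DenyRule:
    role: str
    expires_at: datetime | None = None  # None means the rule never expires

    def active(self, now: datetime) -> bool:
        return self.expires_at is None or now < self.expires_at


def can_view(role: str, rules: list[DenyRule], now: datetime) -> bool:
    """Default allow; deny only while a matching rule is still active."""
    if role in {"emergency", "system"}:
        return True
    return not any(rule.role == role and rule.active(now) for rule in rules)


# Example: hide a result from patient and family for two weeks.
created = datetime(2025, 1, 1, tzinfo=timezone.utc)
rules = [
    DenyRule("patient", created + timedelta(weeks=2)),
    DenyRule("family", created + timedelta(weeks=2)),
]
```

With these rules, `can_view("patient", rules, created + timedelta(days=13))` is false, and the same call with `timedelta(days=14)` is true because the rule has expired.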
To populate the repository with initial seed data (e.g., half of the samples):
- Start the Core Server:
cd core && go run main.go
- Run the ingestion script (in another terminal):
uv run --with requests medbeads/scripts/mass_ingest.py medbeads/FHIR_sample --limit 5
- (Optional) Force commit the generated data:
git add -f core/medbeads_data/metadata.db core/medbeads_data/objects/
If you use MedBeads in your research, please cite our paper:
@article{medbeads2025,
title={MedBeads: Immutable Agent-Native Data Infrastructure for Medical AI},
author={Nakajima, Takahito},
journal={medRxiv (under review)},
year={2026},
note={DOI: TBD}
}
We thank Synthea for providing the synthetic FHIR patient data used in this project.
