feat: C4 architecture diagrams (L1-L3) — closes #203 #204

Merged
RaghavChamadiya merged 25 commits into main from feature/c4-diagrams on May 17, 2026

RaghavChamadiya (Collaborator) commented on May 17, 2026

Closes #203.

Adds a new C4 architecture view at /repos/[id]/c4 — System Context → Containers → Components, with drill-down, URL-synced state, and a per-node inspector that surfaces module health, top contributors, and the generated wiki page inline. Does not replace the existing dependency graph; ships as a sibling tab.

Backend

  • New external_systems table + Alembic migration. Manifest parsers for npm, pypi, cargo, go, nuget extract a first-class ExternalSystem registry at ingestion time; the existing external:* graph nodes link to it.
  • New c4_builder service derives L2 containers (workspace packages, top-dir fallback) and L3 components (sub-modules) from the persisted graph — no on-disk rescan at request time.
  • New endpoints: GET /api/graph/{repo_id}/c4/{l1,l2,l3}.

Frontend

  • New shared subpackage @repowise-dev/ui/c4: <C4Diagram> (React Flow + ELK layered layout), 5 typed node kinds (System / Person / External / Container / Component), relation edges, level tabs, breadcrumb, legend, basic and rich inspector panels, layout + keyboard hooks. All visual + state logic lives here so the hosted frontend can reuse it.
  • New web route /repos/[id]/c4 with nuqs-synced ?level= / ?container= params, plus a sidebar entry.
  • Rich inspector wires docs (markdown excerpt + full page link), module health (files, symbols, doc coverage, hotspots, dead code, primary owner + silo), and top contributors. "Has docs" badge on container/component nodes via a single batch fetch (no N+1).

Export

  • Backend Mermaid C4 emitter (C4Context / C4Container / C4Component) + GET /api/graph/{repo_id}/c4/mermaid?level=&container_id=.
  • UI export menu in the toolbar: SVG and PNG built locally from the current layout (no new deps, vector-perfect — not a screenshot), Mermaid copy/download fetched from the backend so the export matches what the API serves.
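As an illustration of the backend emitter described in the list above, here is a minimal sketch of how a System Context view could be rendered to Mermaid C4 text. The `Node` and `Relation` dataclasses, the `kind` values, and the function name `emit_c4_context` are hypothetical and not taken from the PR; only the Mermaid macros (`C4Context`, `Person`, `System`, `System_Ext`, `Rel`) are real Mermaid syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str            # Mermaid alias; assumed to be a plain identifier
    kind: str          # "person" | "system" | "external" (hypothetical values)
    label: str
    description: str = ""


@dataclass(frozen=True)
class Relation:
    source: str
    target: str
    label: str


# Mermaid C4 element macro for each hypothetical node kind.
_MACROS = {"person": "Person", "system": "System", "external": "System_Ext"}


def _quote(text: str) -> str:
    # Mermaid arguments are double-quoted; swap embedded quotes for single ones.
    return '"' + text.replace('"', "'") + '"'


def emit_c4_context(title: str, nodes: list[Node], relations: list[Relation]) -> str:
    """Render nodes and relations as a Mermaid C4Context diagram."""
    lines = ["C4Context", f"    title {title}"]
    for n in nodes:
        macro = _MACROS[n.kind]
        lines.append(f"    {macro}({n.id}, {_quote(n.label)}, {_quote(n.description)})")
    for r in relations:
        lines.append(f"    Rel({r.source}, {r.target}, {_quote(r.label)})")
    return "\n".join(lines)
```

Serving this text from the backend and having the UI fetch it, rather than re-deriving it client-side, is what keeps the exported Mermaid identical to what the API returns.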

Migration note

Existing indexed repos need a re-index to populate external_systems. Without it, L1 will show an empty external-systems list; L2 / L3 still work (they don't depend on external_system_id).

Test plan

  • repowise init --index-only on the repowise repo itself; /c4 shows 5 containers (cli, core, server, ui, web) with cross-edges
  • L1 → L2 → L3 drill-down preserved across refresh via URL params; Esc drills out
  • Inspector shows health + contributors + inline wiki excerpt; "Open in dependency graph" + "Open full page" navigate correctly
  • Container/component nodes with a wiki page show the BookOpen badge
  • Export menu: SVG opens cleanly in a browser; PNG rasterizes at 2×; Mermaid output parses on mermaid.live for all three levels
  • npm run type-check green on packages/ui and packages/web
  • Backend unit tests for parsers + c4_builder; integration tests for the three endpoints

Persists third-party dependencies declared in repo manifests so C4 L1
(System Context) can render real, versioned external systems instead of
the opaque external:* string nodes currently produced by import
resolution. GraphNode gains a nullable external_system_id FK so file
imports can be linked back to their declared dependency.

Migration 0018 creates the external_systems table and the FK column;
downgrade restores 0017 state cleanly (verified on sqlite).

Adds repowise.core.ingestion.external_systems, a self-contained subpackage
that walks a repo, finds package.json / pyproject.toml / requirements*.txt,
and returns ExternalSystemRecord rows with name + version + ecosystem
+ category. The category is a small dictionary-based heuristic (no LLM)
that flags frameworks vs services vs tools vs plain libraries — used by
the C4 renderer to group/icon external systems sensibly.
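A rough sketch of what one such parser plus the dictionary-based classifier might look like for npm. The lookup table contents, the `declared_in` field, and the function names are assumptions for illustration; the PR only states that records carry name, version, ecosystem, and category. The sketch also returns an empty list on unreadable or malformed manifests and skips `workspace:`, `file:`, and `link:` specifiers, which point at local code rather than a registry.

```python
import json
from dataclasses import dataclass
from pathlib import Path

# Hypothetical lookup table; the real dictionary is not shown in the PR.
_CATEGORY_BY_NAME = {
    "react": "framework",
    "next": "framework",
    "stripe": "service",
    "eslint": "tool",
}

# Version specifiers that point at first-party code rather than a registry.
_LOCAL_PREFIXES = ("workspace:", "file:", "link:")


@dataclass(frozen=True)
class ExternalSystemRecord:
    name: str
    version: str
    ecosystem: str
    category: str
    declared_in: str  # assumed field: path of the manifest that declared it


def classify(name: str) -> str:
    """Dictionary lookup with 'library' as the default bucket."""
    return _CATEGORY_BY_NAME.get(name.lower(), "library")


def parse_package_json(path: Path) -> list[ExternalSystemRecord]:
    """Extract declared npm dependencies; never raises on bad input."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return []  # unreadable or malformed manifest: report nothing
    if not isinstance(data, dict):
        return []
    records = []
    for section in ("dependencies", "devDependencies"):
        deps = data.get(section)
        if not isinstance(deps, dict):
            continue
        for name, version in deps.items():
            if not isinstance(version, str) or version.startswith(_LOCAL_PREFIXES):
                continue  # skip odd values and first-party references
            records.append(
                ExternalSystemRecord(name, version, "npm", classify(name), str(path))
            )
    return records
```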

Each ecosystem lives in its own ~100-line module so adding a new
ecosystem is a new file, not edits across the suite. Parsers never raise
on malformed input (a broken manifest must not break ingestion).

Completes external system extraction for the languages repowise already
indexes. Each parser stays under ~100 lines and skips first-party
references (Cargo path/git deps, Go local replace directives,
.NET ProjectReference) so workspace members never leak into the
external systems list.

PipelineResult.external_systems now carries the list of declared deps
parsed from manifests during _run_ingestion. persist_pipeline_result
upserts them and then walks graph_nodes named 'external:<dep>' to fill
in the new external_system_id FK so the C4 builder can render real
typed boxes instead of bare strings.

Suffix-then-prefix matching handles common forms:
    external:fastapi          -> fastapi          (exact)
    external:fastapi.responses -> fastapi         (first dotted seg)
    external:@scope/pkg/sub    -> @scope/pkg/sub  (exact then segment)
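The forms listed above could be resolved with something like the following sketch. It assumes a mapping from declared dependency name to row id (called `id_map` here) and tries the exact name first, then a shorter fallback. The function names are hypothetical, and the fallback for scoped npm names (`@scope/pkg`) is an assumed reading of "exact then segment", not confirmed by the PR.

```python
from typing import Optional

PREFIX = "external:"


def candidate_names(node_name: str) -> list[str]:
    """Return dependency names to try for a graph node, most specific first."""
    if not node_name.startswith(PREFIX):
        return []
    dep = node_name[len(PREFIX):]
    candidates = [dep]  # exact match is always tried first
    if dep.startswith("@"):
        # Assumption: scoped npm imports fall back to "@scope/pkg".
        parts = dep.split("/")
        if len(parts) > 2:
            candidates.append("/".join(parts[:2]))
    else:
        first = dep.split(".", 1)[0]  # first dotted segment
        if first != dep:
            candidates.append(first)
    return candidates


def resolve_external_id(node_name: str, id_map: dict[str, int]) -> Optional[int]:
    """Look up the external system id for a node, or None if unresolved."""
    for name in candidate_names(node_name):
        if name in id_map:
            return id_map[name]
    return None
```

A node that resolves to `None` would keep a null `external_system_id`.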

bulk_upsert_external_systems also returns id_map so callers can do
the FK linkage without an extra round-trip; collapsing across
multi-manifest declarations picks any id (name/category are stable).

Derives C4 L1/L2/L3 from already-persisted graph + external_systems data
— no on-disk re-scan at request time, so hosted backends without
filesystem access work too.

Container detection re-uses the manifest paths stored in
external_systems.declared_in: every parent of a declared manifest is a
container root. Repos without manifests fall back to top-level dirs.
Components are top-level subdirs inside the container; root-level files
land in a synthetic _root bucket so they don't disappear from the
diagram.
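The detection rules in this paragraph can be sketched as two small path helpers. The function names and the use of repo-relative POSIX paths are assumptions; the rules themselves (manifest parents as roots, top-level directory fallback, `_root` bucket) come from the description above. Nested containers are not handled in this sketch.

```python
from pathlib import PurePosixPath
from typing import Optional


def container_roots(manifest_paths: list[str], file_paths: list[str]) -> list[str]:
    """Parents of declared manifests, or top-level dirs when there are none."""
    roots = {str(PurePosixPath(p).parent) for p in manifest_paths}
    if not roots:
        for f in file_paths:
            parts = PurePosixPath(f).parts
            if len(parts) > 1:  # ignore files sitting at the repo root
                roots.add(parts[0])
    return sorted(roots)


def component_of(file_path: str, container_root: str) -> Optional[str]:
    """Top-level subdir inside the container, or '_root' for files directly in it."""
    try:
        rel = PurePosixPath(file_path).relative_to(PurePosixPath(container_root))
    except ValueError:
        return None  # file lives outside this container
    return rel.parts[0] if len(rel.parts) > 1 else "_root"
```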

Relations are rolled up from graph_edges: file→file edges are grouped
by (source_container, target_container, edge_type) at L2 and by their
component equivalents at L3. External-system edges piggyback on the
external_system_id FK populated during ingestion — unresolved
external:* nodes are dropped (a blank box is worse than no box).
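A minimal sketch of the L2 roll-up described above, assuming file edges arrive as `(source_path, target_path, edge_type)` tuples and containers are identified by their root path. The function names are hypothetical. Edges whose endpoints fall in the same container are skipped so the result has no self-loops.

```python
from collections import Counter
from typing import Optional


def container_for(path: str, roots: list[str]) -> Optional[str]:
    """Longest container root that is a path prefix of `path`."""
    matches = [r for r in roots if path == r or path.startswith(r + "/")]
    return max(matches, key=len) if matches else None


def roll_up(
    file_edges: list[tuple[str, str, str]], roots: list[str]
) -> dict[tuple[str, str, str], int]:
    """Count file-level edges per (source_container, target_container, edge_type)."""
    counts: Counter = Counter()
    for source, target, edge_type in file_edges:
        src = container_for(source, roots)
        dst = container_for(target, roots)
        if src is None or dst is None or src == dst:
            continue  # unplaced files and intra-container edges add no L2 relation
        counts[(src, dst, edge_type)] += 1
    return dict(counts)
```

The same grouping would apply at L3 by swapping the container lookup for a component lookup.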

All dataclasses live in models.py and are SQLAlchemy/Pydantic-free so
they can be unit-tested without a session and re-used by the Phase 3
Mermaid emitter.

Three new routes under the existing /api/graph prefix:
    GET /api/graph/{repo_id}/c4/l1
    GET /api/graph/{repo_id}/c4/l2
    GET /api/graph/{repo_id}/c4/l3?container_id=...
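To illustrate how little an L3 handler needs to do, here is a framework-free sketch. It assumes the builder returns `None` for an unknown container; the `Component`, `L3View`, `NotFound`, and `get_l3` names are hypothetical, and `dataclasses.asdict` stands in for conversion to a response model. A real route would raise the web framework's own 404 instead.

```python
from dataclasses import asdict, dataclass, field
from typing import Callable, Optional


@dataclass
class Component:
    id: str
    name: str


@dataclass
class L3View:
    container_id: str
    components: list[Component] = field(default_factory=list)


class NotFound(Exception):
    """Stand-in for the HTTP 404 a web framework would raise."""

    status_code = 404


def get_l3(container_id: str, build_l3: Callable[[str], Optional[L3View]]) -> dict:
    """Call the builder and convert its dataclass result to a plain dict."""
    view = build_l3(container_id)
    if view is None:
        raise NotFound(f"unknown container: {container_id}")
    return asdict(view)  # plain dict standing in for the response schema
```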

Router is a thin adapter over services/c4_builder — the only logic here
is dataclass → Pydantic conversion plus a 404 when an L3 container_id
doesn't resolve. Schemas live in server/schemas.py next to the rest of
the response models so OpenAPI generation picks them up uniformly.

Parser tests use tmp_path fixtures with realistic but minimal manifests
and cover: dependency extraction, dev/optional bucketing, version
parsing, malformed-input safety, workspace/local-path skipping, scope
prefix handling for npm, Poetry vs PEP 621 for pypi, // indirect and
local replace for Go, ProjectReference vs PackageReference for nuget,
and the classifier's framework/service/tool/library buckets.

c4_builder tests seed a two-container monorepo (packages/core +
packages/web) with one cross-edge and two external deps, then assert
L1 lists all externals, L2 detects both containers + the cross edge
+ the external edges (with no self-loops), and L3 returns the right
components for one container while filtering externals to only those
actually referenced from inside it. L3 also returns None for unknown
containers.

Hits the C4 router through an httpx AsyncClient + ASGITransport stack
with a fresh in-memory SQLite per test. Each test seeds the graph
directly via crud helpers (no fake LLM, no ingestion pipeline) so the
endpoint behaviour can be asserted in isolation.

Covers:
    - L1 returns system + user actor + every external
    - L1 on an empty repo still returns the system + zero externals
    - L2 detects containers, rolls file edges into container edges,
      links external_system_id-backed nodes to ext:* targets
    - L3 requires container_id (422 without it)
    - L3 returns components for a known container + filters externals
    - L3 returns 404 for an unknown container

Adds the c4 router to the test app and inlines create_test_repo to keep
the module collectable on its own — several existing server tests fail
to collect standalone because of how pytest resolves tests.unit.server,
and the new file shouldn't inherit that breakage.

RaghavChamadiya requested a review from swati510 as a code owner on May 17, 2026, 11:46
…ems table in test

The c4_builder package __init__.py was silently excluded by the local
_*.py pattern in .git/info/exclude — locally tests passed because the
file was present, but CI saw an implicit namespace package with no
build_l1/build_l2/build_l3 attributes.
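The failure mode is easy to reproduce in isolation. The sketch below builds a throwaway package (the name `demo_c4_builder` and its contents are hypothetical) and imports it with and without `__init__.py`. Without the file, Python still imports the directory as an implicit namespace package, so nothing fails until an attribute is looked up.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from(root: Path, name: str):
    """Import `name` with `root` first on sys.path, bypassing any cached copy."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    sys.path.insert(0, str(root))
    try:
        return importlib.import_module(name)
    finally:
        sys.path.remove(str(root))


root = Path(tempfile.mkdtemp())
pkg = root / "demo_c4_builder"  # hypothetical package name
pkg.mkdir()
(pkg / "l1.py").write_text("def build_l1():\n    return 'l1'\n")

# Without __init__.py the directory still imports, as a namespace package,
# but it exposes none of the names the __init__ would have re-exported.
namespace_pkg = import_from(root, "demo_c4_builder")
missing_init = not hasattr(namespace_pkg, "build_l1")

# With __init__.py re-exporting the builder, the attribute is available.
(pkg / "__init__.py").write_text("from .l1 import build_l1\n")
regular_pkg = import_from(root, "demo_c4_builder")
has_builder = regular_pkg.build_l1() == "l1"
```

Because the import itself succeeds in both cases, a missing `__init__.py` only surfaces at attribute access, which matches the CI-only failure described above.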

Also update test_base_includes_all_models to include the new
external_systems table introduced in the ExternalSystem migration.

RaghavChamadiya merged commit 7171739 into main on May 17, 2026
5 checks passed
RaghavChamadiya deleted the feature/c4-diagrams branch on May 17, 2026, 12:02
Development

Successfully merging this pull request may close these issues.

Feature: Add C4 Architecture Diagrams (L1-L3)

2 participants