feat: C4 architecture diagrams (L1-L3) — closes #203 #204
Merged
Persists third-party dependencies declared in repo manifests so C4 L1 (System Context) can render real, versioned external systems instead of the opaque external:* string nodes currently produced by import resolution. GraphNode gains a nullable external_system_id FK so file imports can be linked back to their declared dependency. Migration 0018 creates the external_systems table and the FK column; downgrade restores 0017 state cleanly (verified on sqlite).
Adds repowise.core.ingestion.external_systems, a self-contained subpackage that walks a repo, finds package.json / pyproject.toml / requirements*.txt, and returns ExternalSystemRecord rows with name + version + ecosystem + category. The category is a small dictionary-based heuristic (no LLM) that flags frameworks vs services vs tools vs plain libraries — used by the C4 renderer to group/icon external systems sensibly. Each ecosystem lives in its own ~100-line module so adding a new ecosystem is a new file, not edits across the suite. Parsers never raise on malformed input (a broken manifest must not break ingestion).
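A minimal sketch of what one such parser plus the dictionary classifier could look like for requirements*.txt. The field names on ExternalSystemRecord follow the description above; the category table, the regex, and the function names parse_requirements and classify are assumptions for illustration, not the code in this PR.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

# Hypothetical category table; the real dictionary is not shown in the PR.
_CATEGORIES = {
    "fastapi": "framework",
    "django": "framework",
    "redis": "service",
    "pytest": "tool",
}

# Package name, optional [extras], optional "==version" pin.
_REQ_LINE = re.compile(
    r"^([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s*(?:==\s*([^\s;#]+))?"
)


@dataclass(frozen=True)
class ExternalSystemRecord:
    name: str
    version: Optional[str]
    ecosystem: str
    category: str


def classify(name: str) -> str:
    """Dictionary lookup only; anything unknown is a plain library."""
    return _CATEGORIES.get(name.lower(), "library")


def parse_requirements(path: Path) -> List[ExternalSystemRecord]:
    """Return declared deps; unreadable files and bad lines are skipped, never raised."""
    try:
        text = path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return []
    records = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # blank lines and options like "-r other.txt"
            continue
        match = _REQ_LINE.match(line)
        if match is None:  # malformed line: ignore it
            continue
        name, version = match.group(1), match.group(2)
        records.append(ExternalSystemRecord(name, version, "pypi", classify(name)))
    return records
```

Only exact `==` pins are recorded as a version in this sketch; range specifiers such as `>=2.0` yield `version=None`.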
Completes external system extraction for the languages repowise already indexes. Each parser stays under ~100 lines and skips first-party references — Cargo path/git deps, Go local replace directives, .NET ProjectReference — so workspace members never leak into the external systems list.
PipelineResult.external_systems now carries the list of declared deps
parsed from manifests during _run_ingestion. persist_pipeline_result
upserts them and then walks graph_nodes named 'external:<dep>' to fill
in the new external_system_id FK so the C4 builder can render real
typed boxes instead of bare strings.
Suffix-then-prefix matching handles common forms:
external:fastapi -> fastapi (exact)
external:fastapi.responses -> fastapi (first dotted seg)
external:@scope/pkg/sub -> @scope/pkg/sub (exact then segment)
bulk_upsert_external_systems also returns id_map so callers can do
the FK linkage without an extra round-trip; collapsing across
multi-manifest declarations picks any id (name/category are stable).
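The matching forms listed above could be sketched roughly as follows. The function resolve_external and the shape of the id map (name to id) are assumptions for illustration; in particular, the slash-prefix fallback for scoped packages is one plausible reading of "exact then segment", not necessarily what the PR does.

```python
from typing import Dict, Optional


def resolve_external(node_name: str, id_map: Dict[str, int]) -> Optional[int]:
    """Return the external system id for an 'external:<dep>' node, or None."""
    prefix = "external:"
    if not node_name.startswith(prefix):
        return None
    target = node_name[len(prefix):]

    # 1. Exact match: external:fastapi -> fastapi
    if target in id_map:
        return id_map[target]

    # 2. First dotted segment: external:fastapi.responses -> fastapi
    head = target.split(".", 1)[0]
    if head in id_map:
        return id_map[head]

    # 3. Assumed fallback: progressively shorter slash prefixes,
    #    e.g. external:@scope/pkg/sub -> @scope/pkg
    parts = target.split("/")
    for end in range(len(parts) - 1, 0, -1):
        candidate = "/".join(parts[:end])
        if candidate in id_map:
            return id_map[candidate]

    return None  # unresolved: the FK stays NULL
```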
Derives C4 L1/L2/L3 from already-persisted graph + external_systems data — no on-disk re-scan at request time, so hosted backends without filesystem access work too. Container detection re-uses the manifest paths stored in external_systems.declared_in: every parent of a declared manifest is a container root. Repos without manifests fall back to top-level dirs. Components are top-level subdirs inside the container; root-level files land in a synthetic _root bucket so they don't disappear from the diagram. Relations are rolled up from graph_edges: file→file edges are grouped by (source_container, target_container, edge_type) at L2 and by their component equivalents at L3. External-system edges piggyback on the external_system_id FK populated during ingestion — unresolved external:* nodes are dropped (a blank box is worse than no box). All dataclasses live in models.py and are SQLAlchemy/Pydantic-free so they can be unit-tested without a session and re-used by the Phase 3 Mermaid emitter.
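A rough sketch of the container detection and L2 roll-up described above, using plain tuples and strings. It assumes POSIX-style repo-relative paths and reads "parent of a declared manifest" as the manifest's directory; the helper names container_roots, container_of, and rollup are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Dict, Iterable, List, Optional, Tuple

Edge = Tuple[str, str, str]  # (source_file, target_file, edge_type)


def container_roots(manifest_paths: Iterable[str]) -> List[str]:
    """Directory of each declared manifest, deepest first."""
    roots = {str(PurePosixPath(p).parent) for p in manifest_paths}
    return sorted(roots, key=len, reverse=True)


def container_of(file_path: str, roots: List[str]) -> Optional[str]:
    """First (deepest) root that contains the file; '.' matches everything."""
    for root in roots:
        if root == "." or file_path.startswith(root + "/"):
            return root
    return None


def rollup(edges: Iterable[Edge], roots: List[str]) -> Dict[Tuple[str, str, str], int]:
    """Group file->file edges by (source_container, target_container, edge_type)."""
    counts: Counter = Counter()
    for src, dst, edge_type in edges:
        a, b = container_of(src, roots), container_of(dst, roots)
        if a is None or b is None or a == b:
            continue  # drop unplaced files and self-loops
        counts[(a, b, edge_type)] += 1
    return dict(counts)
```

The same grouping applied to component paths inside a single container would give the L3 relations.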
Three new routes under the existing /api/graph prefix:
GET /api/graph/{repo_id}/c4/l1
GET /api/graph/{repo_id}/c4/l2
GET /api/graph/{repo_id}/c4/l3?container_id=...
Router is a thin adapter over services/c4_builder — the only logic here
is dataclass → Pydantic conversion plus a 404 when an L3 container_id
doesn't resolve. Schemas live in server/schemas.py next to the rest of
the response models so OpenAPI generation picks them up uniformly.
Parser tests use tmp_path fixtures with realistic but minimal manifests and cover: dependency extraction, dev/optional bucketing, version parsing, malformed-input safety, workspace/local-path skipping, scope prefix handling for npm, Poetry vs PEP 621 for pypi, // indirect and local replace for Go, ProjectReference vs PackageReference for nuget, and the classifier's framework/service/tool/library buckets. c4_builder tests seed a two-container monorepo (packages/core + packages/web) with one cross-edge and two external deps, then assert L1 lists all externals, L2 detects both containers + the cross edge + the external edges (with no self-loops), and L3 returns the right components for one container while filtering externals to only those actually referenced from inside it. L3 also returns None for unknown containers.
Hits the C4 router through an httpx AsyncClient + ASGITransport stack
with a fresh in-memory SQLite per test. Each test seeds the graph
directly via crud helpers (no fake LLM, no ingestion pipeline) so the
endpoint behaviour can be asserted in isolation.
Covers:
- L1 returns system + user actor + every external
- L1 on an empty repo still returns the system + zero externals
- L2 detects containers, rolls file edges into container edges,
links external_system_id-backed nodes to ext:* targets
- L3 requires container_id (422 without it)
- L3 returns components for a known container + filters externals
- L3 returns 404 for an unknown container
Adds the c4 router to the test app and inlines create_test_repo to keep
the module collectable on its own — several existing server tests fail
to collect standalone because of how pytest resolves tests.unit.server,
and the new file shouldn't inherit that breakage.
…ems table in test
The c4_builder package __init__.py was silently excluded by the local _*.py pattern in .git/info/exclude. Tests passed locally because the file was present on disk, but CI saw an implicit namespace package with no build_l1/build_l2/build_l3 attributes. Also updates test_base_includes_all_models to include the new external_systems table introduced in the ExternalSystem migration.
swati510
approved these changes
May 17, 2026
Closes #203.
Adds a new C4 architecture view at /repos/[id]/c4: System Context → Containers → Components, with drill-down, URL-synced state, and a per-node inspector that surfaces module health, top contributors, and the generated wiki page inline. Does not replace the existing dependency graph; ships as a sibling tab.
Backend
- external_systems table + Alembic migration. Manifest parsers for npm, pypi, cargo, go, nuget extract a first-class ExternalSystem registry at ingestion time; the existing external:* graph nodes link to it.
- c4_builder service derives L2 containers (workspace packages, top-dir fallback) and L3 components (sub-modules) from the persisted graph, with no on-disk rescan at request time.
- GET /api/graph/{repo_id}/c4/{l1,l2,l3}.
Frontend
- @repowise-dev/ui/c4: <C4Diagram> (React Flow + ELK layered layout), 5 typed node kinds (System / Person / External / Container / Component), relation edges, level tabs, breadcrumb, legend, basic and rich inspector panels, layout + keyboard hooks. All visual + state logic lives here so the hosted frontend can reuse it.
- /repos/[id]/c4 with nuqs-synced ?level= / ?container= params, plus a sidebar entry.
Export
- C4Context/C4Container/C4Component) + GET /api/graph/{repo_id}/c4/mermaid?level=&container_id=.
Migration note
Existing indexed repos need a re-index to populate external_systems. Without it, L1 will show an empty external-systems list; L2 / L3 still work (they don't depend on external_system_id).
Test plan
- repowise init --index-only on the repowise repo itself; /c4 shows 5 containers (cli, core, server, ui, web) with cross-edges
- npm run type-check green on packages/ui and packages/web