feat(security): #495 - behavioral SBOM at compile time + perry audit --sbom #953
Merged
This was referenced May 17, 2026
Every Perry compile now writes a per-module behavioral manifest to
<project>/.perry-cache/audit.json. The manifest captures, for each
source module, the stdlib symbols actually called by the lowered HIR:
{
  "version": 1,
  "modules": [{
    "source": "/repo/main.ts",
    "package": null,   // or "@scope/pkg" if under node_modules/
    "stdlib": {
      "fs": ["readFileSync"],
      "path": ["join"]
    }
  }]
}
The JSON shape is versioned (`version: 1`). Keys are sorted (BTreeMap)
and method lists are deduplicated and sorted, so the bytes are
deterministic across builds. That enables a supply-chain review
workflow: run `perry audit --sbom > before.txt`, change package.json,
rebuild, run `perry audit --sbom > after.txt`, and diff the two files.
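As a minimal sketch of why sorted maps and sets give byte-stable output, the snippet below records calls in a `BTreeMap` of `BTreeSet`s and renders them by hand. `StdlibUsage`, `record`, and `to_json` are hypothetical names for illustration, not Perry's actual types, and the real manifest is presumably produced by a proper JSON serializer.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical per-module record: namespace -> set of method names.
/// BTreeMap and BTreeSet iterate in sorted order and absorb duplicates,
/// so the rendered text does not depend on the order calls were seen.
#[derive(Default)]
pub struct StdlibUsage {
    calls: BTreeMap<String, BTreeSet<String>>,
}

impl StdlibUsage {
    /// Record one (namespace, method) call; repeated calls are deduplicated.
    pub fn record(&mut self, namespace: &str, method: &str) {
        self.calls
            .entry(namespace.to_string())
            .or_default()
            .insert(method.to_string());
    }

    /// Render the `stdlib` object as compact JSON. The names are plain
    /// identifiers here, so this sketch does no string escaping.
    pub fn to_json(&self) -> String {
        let namespaces: Vec<String> = self
            .calls
            .iter()
            .map(|(ns, methods)| {
                let list: Vec<String> =
                    methods.iter().map(|m| format!("\"{m}\"")).collect();
                format!("\"{ns}\":[{}]", list.join(","))
            })
            .collect();
        format!("{{{}}}", namespaces.join(","))
    }
}
```

Recording `path.join` before `fs.readFileSync`, or recording `fs.readFileSync` twice, yields the same string as recording each once in alphabetical order. A plain `diff` relies on exactly that property.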
The manifest is the foundation for follow-up issues:
- #501 will consume the SBOM to enforce host-controlled per-package
  capabilities ("this dep must not call child_process.*").
- #496 (--lockdown) will flag violations from the same data.
- #502 (URL/host egress allowlist) will graft `literal_hosts` onto
  the same shape.
The audit walk runs in the platform-agnostic
`compile_command` driver, walking each `Module` in `ctx.native_modules`
*after* `collect_modules` finalizes the dep graph. Every backend
(LLVM / WASM / ArkTS / HarmonyOS / Glance / SwiftUI / JS) inherits
the SBOM emission from one choke point.
`perry-hir::audit::audit_module(&Module, source)` returns a
`ModuleAudit`. The traversal visits `init`, every function body, and
every class method body. Two source channels are collected:
- `NativeMethodCall { module, method, .. }` — the general-shape
variant for stdlib calls after alias resolution.
- Specialized HIR variants for hot paths
(`FsReadFileSync`, `FsWriteFileSync`, `FsExistsSync`, …,
`PathJoin`, `PathDirname`, `PathResolve`, …,
`ProcessEnv`, `ProcessCwd`, `ProcessArgv`,
`Process{Stdin,Stdout,Stderr}IsTTY`, `ProcessStdout{Columns,Rows}`,
`TtyIsAtty`, `FileURLToPath`) mapped through
`specialized_stdlib_call()` to their equivalent (namespace, method)
pair. Without this, a host that only calls `fs.readFileSync` would
appear to make zero stdlib calls, defeating the SBOM.
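The two channels can be illustrated with a heavily reduced sketch. The `Expr` enum below is a hypothetical stand-in for the HIR with only a handful of variants, and the `("process", "cwd")` mapping is an assumption about what "equivalent pair" means. Only the names `NativeMethodCall`, `FsReadFileSync`, `PathJoin`, `ProcessCwd`, and `specialized_stdlib_call` come from the text above.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, reduced stand-in for HIR expressions.
pub enum Expr {
    /// Channel 1: general-shape stdlib call after alias resolution.
    NativeMethodCall { module: String, method: String, args: Vec<Expr> },
    /// Channel 2: specialized hot-path variants.
    FsReadFileSync(Box<Expr>),
    PathJoin(Vec<Expr>),
    ProcessCwd,
    /// Anything that is not a stdlib call.
    Literal(String),
}

/// Map a specialized variant to its (namespace, method) pair.
fn specialized_stdlib_call(expr: &Expr) -> Option<(&'static str, &'static str)> {
    match expr {
        Expr::FsReadFileSync(_) => Some(("fs", "readFileSync")),
        Expr::PathJoin(_) => Some(("path", "join")),
        Expr::ProcessCwd => Some(("process", "cwd")), // assumed mapping
        _ => None,
    }
}

pub type Usage = BTreeMap<String, BTreeSet<String>>;

/// Record every stdlib call reachable from `expr`, covering both channels.
pub fn walk(expr: &Expr, out: &mut Usage) {
    if let Some((ns, method)) = specialized_stdlib_call(expr) {
        out.entry(ns.to_string()).or_default().insert(method.to_string());
    }
    match expr {
        Expr::NativeMethodCall { module, method, args } => {
            out.entry(module.clone()).or_default().insert(method.clone());
            for arg in args {
                walk(arg, out);
            }
        }
        Expr::FsReadFileSync(inner) => walk(inner, out),
        Expr::PathJoin(args) => {
            for arg in args {
                walk(arg, out);
            }
        }
        Expr::ProcessCwd | Expr::Literal(_) => {}
    }
}
```

If the `specialized_stdlib_call` step were dropped, walking `FsReadFileSync(PathJoin([ProcessCwd, "a.txt"]))` would record nothing at all, which is the gap described above.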
The CLI re-uses the existing `perry audit` subcommand (which talks to
the remote security scanner) with a new `--sbom` flag that
short-circuits before the remote call. It walks up the directory tree
to find `.perry-cache/audit.json` (the same way `perry compile` walks
up to find package.json), reads the manifest, and then either groups
modules by owning npm package (text mode) or dumps the raw JSON
(`--format json` mode).
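The walk-up lookup can be sketched with `Path::ancestors` from the standard library. `find_manifest` is a hypothetical helper name, and this is only an illustration of the search order, not the code in the PR.

```rust
use std::path::{Path, PathBuf};

/// Return the nearest `.perry-cache/audit.json` at or above `start`.
/// `ancestors()` yields `start` itself first, then each parent in turn,
/// so the closest manifest wins.
pub fn find_manifest(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .map(|dir| dir.join(".perry-cache").join("audit.json"))
        .find(|candidate| candidate.is_file())
}
```

Called from a nested source directory, this returns the manifest in the project root, or `None` when no build has produced one yet.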
9 unit tests in `perry-hir::audit::tests`:
- `empty_module_has_no_records`
- `top_level_native_call_recorded`
- `duplicate_calls_dedupe`
- `nested_call_recorded` (inside `Stmt::If`)
- `package_name_extracted_from_node_modules_path`
- `scoped_package_name_extracted`
- `nested_node_modules_returns_innermost`
- `user_source_has_no_package`
- `serializes_to_stable_json` — pins the byte-deterministic shape
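The four package-name tests listed above suggest the following behavior, sketched here under assumptions: `package_name` is a hypothetical function name, and the sketch only handles `/`-separated paths.

```rust
/// Derive the owning npm package from a module path.
/// Returns None for user source (no `node_modules` component).
pub fn package_name(source: &str) -> Option<String> {
    let parts: Vec<&str> = source.split('/').collect();
    // rposition selects the innermost `node_modules` component.
    let idx = parts.iter().rposition(|p| *p == "node_modules")?;
    let first = *parts.get(idx + 1)?;
    if first.is_empty() {
        return None;
    }
    if first.starts_with('@') {
        // Scoped packages span two path components: "@scope/pkg".
        let second = *parts.get(idx + 2)?;
        if second.is_empty() {
            return None;
        }
        Some(format!("{first}/{second}"))
    } else {
        Some(first.to_string())
    }
}
```

With this sketch, `/repo/node_modules/a/node_modules/b/index.js` resolves to `b`, `/repo/node_modules/@scope/pkg/index.js` to `@scope/pkg`, and `/repo/main.ts` to `None`, which matches the `"package": null` shown in the manifest example.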
End-to-end smoke (cleaned up after the test):
- `import fs, path; fs.readFileSync(...); path.join(...)` →
audit.json correctly captures `"fs": ["readFileSync"]` +
`"path": ["join"]`.
- `perry audit --sbom` pretty-prints the same data grouped by
package.
Out of scope (follow-ups):
- Literal `fetch`/`http.get` URLs: #502 territory; will graft on as
  a `literal_hosts` key under the same versioned shape.
- Native-library symbol references (FFI registry).
- `perry audit --sbom --diff`: the deterministic JSON shape
  already enables the workflow via plain `diff`; a built-in `--diff`
  is a follow-up.
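Until a built-in `--diff` exists, plain `diff` on the text output covers the workflow. For readers who want a structured comparison, one possible shape is sketched below. It assumes the per-module `stdlib` object has already been parsed into sorted maps. `Usage` and `added_symbols` are hypothetical names and say nothing about how the follow-up will be designed.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical parsed form of a module's `stdlib` object.
pub type Usage = BTreeMap<String, BTreeSet<String>>;

/// Symbols present in `after` but not in `before`, as "namespace.method".
/// The result is sorted because both containers iterate in sorted order.
pub fn added_symbols(before: &Usage, after: &Usage) -> Vec<String> {
    let mut added = Vec::new();
    for (ns, methods) in after {
        for method in methods {
            let known = before.get(ns).map_or(false, |set| set.contains(method));
            if !known {
                added.push(format!("{ns}.{method}"));
            }
        }
    }
    added
}
```

A dependency bump that starts calling `child_process.exec` would show up as a single new entry, which is the kind of change a reviewer wants surfaced.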
Acceptance:
- [x] `.perry-cache/audit.json` written every build
- [partial] Per-module breakdown: stdlib symbols ✓; literal hosts/URLs + native lib symbols deferred (#502, FFI registry follow-up)
- [x] `perry audit --sbom` prints human-readable summary
- [partial] `perry audit --diff` — deterministic JSON enables the workflow today; built-in `--diff` deferred
- [x] Foundation for issues that consume this manifest (#501, #496)
Closes #495.
Summary
Every Perry compile now writes a per-module behavioral manifest to `<project>/.perry-cache/audit.json`, a JSON document capturing the stdlib symbols each source module actually calls. The manifest is the foundation for the rest of the supply-chain hardening series (#501, #496, #502) and gives reviewers a way to see exactly what surface a dependency touches without rebuilding the binary.
- Zero runtime cost: the walk runs at compile time over the lowered HIR.
- Cross-platform: runs in the platform-agnostic `compile_command` driver, so every backend (LLVM / WASM / ArkTS / HarmonyOS / Glance / SwiftUI / JS) inherits SBOM emission from one choke point.

Example output
For a `main.ts` that calls `fs.readFileSync` and `path.join`, `.perry-cache/audit.json` contains:
{
  "version": 1,
  "modules": [{
    "source": "/repo/main.ts",
    "package": null,
    "stdlib": { "fs": ["readFileSync"], "path": ["join"] }
  }]
}
`perry audit --sbom` then prints the same data grouped by package.
The JSON shape is versioned (`version: 1`) and byte-deterministic across builds (BTreeMap keys + sorted method lists), enabling the supply-chain review workflow: `perry audit --sbom > before.txt`, package.json change, rebuild, `diff before.txt after.txt`.

Walker
`perry-hir::audit::audit_module(&Module, source)` walks `init` + every function body + every class method body, capturing two HIR channels:
- `NativeMethodCall { module, method, .. }`: the general-shape variant for stdlib calls after alias resolution.
- Specialized HIR variants for hot paths (`FsReadFileSync`, `FsWriteFileSync`, `FsExistsSync`, …, `PathJoin`, `PathDirname`, `PathResolve`, …, `ProcessEnv`, `ProcessCwd`, `ProcessArgv`, `Process{Stdin,Stdout,Stderr}IsTTY`, `TtyIsAtty`, `FileURLToPath`) mapped through `specialized_stdlib_call()` to their equivalent (namespace, method) pair. Without this, a host that only calls `fs.readFileSync` would appear to make zero stdlib calls, defeating the SBOM; the smoke test in development surfaced this exact gap.

CLI
Adds a `--sbom` flag to the existing `perry audit` subcommand. When passed, it short-circuits before the remote security-scan call and prints the local manifest instead. It walks up the directory tree to find `.perry-cache/audit.json` (the same way `perry compile` walks up to find `package.json`).
The pre-existing `perry audit` remote-scan behavior is preserved; only the new `--sbom` flag activates the SBOM viewer.

Test coverage
9 unit tests in `perry-hir::audit::tests` cover the walker, including `node_modules/` and user-source paths.
End-to-end smoke confirms the produced `audit.json` round-trips through `perry audit --sbom`.

Out of scope (#495 follow-ups documented in code + docs)
- Literal `fetch`/`http.get` URLs (security: compile-time URL/host egress allowlist, #502 territory).
- `perry audit --sbom --diff`: the deterministic JSON enables `diff`/`git diff` today; `--diff` itself is a follow-up.
The manifest shape's `version: 1` field exists so consumers can detect when new top-level keys land.

Acceptance
- `.perry-cache/audit.json` written every build
- `perry audit --sbom` prints human-readable summary
- `perry audit --diff`: deterministic JSON enables the workflow today via plain `diff`; built-in `--diff` deferred
- Foundation for issues that consume this manifest (security: `--lockdown` compile flag, refuse risky linkage, #496 lockdown mode)

Notes
No `Cargo.toml` version bump, no `CLAUDE.md` version line touch, no `CHANGELOG.md` entry; the maintainer folds those in at merge time.