pkglib/pkg: hash-dir, update-hashes subcommand, dep-tag propagation#4210

Draft
rucoder wants to merge 4 commits into linuxkit:master from rucoder:rucoder/pkg-hash-dir

Contributor

@rucoder rucoder commented Mar 15, 2026

Problem

Building EVE-OS (and similar multi-package LinuxKit projects) requires that:

  1. A source-code change to a leaf package (e.g. pkg/zfs) propagates a new hash to all downstream consumers (pkg/pillar, pkg/dom0-ztools, etc.) so Docker cache tags change and images are rebuilt.
  2. Make can drive individual package builds as file targets — skipping the actual Docker build when nothing changed — rather than always running linuxkit pkg build.
  3. The contentHash (dirty-tree computation) must not be pathologically slow on repos with large untracked directories (e.g. linux-firmware, build artefacts).
  4. Versioned packages that have no build.yml (only build-2.4.yml, build-2.3.yml, …) must work without a symlink.

Issues: #4180, related to #4195.

Solution (4 commits)

1. pkglib: scope contentHash to package directory

contentHash previously ran git ls-files on the entire repo. Scope it to pkg/<name>/ so it only hashes files relevant to the package being tagged.

2. pkglib: add --hash-dir flag; write YAML build manifests from show-tag

Add a --hash-dir <path> persistent flag shared by show-tag and build.

Write side (show-tag --hash-dir): after computing a tag, write a YAML manifest to <hash-dir>/<pkgname>.hash with write-if-changed semantics (mtime only updates when the tag actually changes, so downstream make targets are not spuriously re-triggered):

tag: lfedge/eve-zfs:cbda1fe2...-2.4
build-yml: build-2.4.yml
deps:
  - path: pkg/alpine
    tag: lfedge/eve-alpine:abc123...

Read side (pkg build --hash-dir): @lkt:pkg: and @lkt:pkgs: references are resolved from the stored hash files rather than by recursive NewFromConfig calls. This eliminates dependency cycles and correctly propagates version-specific build variants (e.g. ZFS_VERSION selecting build-2.4.yml) through downstream combined hashes.

pkg build auto-detects build-yml: when --hash-dir is set and --build-yml was not given, build reads the effective build-yml from the package's hash file. No per-package --build-yml flag needed in the calling Makefile.
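
A rough sketch of that precedence rule: an explicit `--build-yml` wins, otherwise the value recorded in the hash file is used. The function name `effectiveBuildYML` is hypothetical, and the line-based scan stands in for a proper YAML parser only to keep the example dependency-free.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// effectiveBuildYML (hypothetical helper) returns the explicit flag value if
// one was given, else the top-level build-yml entry of the manifest text,
// else "". Indented lines (such as entries under deps:) are ignored because
// the prefix must start in column 0.
func effectiveBuildYML(explicit, manifest string) string {
	if explicit != "" {
		return explicit
	}
	sc := bufio.NewScanner(strings.NewReader(manifest))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "build-yml:") {
			return strings.TrimSpace(strings.TrimPrefix(line, "build-yml:"))
		}
	}
	return ""
}

func main() {
	manifest := "tag: example/zfs:abc-2.4\nbuild-yml: build-2.4.yml\n" // illustrative values
	fmt.Println(effectiveBuildYML("", manifest))
	fmt.Println(effectiveBuildYML("build-2.3.yml", manifest))
}
```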

3. pkglib: filter @lkt:pkgs: wildcard to Dockerfile-declared ARGs; relax build.yml requirement

When resolving @lkt:pkgs:../* wildcards, parse the anchor package's Dockerfile to collect declared ARG names and skip any glob-matched package whose computed key is not declared. This eliminates oscillating hashes caused by packages referencing themselves or unrelated neighbours.

Also: return "" (lenient) instead of erroring when a versioned package has no default build.yml — makes ZFS-style packages first-class citizens.

4. pkg: add update-hashes subcommand

New subcommand: linuxkit pkg update-hashes --hash-dir <dir> [path[:build-yml]]...

  • Accepts explicit build-yml per package (pkg/zfs:build-2.4.yml) for versioned packages.
  • Builds a dependency graph from @lkt:pkgs: build args, topologically sorts it (Kahn's algorithm), and calls show-tag --hash-dir in dependency order so each package's hash file is ready before its consumers are processed.
  • --strict-deps: error if a dep is missing from the provided list.

This is the entry-point that a Makefile calls once per invocation to refresh all hash files before per-package linuxkit pkg build calls.

Usage (EVE Makefile pattern)

# Stage 1 — always run; fast (hash computation only)
update-hashes: $(LINUXKIT)
    $(LINUXKIT) pkg update-hashes --hash-dir .gen-deps \
        $(foreach p,$(PKGS),$p:$(call get_pkg_build_yml,$(notdir $p)))

# Stage 2 — PHONY; linuxkit checks docker cache, skips actual build if cached
.PHONY: eve-%
eve-%: update-hashes | $(LINUXKIT)
    $(LINUXKIT) pkg build --hash-dir .gen-deps pkg/$*

Testing

  • pkg/zfs source change → show-tag pkg/zfs reports a new hash → update-hashes propagates the new hash to pkg/pillar, pkg/dom0-ztools, pkg/vtpm.
  • ZFS_VERSION=2.4 update-hashes → pkg/zfs gets build-2.4.yml recorded in its hash file → pkg build --hash-dir picks it up automatically.
  • No hang/infinite recursion with circular @lkt:pkgs:../* patterns.
  • contentHash completes in normal time on repos with large untracked trees.

🤖 Generated with Claude Code

contentHash previously ran git ls-files on the entire repo, causing
extremely slow dirty-hash computation when large untracked directories
exist anywhere in the repo (e.g. linux-firmware trees, build artifacts).

Scope both the tracked and untracked file listings to the specific
package path so contentHash only hashes files relevant to the package.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
@rucoder rucoder changed the title pkglib/pkg: hash-dir, update-hashes subcommand, dep-tag propagation [WIP] pkglib/pkg: hash-dir, update-hashes subcommand, dep-tag propagation Mar 15, 2026
@rucoder rucoder force-pushed the rucoder/pkg-hash-dir branch from ecb198f to 4fa150a Compare March 16, 2026 23:06
rucoder and others added 3 commits March 17, 2026 00:28
Add --hash-dir <path> persistent flag to `linuxkit pkg` (shared by
show-tag and build subcommands).

Write side (show-tag --hash-dir):
  After computing a package tag, write a YAML manifest to
  <hash-dir>/<pkgname>.hash with write-if-changed semantics: the file
  mtime is only updated when the tag actually changes, so downstream make
  DEPS_FORCE rules are not spuriously triggered.

  Format:
    tag: lfedge/eve-zfs:cbda1fe2...-2.4
    build-yml: build-2.4.yml

Read side (pkg build --hash-dir and hash computation):
  @lkt:pkg: and @lkt:pkgs: dep references are resolved by reading the
  stored tag from <hash-dir>/<pkgname>.hash rather than calling
  NewFromConfig recursively. This eliminates dependency cycles and
  correctly propagates version-specific build variants (e.g. ZFS_VERSION
  selecting build-2.4.yml) through downstream combined hashes.

  When a dep's hash file is absent:
  - --strict-deps: error ("run update-hashes first")
  - default (lenient): fall back to NewFromConfig with build.yml

Hash propagation:
  NewFromConfig now incorporates resolved @lkt: dep tags into the
  combined package hash when --hash-dir is set. This means that when
  pkg/zfs hash changes (e.g. due to ZFS_VERSION bump), downstream
  packages (pillar, dom0-ztools, vtpm) also get new combined hashes,
  giving full tag-based cache correctness without DEPS_FORCE as a crutch.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
… build.yml requirement

- Export PkgImageName and DockerfileARGNames for use in pkg_updatehashes.go
- In @lkt:pkgs: wildcard resolution, parse the anchor package's Dockerfile to
  collect declared ARG names; skip any glob-matched package whose computed key
  is not declared — eliminates circular/oscillating combined hashes caused by
  packages referencing themselves or unrelated neighbours
- Use hash manifest's stored build-yml (not DefaultPkgBuildYML) when resolving
  image names for versioned packages like pkg/zfs that have no build.yml
- In resolveDepTag: check build.yml existence before NewFromConfig fallback;
  return "" (lenient) instead of erroring when no default build.yml exists —
  makes versioned packages first-class citizens without a build.yml symlink
- Export ReadHashManifest for use outside pkglib

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
…h file

update-hashes:
- New cobra subcommand 'linuxkit pkg update-hashes --hash-dir <dir> [path[:build-yml]]...'
- Parses all package specs (with explicit build-yml for versioned packages)
- Builds dep graph from @lkt:pkg:/pkgs: build args filtered by Dockerfile ARGs
- Topological sort (Kahn's algorithm) ensures deps processed before consumers
- Calls NewFromConfig per package in topo order; writes HashManifest with
  tag, build-yml, and deps: [{path, tag}] populated
- --strict-deps flag errors when a dep package is missing from the list

pkg build:
- When --hash-dir is set and --build-yml was not explicitly provided,
  reads effective build-yml from the package's hash file automatically;
  simplifies EVE Makefile (no per-package --build-yml needed in eve-% rule)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
@rucoder rucoder force-pushed the rucoder/pkg-hash-dir branch from 4fa150a to 516fd4e Compare March 17, 2026 00:32
@rucoder rucoder marked this pull request as draft March 24, 2026 14:57
@rucoder rucoder changed the title [WIP] pkglib/pkg: hash-dir, update-hashes subcommand, dep-tag propagation pkglib/pkg: hash-dir, update-hashes subcommand, dep-tag propagation Mar 24, 2026