perf(ci): version refactor (remove Xref/status.sh)#19424
perf(ci): version refactor (remove Xref/status.sh)#19424
Conversation
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The build cache statistics added to
go_buildrely ongo build -vline counts and a fixed/tmp/go-build-compiled-countfile, which can be both inaccurate (non-1:1 with compiled units) and racy if multiple builds run in parallel; consider using a unique temp file per invocation and/or gating this logging behind a flag or env var. - The
git ls-files | xargs touchstep in the composite action mutates mtimes for every tracked file, which could interfere with incremental build tooling or other languages in the same job; consider narrowing this to Go source/test files or to the directories actually used by Go.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The build cache statistics added to `go_build` rely on `go build -v` line counts and a fixed `/tmp/go-build-compiled-count` file, which can be both inaccurate (non-1:1 with compiled units) and racy if multiple builds run in parallel; consider using a unique temp file per invocation and/or gating this logging behind a flag or env var.
- The `git ls-files | xargs touch` step in the composite action mutates mtimes for every tracked file, which could interfere with incremental build tooling or other languages in the same job; consider narrowing this to Go source/test files or to the directories actually used by Go.
## Individual Comments
### Comment 1
<location path="scripts/go-tool.sh" line_range="108" />
<code_context>
- invoke_go_build -o "$output" "${dirs[@]}"
+ local total
+ total=$(go list -deps "${dirs[@]}" 2>/dev/null | wc -l | tr -d ' ')
+ local compiled_count=/tmp/go-build-compiled-count
+ invoke_go_build -o "$output" "${dirs[@]}" 2> >(tee >(wc -l | tr -d ' ' > "$compiled_count") >&2)
+ wait
</code_context>
<issue_to_address>
**issue (bug_risk):** Use a per-process or per-invocation temp file instead of a fixed path for compiled_count
A fixed `/tmp` path means concurrent runs of this script (or multiple `go_build` calls in one job) will race on the same `compiled_count` file, causing corrupted or incorrect counts. Please use a unique temp file per invocation (e.g., via `mktemp` or including `$$`/`$BASHPID` and job name in the filename) and clean it up afterward.
</issue_to_address>
### Comment 2
<location path="scripts/go-tool.sh" line_range="112-115" />
<code_context>
+ invoke_go_build -o "$output" "${dirs[@]}" 2> >(tee >(wc -l | tr -d ' ' > "$compiled_count") >&2)
+ wait
+ local compiled
+ compiled=$(cat "$compiled_count" 2>/dev/null | tr -d ' ')
+ local cached=$((total - compiled))
+ echo >&2 "Build cache: ${cached}/${total} deps cached ($compiled compiled)"
+ if [[ "$total" -gt 0 ]] && [[ $((cached * 100 / total)) -lt 50 ]]; then
+ echo >&2 "WARNING: Build cache hit rate below 50% (${cached}/${total}). Significant code changes invalidate the cached build — expect slower compilation."
+ fi
</code_context>
<issue_to_address>
**issue (bug_risk):** Harden arithmetic against empty or non-numeric totals/compiled counts
If `go list` or the build step fails (or their output format changes), `total`/`compiled` can be empty or non-numeric. Then `cached=$((total - compiled))` and the percentage calculation will throw a bash arithmetic error and abort. Please default missing values to 0 and ensure both vars are integers (e.g. `${total:-0}`, `${compiled:-0}` with a simple regex check), or skip the cache-reporting logic when the counts aren’t valid.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Images are ready for the commit at a564561. To use with deploy scripts, first |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19424 +/- ##
==========================================
+ Coverage 49.26% 49.28% +0.02%
==========================================
Files 2735 2735
Lines 206138 206170 +32
==========================================
+ Hits 101545 101604 +59
+ Misses 97046 97022 -24
+ Partials 7547 7544 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
a6fff2b to
63a3bd8
Compare
a6cc9da to
21b91d0
Compare
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- go-tool.sh unconditionally generates pkg/version/internal/zversion.go and reads the *VERSION files for every invocation; consider short‑circuiting for tools that don’t need version data (or only generating when TOOL is build/test) to avoid unexpected failures when those files are absent and to minimize unnecessary writes.
- Because zversion.go is now generated into the repo tree, please ensure it’s consistently excluded from version control and tooling (e.g., .gitignore, linters, IDEs) so local
go test/go buildruns don’t leave a perpetually dirty working tree or confuse static analysis.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- go-tool.sh unconditionally generates pkg/version/internal/zversion.go and reads the *VERSION files for every invocation; consider short‑circuiting for tools that don’t need version data (or only generating when TOOL is build/test) to avoid unexpected failures when those files are absent and to minimize unnecessary writes.
- Because zversion.go is now generated into the repo tree, please ensure it’s consistently excluded from version control and tooling (e.g., .gitignore, linters, IDEs) so local `go test` / `go build` runs don’t leave a perpetually dirty working tree or confuse static analysis.
## Individual Comments
### Comment 1
<location path="scripts/go-tool.sh" line_range="37" />
<code_context>
+ # Konflux/release builds set BUILD_TAG to the full version string.
+ # Use it directly (the Docker build context has no .git directory).
+ main_version="${BUILD_TAG}"
+ git_short_sha="$(echo "$BUILD_TAG" | grep -oP 'g\K[0-9a-f]+$' || echo "")"
else
- die "Malformed status.sh output line ${line}"
</code_context>
<issue_to_address>
**issue (bug_risk):** Using `grep -P` may break in minimal environments that lack PCRE-enabled grep
Because this script runs in build tooling and containers, relying on `grep -P` can cause failures where PCRE-enabled grep isn’t available (e.g., busybox images). Please switch to a POSIX-compatible pattern (e.g., via `sed`, `awk`, or `grep` + `cut`) for extracting the short SHA so it works reliably across environments.
</issue_to_address>
### Comment 2
<location path="operator/Makefile" line_range="351-354" />
<code_context>
BUILDKIT_PROGRESS=plain ../scripts/docker-build.sh \
-t ${IMG} \
$(if $(GOARCH),--build-arg TARGET_ARCH=$(GOARCH)) \
+ --build-arg BUILD_TAG=$(shell cd .. && make --quiet --no-print-directory tag) \
-f $< \
..
</code_context>
<issue_to_address>
**suggestion:** The `BUILD_TAG` make invocation is duplicated and could be centralized
This `$(shell cd .. && make --quiet --no-print-directory tag)` call is now used in multiple targets (e.g., `docker-build` and `docker-buildx`). To keep tag computation consistent and easier to change later, please extract it into a Make variable (e.g., `BUILD_TAG_CMD` or `IMAGE_TAG`) and reference that in each target instead.
Suggested implementation:
```
# Force re-building the file to make sure the current environment is correctly reflected.
.PHONY: build/Dockerfile.gen
sed -e 's,$${ROX_IMAGE_FLAVOR},$(ROX_IMAGE_FLAVOR),g; s,$${BUILD_IMAGE_VERSION},$(shell sed 's/\s*\#.*//' ../BUILD_IMAGE_VERSION),' < $< > $@
# Centralized build tag computation so all targets use the same tag logic.
BUILD_TAG ?= $(shell cd .. && make --quiet --no-print-directory tag)
.PHONY: docker-build
docker-build: build/Dockerfile.gen ## Build docker image with the operator.
BUILDKIT_PROGRESS=plain ../scripts/docker-build.sh \
-t ${IMG} \
$(if $(GOARCH),--build-arg TARGET_ARCH=$(GOARCH)) \
--build-arg BUILD_TAG=$(BUILD_TAG) \
-f $< \
..
```
Any other targets in `operator/Makefile` that currently invoke:
`$(shell cd .. && make --quiet --no-print-directory tag)`
(e.g., in `docker-buildx` or elsewhere) should be updated to use `$(BUILD_TAG)` instead, for example:
`--build-arg BUILD_TAG=$(BUILD_TAG)` or `-t ${IMG}:$(BUILD_TAG)` depending on how the tag is used.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Go's test cache computes a test binary ID that includes -X ldflags. Since -X MainVersion and -X GitShortSha change on every commit, the test binary ID changes too, causing 100% test cache misses (Phase 1). Fix: generate pkg/version/internal/zversion.go with an init() function instead of passing version data via -X ldflags. For tests, use only the base tag (e.g. "4.11.x") which is stable across commits. For builds, use the full git-describe version. This eliminates -X ldflags entirely from the link step, stabilizing the link ActionID. Result: test binary IDs are constant across commits (fixing Phase 1), and build cache hits improve from ~0% to ~99.9% (only 2/4403 packages recompile — version packages whose output is byte-identical anyway). This is the companion to PR #19395 (mtime fix for Phase 2). Together they enable 4.5-52x test speedup. Addresses concerns from PR #15609: -coverprofile caching is fixed in Go 1.25 (we're on 1.25.5), and GOTAGS=release shares 93% of cache entries with default — separation is counterproductive. Partially generated by AI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
21b91d0 to
a564561
Compare
|
@sourcery-ai review |
SourceryAI
left a comment
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The
generate_version_filetest path (git describe --tags --abbrev=0 --exclude '*-nightly-*') hard-fails when no suitable tag exists; consider a fallback (e.g., a default version string or branch+SHA) so local or ephemeral environments without tags don't breakgo test. - The
git_short_shaextraction fromBUILD_TAGviased -n 's/.*g\([0-9a-f]\{1,\}\)$/\1/p'assumes a very specific version suffix; if Konflux/release tags change format, this will silently yield an empty SHA, so it may be safer to validate and log/die on unexpected formats or to pass the SHA explicitly as a separate build arg.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `generate_version_file` test path (`git describe --tags --abbrev=0 --exclude '*-nightly-*'`) hard-fails when no suitable tag exists; consider a fallback (e.g., a default version string or branch+SHA) so local or ephemeral environments without tags don't break `go test`.
- The `git_short_sha` extraction from `BUILD_TAG` via `sed -n 's/.*g\([0-9a-f]\{1,\}\)$/\1/p'` assumes a very specific version suffix; if Konflux/release tags change format, this will silently yield an empty SHA, so it may be safer to validate and log/die on unexpected formats or to pass the SHA explicitly as a separate build arg.
## Individual Comments
### Comment 1
<location path="scripts/go-tool.sh" line_range="25-12" />
<code_context>
+generate_version_file() {
</code_context>
<issue_to_address>
**question:** Consider guarding git invocations when TOOL=test to avoid failures in build contexts without a .git directory
In the `TOOL == "test"` path, `generate_version_file` always runs `git describe` from `${REPO_ROOT}`. In Docker or other environments without a `.git` directory, this will hit `die "git describe failed"`, causing tests to hard-fail even when version info isn’t essential.
To avoid that, you could either (a) skip version generation in test mode when `.git` is absent, or (b) let `BUILD_TAG` override the `TOOL == "test"` behavior so tests still run in non‑git environments.
</issue_to_address>
### Comment 2
<location path="operator/Dockerfile" line_range="53-54" />
<code_context>
+
+# go-tool.sh needs BUILD_TAG to generate version info when there's no .git
+# directory. The Makefile sets this via --build-arg from `make tag`.
+ARG BUILD_TAG
+ENV BUILD_TAG=${BUILD_TAG}
# We've been historically building operator without CGO both upstream and downstream.
</code_context>
<issue_to_address>
**question (bug_risk):** Clarify behavior when BUILD_TAG is not provided to the Docker build
With the current setup, a plain `docker build` without `--build-arg BUILD_TAG=...` will leave `BUILD_TAG` empty. In images without `.git`, `go-tool.sh` will then fail when it tries `git describe`. Consider either enforcing that `BUILD_TAG` is set (e.g., `RUN test -n "$BUILD_TAG"`) or adding a non-`.git`-based fallback for version generation.
</issue_to_address>Hi @davdhacs! 👋
Thanks for trying out Sourcery by commenting with @sourcery-ai review! 🚀
Install the sourcery-ai bot to get automatic code reviews on every pull request ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.| @@ -12,46 +12,55 @@ die() { | |||
| } | |||
There was a problem hiding this comment.
question: Consider guarding git invocations when TOOL=test to avoid failures in build contexts without a .git directory
In the TOOL == "test" path, generate_version_file always runs git describe from ${REPO_ROOT}. In Docker or other environments without a .git directory, this will hit die "git describe failed", causing tests to hard-fail even when version info isn’t essential.
To avoid that, you could either (a) skip version generation in test mode when .git is absent, or (b) let BUILD_TAG override the TOOL == "test" behavior so tests still run in non‑git environments.
| ARG BUILD_TAG | ||
| ENV BUILD_TAG=${BUILD_TAG} |
There was a problem hiding this comment.
question (bug_risk): Clarify behavior when BUILD_TAG is not provided to the Docker build
With the current setup, a plain docker build without --build-arg BUILD_TAG=... will leave BUILD_TAG empty. In images without .git, go-tool.sh will then fail when it tries git describe. Consider either enforcing that BUILD_TAG is set (e.g., RUN test -n "$BUILD_TAG") or adding a non-.git-based fallback for version generation.
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- In go-tool.sh, the non-test path without BUILD_TAG assumes a functional git repo and tags; you might want a more explicit error (e.g., detecting missing .git or no tags) to avoid opaque
git describe failedmessages when the script is used in reduced build contexts. - The BUILD_TAG parsing for GitShortSha in go-tool.sh assumes a
...g<sha>suffix and silently falls back to an empty SHA if the format changes; consider either validating the expected format with a clearer failure or logging that GitShortSha will be empty to make debugging version issues easier.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In go-tool.sh, the non-test path without BUILD_TAG assumes a functional git repo and tags; you might want a more explicit error (e.g., detecting missing .git or no tags) to avoid opaque `git describe failed` messages when the script is used in reduced build contexts.
- The BUILD_TAG parsing for GitShortSha in go-tool.sh assumes a `...g<sha>` suffix and silently falls back to an empty SHA if the format changes; consider either validating the expected format with a clearer failure or logging that GitShortSha will be empty to make debugging version issues easier.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
suggestion from @janisz : try smaller PRs: 1. static version for tests. 2. goembed with the existing files |
|
Closing — this was an earlier approach (zversion.go) to stabilize build ldflags. Superseded by the discovery that Go doesn't cache linked binaries in GOCACHE for The only cache benefit from stable ldflags is for test caching (Phase 1 = test binary identity includes link ActionID). This is handled by #19617 with hardcoded test-only ldflags. See #19643 closing comment for full details. |
Problem:
-Xldflags break Go's build and test cacheGo's build cache keys every action (compile, link, test) on an ActionID — a hash of all inputs that affect the output. The link ActionID includes the full
-ldflagsstring (exec.go:linkActionID()— seefmt.Fprintf(h, "link\n")...fmt.Fprintf(h, "ldflags %q\n", ldflags)).The old
status.sh+//XDef:mechanism injected version data via-Xldflags:Since
MainVersionandGitShortShachange on every commit, the ldflags string changes, which changes the link ActionID. This has two effects:Build cache miss at link step: every binary re-links on every commit, even if no source code changed. Go's compile cache uses dependency contentID (output hash) rather than actionID, so compile steps can still cache-hit. But the link step always re-runs because its ActionID includes the changed ldflags. On CI, this means ~100s spent re-linking binaries that haven't actually changed.
Test cache miss at Phase 1: Go's test result cache computes a test binary identity from the test binary's build ActionID, which includes the link ActionID (
test.go:builderTest()— the test binary hash flows throughpackageActionID()→linkActionID()). When the link ActionID changes, the test binary identity changes, and Go discards all cached test results — every test re-runs from scratch.Fix: generated
zversion.goinstead of-XldflagsReplace
-Xldflags with a generatedpkg/version/internal/zversion.gofile that sets version data viainit(). This eliminates-Xfrom the link step entirely, making ldflags stable across commits (-s -wonly).For tests,
go-tool.shwrites only the base tag (e.g."4.11.x") — stable across commits on the same release branch. For builds, it writes the fullgit describeoutput (e.g."4.11.x-431-gbbcc7d3be2").The generated
zversion.gois a source file, so it flows through the compile ActionID (content-hashed). When its content doesn't change (test mode), the compile output is byte-identical, giving the same contentID, so all 4400+ downstream packages get cache hits at both compile and link. Only 2 packages recompile (the version packages themselves, whose output is byte-identical anyway). Result: build compilation step drops from 100s → 27s (3.7x), and this improvement works independently of the test cache fix.Companion PRs
-buildvcs(split out from this PR).Test caching requires both Phase 1 (this PR) and Phase 2 (#19395). The build cache improvement from this PR works on its own.
Prior art: PR #15609
PR #15609 was closed due to two blockers (comment). Both are now resolved:
-coverprofilebroke test caching (Go <1.25): fixed in Go 1.25 (golang/go#23565). We are on Go 1.25.5.Measured results (both PRs combined)
Build compilation step (this PR alone): 100s → 27s (3.7x), with 99.95% cache hit rate (2/4403 packages recompile).
Testing and quality
Automated testing
How I validated my change
go-build.shgenerates full version,go-test.shgenerates stable base tagBUILD_TAGmechanism