ci(experiment): CPU/cache diagnostics matrix by davdhacs · Pull Request #21044 · stackrox/stackrox

davdhacs · 2026-06-09T17:37:10Z

Description

Experiment to investigate whether Intel Xeon vs AMD EPYC GHA runners produce
different Go compile actionIDs, causing test cache misses when the GOCACHE is
shared across runner types. We observed identical code, identical cache keys,
but 11% vs 94% test cache hit rate depending on which CPU the runner had.

Scales the go job to 10 matrix copies (GOTAGS="" only) to sample runner
hardware diversity. Each copy logs CPU model, GOCACHE size, test cache hit
rate, and compile actionIDs for canary packages to the Step Summary.
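The test cache hit rate mentioned above can be derived from `go test` output, which marks a reused result with `(cached)` in place of the elapsed time. The following is a minimal sketch of that calculation, not the step this PR actually adds; the function name and the sample package paths are hypothetical, and it counts only passing packages (`ok` lines).

```python
import re

# Matches per-package result lines printed by `go test`, for example:
#   ok  	example.com/pkg/set	(cached)
#   ok  	example.com/pkg/set	0.012s
_RESULT = re.compile(r"^ok\s+(\S+)\s+(\(cached\)|[\d.]+s)")


def cache_hit_rate(go_test_output):
    """Return the fraction of passing packages whose result came from the test cache."""
    cached = 0
    total = 0
    for line in go_test_output.splitlines():
        match = _RESULT.match(line)
        if not match:
            continue  # skips "?  pkg [no test files]", FAIL lines, and test logs
        total += 1
        if match.group(2) == "(cached)":
            cached += 1
    return cached / total if total else 0.0
```

Feeding the captured output of the main test run to `cache_hit_rate` gives a single number per matrix copy that can be written to the Step Summary.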

Not intended to merge. Experiment only.

User-facing documentation

CHANGELOG.md is updated OR update is not needed

Testing and quality

CI results are inspected

Automated testing

modified existing tests

How I validated my change

Experiment PR — validation is the CI run output itself. Will compare Step
Summaries across the 10 copies to correlate CPU model with cache hit rate
and compile actionID differences.
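One way to do the comparison described above is to group the per-copy hit rates by CPU model and average them. This is only a sketch of that bookkeeping under the assumption that each matrix copy yields one `(cpu_model, hit_rate)` pair; the function name is hypothetical and no real measurements are shown.

```python
from collections import defaultdict


def hit_rate_by_cpu(samples):
    """Average the hit rate per CPU model.

    samples: iterable of (cpu_model, hit_rate) pairs, one per matrix copy.
    """
    grouped = defaultdict(list)
    for cpu_model, hit_rate in samples:
        grouped[cpu_model].append(hit_rate)
    return {cpu: sum(rates) / len(rates) for cpu, rates in grouped.items()}
```

A large gap between the averages for two CPU models would support the hypothesis; similar averages would point to a cause unrelated to the runner hardware.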

Investigate whether Intel Xeon vs AMD EPYC runners produce different Go compile actionIDs, causing test cache misses across runner types.

Changes to the go job in unit-tests.yaml:
- Scale to 10 matrix copies (GOTAGS="" only) to sample runner hardware
- Add post-test diagnostics step logging: CPU model, GOCACHE size, test cache hit rate, and compile actionIDs for canary packages
- Strip non-essential steps (codecov, junit2jira, operator/integration) to keep experiment focused and fast

Results will appear in each job's Step Summary for easy comparison.

Prompt: set up a branch for collecting data on build time, cpuinfo, compilation cache, and used cached test results with a matrix dimension to sample CPU types

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
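For the CPU model part of the diagnostics, a Linux runner exposes the model string in the `model name` field of `/proc/cpuinfo`. The sketch below shows one way to extract it; it is an illustration, not the step in this PR, and the function name is hypothetical.

```python
def cpu_model(cpuinfo_text):
    """Return the first "model name" value from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return "unknown"  # field absent, e.g. on some non-x86 platforms
```

In a workflow step, the text would come from reading `/proc/cpuinfo`, and the result could be appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable.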

openshift-ci · 2026-06-09T17:37:14Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2026-06-09T17:37:21Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


github-actions · 2026-06-09T17:45:17Z

🚀 Build Images Ready

Images are ready for commit 8bf0f81. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.12.x-134-g8bf0f814cb


Runs 3 test packages (pkg/set, central/cluster/util, central/notifiers/slack) with GODEBUG=gocachetest=1 on copy 1 before the main test run. The GODEBUG output goes to the Step Summary, showing the Phase 1 vs Phase 2 miss reason. Also removes -trimpath from the main test run (back to matching CI behavior), since the goal is to diagnose the current 78% hit rate, not to test a fix.

The previous fix only stabilized BUILD_TAG and SHORTCOMMIT, leaving CollectorVersion, FactVersion, and ScannerVersion changing per commit. These unstable ldflags propagate through go-test.sh → status.sh → -X flags → link actionID → test binary ID, invalidating the test cache for every package that transitively depends on pkg/version/internal. This was the root cause of the 78% cache hit rate on stale branches (vs 96% on fresh branches, where the versions happen to match the cache).

Fix: add env var overrides in status.sh for STABLE_COLLECTOR_VERSION, STABLE_FACT_VERSION, and STABLE_SCANNER_VERSION, set to 0.0.0 in the unit-tests workflow. Normal builds (without the env vars) are unaffected.

Expected result: the cache hit rate should match the 96% baseline regardless of how far behind master the branch is.

Partially generated by AI.


davdhacs added the ai-assisted label Jun 9, 2026

openshift-ci Bot added the do-not-merge/work-in-progress label Jun 9, 2026

github-actions Bot added area/ci ai-review coderabbit-review labels Jun 9, 2026

davdhacs added 8 commits June 9, 2026 12:01

ci: trigger second experiment run for more CPU sampling

f524e03

Merge remote-tracking branch 'origin/master' into davdhacs/cpu-cache-experiment

1468eb1

ci: trigger run 3 — branch now merged with latest master

6c993e7

ci(experiment): add -trimpath to test cache hit rate

e9631e7

ci: trigger second run to test -trimpath cache hits

756cdd4

ci: trigger run 2 — cache should now have stabilized ldflags from previous run

7eea672

davdhacs force-pushed the davdhacs/cpu-cache-experiment branch from 1448025 to 7eea672 on June 10, 2026 03:48

davdhacs added 3 commits June 9, 2026 22:20

ci: trigger follow-up — should hit cache saved with stabilized 0.0.0 ldflags

096c175

ci(experiment): single copy to save full cache with stabilized ldflags

71ef842

ci: follow-up — confirm stabilized ldflag cache hit rate

8bf0f81

davdhacs wants to merge 12 commits into master from davdhacs/cpu-cache-experiment
