perf(ci): remove container from go, go-postgres, go-bench jobs#19743
perf(ci): remove container from go, go-postgres, go-bench jobs#19743
Conversation
Fix tests and scripts to work on ubuntu-latest host runners (no container) in addition to CI container environments. - rate_limited_logger.go: add /home/runner/work/ path prefix (container used /__w/) - download_zip_test.go: accept ErrPermission for /root/ paths (non-root runner returns permission denied, not not-found) - detection_test.go: delete circular test - lib.sh: escape & in bash 5.2+ replacements (verified 4.4-5.3) - helpers.bash: add yq_multidoc helper (yq 4.x adds --- separators) - roxctl-netpol-generate-*.bats: use yq_multidoc All changes are backward-compatible with container environments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
grep -v returns exit code 1 when no lines match (e.g. yq output is only --- separators), which would look like a yq failure. sed always returns 0 regardless of whether lines were deleted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run Go test jobs directly on ubuntu-latest instead of in the CI container image. This saves ~55s container init overhead per job and enables faster cache I/O (no overlay filesystem). Changes per job: - Remove container: block - Add actions/setup-go with GOTOOLCHAIN=auto override - Replace su postgres with docker run postgres - Add pg_isready retry loop (docker postgres needs warmup time) - go-bench: skip on PRs (if: github.event_name == 'push') Depends on #19712 (host-runner test fixes). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The
docker run ... --name postgresin bothgo-postgresandgo-benchjobs never stops/removes the container explicitly; consider adding a cleanup step (e.g.,docker stop postgres || true) to avoid name/port conflicts on reused runners or when debugging failures. - The Postgres setup and readiness loop are duplicated between
go-postgresandgo-bench; consider extracting this into a reusable action or composite step to keep the workflow DRY and make future changes to the DB bootstrap logic less error-prone.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `docker run ... --name postgres` in both `go-postgres` and `go-bench` jobs never stops/removes the container explicitly; consider adding a cleanup step (e.g., `docker stop postgres || true`) to avoid name/port conflicts on reused runners or when debugging failures.
- The Postgres setup and readiness loop are duplicated between `go-postgres` and `go-bench`; consider extracting this into a reusable action or composite step to keep the workflow DRY and make future changes to the DB bootstrap logic less error-prone.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Split the monolithic go job into 5 shards (pkg-helm, pkg-other, central-1, central-2, rest) and go-postgres into 2 shards (main, migrator) for parallel execution. Also split integration and operator tests into separate jobs (go-integration, go-operator-integration) so they run in parallel with unit tests instead of sequentially after them. Depends on #19743 (container removal) and #19742 (SKIP_DEPS + postgres subtargets). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Images are ready for the commit at 48fa9ab. To use with deploy scripts, first |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19743 +/- ##
=======================================
Coverage 49.60% 49.60%
=======================================
Files 2766 2766
Lines 208567 208567
=======================================
+ Hits 103454 103462 +8
+ Misses 97436 97431 -5
+ Partials 7677 7674 -3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
📝 WalkthroughWalkthroughCI workflow switched from container-based execution to ubuntu-latest with setup-go action and Docker containers for Postgres. Test removal for container detection, file path trimming update for GitHub Actions, relaxed error validation in path-traversal test, Bash XML escaping fix, and new YAML multi-document parsing helper with corresponding test assertion updates. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
The go-bench PR skip is handled separately in #19758. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tainer-go-tests # Conflicts: # .github/workflows/unit-tests.yaml
🚀 Build Images ReadyImages are ready for commit 7442774. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-616-g7442774484 |
Use --network=host instead of -p 5432:5432, drop --locale=C to match production en_US.utf8 default, add comments explaining trust auth, and use matrix.pg consistently for the postgres version.
TestViolations in central/splunk depends on C-locale byte-order collation for alert ID comparisons. The en_US.utf8 default from the official postgres image sorts uppercase differently, breaking the test.
The container removal changes the GOCACHE path from /github/home/ to /home/runner/, which actions/cache treats as a different cache version. Force-saving on this branch seeds a cache with the new path so we can validate warm-run performance. Revert before merge.
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The Postgres readiness step runs a single
pg_isreadywith--timeout=60; consider a short retry loop (e.g.for i in {1..10}; do pg_isready ...) to make the workflow more robust against slow container startups rather than failing the entire job on a single check. - Both go-postgres and go-bench jobs use a fixed
--name postgresfor the container; if you ever run multiple Postgres instances in the same job or reuse this pattern elsewhere, it may be safer to suffix the name with the matrix values or${{ github.job }}to avoid name collisions.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The Postgres readiness step runs a single `pg_isready` with `--timeout=60`; consider a short retry loop (e.g. `for i in {1..10}; do pg_isready ...`) to make the workflow more robust against slow container startups rather than failing the entire job on a single check.
- Both go-postgres and go-bench jobs use a fixed `--name postgres` for the container; if you ever run multiple Postgres instances in the same job or reuse this pattern elsewhere, it may be safer to suffix the name with the matrix values or `${{ github.job }}` to avoid name collisions.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| go-bench: | ||
| strategy: | ||
| matrix: | ||
| pg: [ '15' ] |
There was a problem hiding this comment.
If we add matrix here then we need to change required jobs. Let's skip it for now and hardcode 15 as any failures should be caught in unit tests, here we only validate if bench tests are runable.
| docker run --rm -d --name postgres --network=host \ | ||
| -e POSTGRES_HOST_AUTH_METHOD=trust \ | ||
| -e POSTGRES_INITDB_ARGS="--locale=C" \ | ||
| docker.io/library/postgres:${{ matrix.pg }} |
There was a problem hiding this comment.
can we use our own image? Or maybe the db we are using in upstream?
stackrox/image/postgres/Dockerfile
Line 2 in 2948960
The problem with docker.io is that we need to login first to prevent rate-limits. Al
Description
Run Go unittest jobs directly on GHA runner instead of in the
apollo-cicontainer image. Saves ~55s container init overhead per job.Changes per job:
setup-go+GOTOOLCHAIN=autosu postgrestodocker run --network=host postgres, add readiness checkmatrix.pgfor consistencyPostgres config:
POSTGRES_HOST_AUTH_METHOD=trust— passwordless TCP (replaces container's unix socket peer auth)POSTGRES_INITDB_ARGS="--locale=C"— byte-order collation required by tests (e.g.central/splunkID comparisons)--network=host— binds directly to localhost, no port mapping needed. Postgres container EXPOSE's the default postgres port.Timing comparison
Testing and quality
How I validated my change
CI validation with cold and warm cache runs (linked above). All 5 Go test jobs pass. Warm cached performance matches or exceeds master.