Skip to content

perf(ci): remove container from go, go-postgres, go-bench jobs#19743

Open
davdhacs wants to merge 15 commits intomasterfrom
davdhacs/remove-container-go-tests
Open

perf(ci): remove container from go, go-postgres, go-bench jobs#19743
davdhacs wants to merge 15 commits intomasterfrom
davdhacs/remove-container-go-tests

Conversation

@davdhacs
Copy link
Copy Markdown
Contributor

@davdhacs davdhacs commented Apr 1, 2026

Description

Run Go unittest jobs directly on GHA runner instead of in the apollo-ci container image. Saves ~55s container init overhead per job.

Changes per job:

  • go: remove container, add setup-go + GOTOOLCHAIN=auto
  • go-postgres: remove container, switch from su postgres to docker run --network=host postgres, add readiness check
  • go-bench: same as go-postgres, add matrix.pg for consistency

Postgres config:

  • POSTGRES_HOST_AUTH_METHOD=trust — passwordless TCP (replaces container's unix socket peer auth)
  • POSTGRES_INITDB_ARGS="--locale=C" — byte-order collation required by tests (e.g. central/splunk ID comparisons)
  • --network=host — binds directly to localhost, no port mapping needed. Postgres container EXPOSE's the default postgres port.

Timing comparison

Job Master (container, cached) PR cold PR warm (cached)
go ("") 8.6m 43m 6.7m
go (release) 8.2m 45m 6.8m
go-postgres ("", 15) 4.2m 34m 1.8m
go-postgres (release, 15) 3.5m 33m 2.1m
go-bench (15) 16.9m 20m 15.8m

Testing and quality

How I validated my change

CI validation with cold and warm cache runs (linked above). All 5 Go test jobs pass. Warm cached performance matches or exceeds master.

davdhacs and others added 5 commits March 31, 2026 10:06
Fix tests and scripts to work on ubuntu-latest host runners (no
container) in addition to CI container environments.

- rate_limited_logger.go: add /home/runner/work/ path prefix
  (container used /__w/)
- download_zip_test.go: accept ErrPermission for /root/ paths
  (non-root runner returns permission denied, not not-found)
- detection_test.go: delete circular test
- lib.sh: escape & in bash 5.2+ replacements (verified 4.4-5.3)
- helpers.bash: add yq_multidoc helper (yq 4.x adds --- separators)
- roxctl-netpol-generate-*.bats: use yq_multidoc

All changes are backward-compatible with container environments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
grep -v returns exit code 1 when no lines match (e.g. yq output is
only --- separators), which would look like a yq failure. sed always
returns 0 regardless of whether lines were deleted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run Go test jobs directly on ubuntu-latest instead of in the CI
container image. This saves ~55s container init overhead per job
and enables faster cache I/O (no overlay filesystem).

Changes per job:
- Remove container: block
- Add actions/setup-go with GOTOOLCHAIN=auto override
- Replace su postgres with docker run postgres
- Add pg_isready retry loop (docker postgres needs warmup time)
- go-bench: skip on PRs (if: github.event_name == 'push')

Depends on #19712 (host-runner test fixes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Apr 1, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The docker run ... --name postgres in both go-postgres and go-bench jobs never stops/removes the container explicitly; consider adding a cleanup step (e.g., docker stop postgres || true) to avoid name/port conflicts on reused runners or when debugging failures.
  • The Postgres setup and readiness loop are duplicated between go-postgres and go-bench; consider extracting this into a reusable action or composite step to keep the workflow DRY and make future changes to the DB bootstrap logic less error-prone.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `docker run ... --name postgres` in both `go-postgres` and `go-bench` jobs never stops/removes the container explicitly; consider adding a cleanup step (e.g., `docker stop postgres || true`) to avoid name/port conflicts on reused runners or when debugging failures.
- The Postgres setup and readiness loop are duplicated between `go-postgres` and `go-bench`; consider extracting this into a reusable action or composite step to keep the workflow DRY and make future changes to the DB bootstrap logic less error-prone.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

davdhacs added a commit that referenced this pull request Apr 1, 2026
Split the monolithic go job into 5 shards (pkg-helm, pkg-other,
central-1, central-2, rest) and go-postgres into 2 shards
(main, migrator) for parallel execution.

Also split integration and operator tests into separate jobs
(go-integration, go-operator-integration) so they run in parallel
with unit tests instead of sequentially after them.

Depends on #19743 (container removal) and #19742 (SKIP_DEPS +
postgres subtargets).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@rhacs-bot
Copy link
Copy Markdown
Contributor

rhacs-bot commented Apr 1, 2026

Images are ready for the commit at 48fa9ab.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-533-g48fa9abf75.

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.60%. Comparing base (39d15cc) to head (7442774).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #19743   +/-   ##
=======================================
  Coverage   49.60%   49.60%           
=======================================
  Files        2766     2766           
  Lines      208567   208567           
=======================================
+ Hits       103454   103462    +8     
+ Misses      97436    97431    -5     
+ Partials     7677     7674    -3     
Flag Coverage Δ
go-unit-tests 49.60% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@davdhacs davdhacs changed the base branch from davdhacs/host-runner-test-fixes to master April 1, 2026 19:58
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 1, 2026

📝 Walkthrough

Walkthrough

CI workflow switched from container-based execution to ubuntu-latest with setup-go action and Docker containers for Postgres. Test removal for container detection, file path trimming update for GitHub Actions, relaxed error validation in path-traversal test, Bash XML escaping fix, and new YAML multi-document parsing helper with corresponding test assertion updates.

Changes

Cohort / File(s) Summary
CI Workflow Configuration
.github/workflows/unit-tests.yaml
Removed container setup from go, go-postgres, and go-bench jobs; added actions/setup-go action with cache: false and GOTOOLCHAIN=auto environment override. Replaced local Postgres initialization with Docker container launching (docker.io/library/postgres), updated readiness checks to use docker exec polling, and added execution condition for go-bench to run only on push events.
Test Removal
pkg/containers/detection_test.go
Removed entire TestContainerDetection function that validated IsRunningInContainer() behavior against GITHUB_ACTIONS environment variable.
Logging Path Handling
pkg/logging/rate_limited_logger.go
Updated getTrimmedFilePath to recognize and trim GitHub Actions workspace prefix (/home/runner/work/stackrox/stackrox/) in addition to existing path prefixes, affecting rate-limit key derivation.
Test Validation
roxctl/common/zipdownload/download_zip_test.go
Modified TestExtractZipToFolder_PreventPathTraversal to accept both fs.ErrNotExist and fs.ErrPermission as valid outcomes when validating malicious files were not written, replacing strict ErrNotExist assertion.
Script XML Escaping
scripts/ci/lib.sh
Fixed JUnit XML description escaping in _save_junit_record by using escaped replacement strings (\&amp;, \&lt;, etc.) in Bash pattern substitution for proper XML entity generation.
Test Helpers & Assertions
tests/roxctl/bats-tests/helpers.bash, tests/roxctl/bats-tests/local/roxctl-netpol-generate-*.bats
Added new yq_multidoc() helper function to remove YAML document separators (^---$). Replaced yq e with yq_multidoc e invocations in roxctl-netpol-generate-development.bats and roxctl-netpol-generate-release.bats for multi-document YAML parsing; removed obsolete commented-out yq v4-specific assertions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: removing container configuration from three CI jobs (go, go-postgres, go-bench) in favor of running directly on ubuntu-latest.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description clearly explains the changes, timing improvements, and testing validation, meeting template requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch davdhacs/remove-container-go-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

davdhacs and others added 2 commits April 1, 2026 15:58
The go-bench PR skip is handled separately in #19758.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tainer-go-tests

# Conflicts:
#	.github/workflows/unit-tests.yaml
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 8, 2026

🚀 Build Images Ready

Images are ready for commit 7442774. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.11.x-616-g7442774484

davdhacs added 6 commits April 8, 2026 11:11
Use --network=host instead of -p 5432:5432, drop --locale=C to match
production en_US.utf8 default, add comments explaining trust auth, and
use matrix.pg consistently for the postgres version.
TestViolations in central/splunk depends on C-locale byte-order
collation for alert ID comparisons. The en_US.utf8 default from the
official postgres image sorts uppercase differently, breaking the test.
The container removal changes the GOCACHE path from /github/home/ to
/home/runner/, which actions/cache treats as a different cache version.
Force-saving on this branch seeds a cache with the new path so we can
validate warm-run performance. Revert before merge.
@davdhacs davdhacs marked this pull request as ready for review April 8, 2026 22:25
@davdhacs davdhacs requested a review from a team as a code owner April 8, 2026 22:25
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The Postgres readiness step runs a single pg_isready with --timeout=60; consider a short retry loop (e.g. for i in {1..10}; do pg_isready ...) to make the workflow more robust against slow container startups rather than failing the entire job on a single check.
  • Both go-postgres and go-bench jobs use a fixed --name postgres for the container; if you ever run multiple Postgres instances in the same job or reuse this pattern elsewhere, it may be safer to suffix the name with the matrix values or ${{ github.job }} to avoid name collisions.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The Postgres readiness step runs a single `pg_isready` with `--timeout=60`; consider a short retry loop (e.g. `for i in {1..10}; do pg_isready ...`) to make the workflow more robust against slow container startups rather than failing the entire job on a single check.
- Both go-postgres and go-bench jobs use a fixed `--name postgres` for the container; if you ever run multiple Postgres instances in the same job or reuse this pattern elsewhere, it may be safer to suffix the name with the matrix values or `${{ github.job }}` to avoid name collisions.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

go-bench:
strategy:
matrix:
pg: [ '15' ]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add matrix here then we need to change required jobs. Let's skip it for now and hardcode 15 as any failures should be caught in unit tests, here we only validate if bench tests are runable.

docker run --rm -d --name postgres --network=host \
-e POSTGRES_HOST_AUTH_METHOD=trust \
-e POSTGRES_INITDB_ARGS="--locale=C" \
docker.io/library/postgres:${{ matrix.pg }}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use our own image? Or maybe the db we are using in upstream?

FROM quay.io/sclorg/postgresql-${PG_VERSION}-c9s:latest AS final

The problem with docker.io is that we need to login first to prevent rate-limits. Al

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants