ci: add background disk usage monitor to job-preamble by davdhacs · Pull Request #19397 · stackrox/stackrox

davdhacs · 2026-03-12T16:16:48Z

Description

Adds a background disk usage monitor to the job-preamble action. A subshell polls df every 30 seconds and appends available disk space to /dev/shm/disk-monitor.log (RAM-backed tmpfs, survives disk full). The existing record_job_info post-step dumps the log and kills the monitor.

Example output:

17:59:00  92GB
17:59:30  92GB
18:00:00  92GB
18:00:30  92GB

No nohup or gacts/run-and-post-run needed — regular composite action steps don't wait for background children.
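A minimal sketch of the monitor described above, assuming GNU `df` (for `--output` and `-BGB`). The function names, argument order, and default interval are illustrative assumptions, not the PR's actual code:

```bash
# Hypothetical helpers; names and arguments are assumptions for illustration.

# Append one "HH:MM:SS  <avail>" line for the filesystem holding $2 to log $1.
log_disk_sample() {
  local logfile="$1" target="${2:-/}" avail
  # GNU df: -BGB reports sizes in units of 10^9 bytes with a "GB" suffix.
  avail=$(df --output=avail -BGB "$target" 2>/dev/null | tail -n 1 | tr -d ' ')
  printf '%s  %s\n' "$(date +%H:%M:%S)" "$avail" >> "$logfile"
}

# Start the polling loop in a backgrounded subshell and record its PID.
start_disk_monitor() {
  local logfile="$1" pidfile="$2" interval="${3:-30}"
  (
    while true; do
      log_disk_sample "$logfile"
      sleep "$interval"
    done
  ) >/dev/null 2>&1 &
  echo "$!" > "$pidfile"
}
```

With these helpers, a step could call `start_disk_monitor /dev/shm/disk-monitor.log /dev/shm/disk-monitor.pid 30` and return immediately, leaving the subshell running until the post-step kills the recorded PID.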

User-facing documentation

CHANGELOG.md is updated OR update is not needed
documentation PR is created and is linked above OR is not needed

Testing and quality

the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
CI results are inspected

Automated testing

How I validated my change

Verified jobs complete without blocking (subshell backgrounding works)
Verified post-step outputs disk usage timeline in step logs
Verified monitor keeps logging under disk pressure (tested with fill-to-2GB workflow)
Verified shellcheck passes

Generated with Claude Code

openshift-ci · 2026-03-12T16:16:53Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

The parsing of avail values assumes a strictly numeric prefix (sed -n 's/.*avail=\([0-9]*\).*/\1/p'), but df -BGB typically emits values with unit suffixes (e.g. 13G/13GB); consider normalizing by stripping all non-digits (e.g. tr -dc '0-9') once and reusing that to avoid brittle parsing.
When first_avail < min_avail (e.g. if disk is freed over time), peak_consumed=$((first_avail - min_avail)) becomes negative; it may be clearer to clamp this to zero or take the absolute/maximum difference to avoid confusing negative "consumption" values in the summary.
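The two suggestions above could be sketched roughly as follows. The helper names `normalize_gb` and `peak_consumed_gb` are hypothetical and do not appear in the PR; the sketch also assumes whole-number values such as `13GB`, since stripping non-digits from a fractional value like `1.5G` would give a misleading `15`:

```bash
# Hypothetical helpers illustrating the review feedback; not the PR's code.

# Strip everything except digits, so "13G", "13GB" and "13" all yield "13".
normalize_gb() {
  printf '%s' "$1" | tr -dc '0-9'
}

# Space consumed between the first sample and the minimum, clamped at zero
# so a disk that only gained free space never reports negative consumption.
peak_consumed_gb() {
  local first="$1" min="$2"
  local diff=$(( first - min ))
  if [ "$diff" -lt 0 ]; then
    diff=0
  fi
  echo "$diff"
}
```

For example, `peak_consumed_gb "$(normalize_gb 92GB)" "$(normalize_gb 80GB)"` prints `12`, while swapping the two arguments prints `0` rather than `-12`.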

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The parsing of `avail` values assumes a strictly numeric prefix (`sed -n 's/.*avail=\([0-9]*\).*/\1/p'`), but `df -BGB` typically emits values with unit suffixes (e.g. `13G`/`13GB`); consider normalizing by stripping all non-digits (e.g. `tr -dc '0-9'`) once and reusing that to avoid brittle parsing.
- When `first_avail < min_avail` (e.g. if disk is freed over time), `peak_consumed=$((first_avail - min_avail))` becomes negative; it may be clearer to clamp this to zero or take the absolute/maximum difference to avoid confusing negative "consumption" values in the summary.

## Individual Comments

### Comment 1
<location path=".github/actions/job-preamble/action.yaml" line_range="146-148" />
<code_context>
+      with:
+        shell: bash
+        run: |
+          LOGFILE="/tmp/disk-usage-monitor.log"
+          PIDFILE="/tmp/disk-usage-monitor.pid"
+
+          # Record initial snapshot
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use per-job unique paths for LOGFILE/PIDFILE to avoid cross-job interference on shared runners

On self-hosted or reused runners, multiple jobs can share /tmp and overlap in time. Fixed LOGFILE/PIDFILE names can cause jobs to interfere with each other (e.g., killing another job’s monitor or overwriting its log). Please derive these paths from something unique like $GITHUB_RUN_ID, $GITHUB_JOB, and/or $$ (e.g., /tmp/disk-usage-monitor-${GITHUB_RUN_ID}-${GITHUB_JOB}.log).

```suggestion
        run: |
          # Use per-job unique paths to avoid cross-job interference on shared runners
          LOGFILE="/tmp/disk-usage-monitor-${GITHUB_RUN_ID:-unknown-run}-${GITHUB_JOB:-unknown-job}-$$.log"
          PIDFILE="/tmp/disk-usage-monitor-${GITHUB_RUN_ID:-unknown-run}-${GITHUB_JOB:-unknown-job}-$$.pid"
```
</issue_to_address>

### Comment 2
<location path=".github/actions/job-preamble/action.yaml" line_range="203-210" />
<code_context>
+          min_ts=""
+          first_avail=""
+          last_avail=""
+          for entry in "${entries[@]}"; do
+            avail_val=$(echo "$entry" | sed -n 's/.*avail=\([0-9]*\).*/\1/p')
+            entry_ts=$(echo "$entry" | cut -d' ' -f1)
+            if [[ -z "$first_avail" ]]; then
+              first_avail="$avail_val"
+            fi
+            last_avail="$avail_val"
+            if [[ "$avail_val" -lt "$min_avail" ]]; then
+              min_avail="$avail_val"
+              min_ts="$entry_ts"
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against empty or unparsable avail values before doing integer comparisons

If df or the log format ever produce a non-integer or missing avail value, sed will leave avail_val empty and [[ "$avail_val" -lt "$min_avail" ]] will raise a non-integer operand error, which can break this step. Consider validating avail_val with something like [[ "$avail_val" =~ ^[0-9]+$ ]] and skipping or handling entries that don’t match before assigning first_avail/last_avail or doing -lt comparisons.
</issue_to_address>
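
One way the guard from Comment 2 might look, as a sketch only. The log line format (`<timestamp> avail=<integer>...`) is inferred from the quoted snippet, and `summarize_min_avail` is a hypothetical name:

```bash
# Hypothetical summary helper; the log format is an assumption based on the
# snippet quoted in Comment 2.
summarize_min_avail() {
  local logfile="$1" line avail_val entry_ts
  local min_avail="" min_ts=""
  while IFS= read -r line; do
    avail_val=$(printf '%s\n' "$line" | sed -n 's/.*avail=\([0-9]*\).*/\1/p')
    # Skip entries with a missing or non-integer value so -lt never sees them.
    [[ "$avail_val" =~ ^[0-9]+$ ]] || continue
    entry_ts=${line%% *}
    if [[ -z "$min_avail" || "$avail_val" -lt "$min_avail" ]]; then
      min_avail="$avail_val"
      min_ts="$entry_ts"
    fi
  done < "$logfile"
  if [[ -n "$min_avail" ]]; then
    echo "min=${min_avail} at ${min_ts}"
  fi
}
```

Lines that lack a parsable value are ignored, so a malformed entry reduces the sample count instead of aborting the step.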


rhacs-bot · 2026-03-12T16:50:22Z

Images are ready for the commit at dad2a76.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-529-gdad2a76a91.

Adds a background df poll (every 30s) that logs available disk space to /dev/shm (RAM-backed tmpfs, survives disk full). The existing record_job_info post-step kills the monitor and dumps the log. Uses a plain subshell & in a regular step (no gacts/run-and-post-run needed since regular composite action steps don't wait for background children). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-03-12T18:29:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 49.66%. Comparing base (0ca1671) to head (ed7615d).

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #19397   +/-   ##
=======================================
  Coverage   49.66%   49.66%           
=======================================
  Files        2748     2748           
  Lines      207354   207354           
=======================================
+ Hits       102987   102990    +3     
+ Misses      96711    96709    -2     
+ Partials     7656     7655    -1

| Flag | Coverage Δ |
|---|---|
| go-unit-tests | `49.66% <ø> (+<0.01%)` ⬆️ |


coderabbitai · 2026-03-31T15:33:24Z

📝 Walkthrough

Summary by CodeRabbit

Chores
- Added disk space monitoring to CI/CD workflows, which periodically records free-disk metrics during job execution and generates a summary report upon completion.

Walkthrough

A GitHub Actions composite action is modified to add disk usage monitoring that periodically captures free-disk metrics throughout job execution, logs the collected data, and cleans up the monitoring process upon completion.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Disk Monitoring Addition `.github/actions/job-preamble/action.yaml` | Introduces background disk monitoring step that runs a periodic loop writing timestamped filesystem free-space values to a log file, records the process PID, and outputs/terminates the monitor in post-run cleanup. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a background disk usage monitor to the job-preamble CI action. |
| Description check | ✅ Passed | The description comprehensively covers the change with implementation details, example output, and validation steps; follows the template structure with appropriate sections filled out. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/actions/job-preamble/action.yaml:
- Around line 175-176: Post-cleanup currently assumes /dev/shm/disk-monitor.log
and /dev/shm/disk-monitor.pid always exist and that the PID is valid; make the
steps defensive by first checking that /dev/shm/disk-monitor.log exists before
attempting to cat it, and for the PID check that /dev/shm/disk-monitor.pid
exists and contains a readable PID and that the process is alive (e.g., kill -0
or equivalent) before calling kill, and ensure any failure paths are swallowed
(|| true) so post-run cannot fail if the files or process are absent.
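
The defensive cleanup requested above might be sketched like this. `stop_disk_monitor` is a hypothetical name, and the paths are passed as arguments rather than hard-coded; this is one possible shape, not the fix that was committed:

```bash
# Hypothetical post-step helper; always returns 0 so the post-run cannot fail.
stop_disk_monitor() {
  local logfile="$1" pidfile="$2" pid=""
  # Dump the log only if it exists.
  if [[ -f "$logfile" ]]; then
    cat "$logfile" || true
  fi
  # Kill the monitor only if the PID file holds a numeric PID of a live process.
  if [[ -r "$pidfile" ]]; then
    pid=$(cat "$pidfile" 2>/dev/null || true)
    if [[ "$pid" =~ ^[0-9]+$ ]] && kill -0 "$pid" 2>/dev/null; then
      kill "$pid" 2>/dev/null || true
    fi
  fi
  return 0
}
```

Called as `stop_disk_monitor /dev/shm/disk-monitor.log /dev/shm/disk-monitor.pid`, it prints whatever was logged and exits cleanly even when the monitor never started.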


ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: d145e9ee-5c0b-4be8-bc17-014d878aecd0

📥 Commits

Reviewing files that changed from the base of the PR and between 44d74f3 and 37b09c8.

📒 Files selected for processing (1)

.github/actions/job-preamble/action.yaml


openshift-ci bot added the do-not-merge/work-in-progress label Mar 12, 2026

github-actions bot added area/ci ai-review labels Mar 12, 2026

sourcery-ai bot reviewed Mar 12, 2026


davdhacs force-pushed the davdhacs/ci-disk-usage-monitor branch 2 times, most recently from 911a901 to 9f87317 Compare March 12, 2026 16:44

davdhacs force-pushed the davdhacs/ci-disk-usage-monitor branch 6 times, most recently from e42a308 to 8f13dab Compare March 12, 2026 17:54

davdhacs force-pushed the davdhacs/ci-disk-usage-monitor branch from 8f13dab to 71e7ef3 Compare March 12, 2026 17:57

davdhacs mentioned this pull request Mar 12, 2026

fix(ci): increase operator build free-disk-space to 40GB #19396

Merged

9 tasks

Merge remote-tracking branch 'origin/master' into davdhacs/ci-disk-us…

37b09c8

…age-monitor

github-actions bot added the coderabbit-review label Mar 31, 2026

coderabbitai bot reviewed Mar 31, 2026


davdhacs added 3 commits March 31, 2026 10:13

no errors from post record job info

27d31df

Merge remote-tracking branch 'origin/master' into davdhacs/ci-disk-us…

ed7615d

…age-monitor

Merge remote-tracking branch 'origin/master' into davdhacs/ci-disk-us…

dad2a76

…age-monitor

davdhacs wants to merge 5 commits into master from davdhacs/ci-disk-usage-monitor


2 participants
