ci: add background disk usage monitor to job-preamble#19397
ci: add background disk usage monitor to job-preamble#19397
Conversation
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The parsing of
availvalues assumes a strictly numeric prefix (sed -n 's/.*avail=\([0-9]*\).*/\1/p'), butdf -BGBtypically emits values with unit suffixes (e.g.13G/13GB); consider normalizing by stripping all non-digits (e.g.tr -dc '0-9') once and reusing that to avoid brittle parsing. - When
first_avail < min_avail(e.g. if disk is freed over time),peak_consumed=$((first_avail - min_avail))becomes negative; it may be clearer to clamp this to zero or take the absolute/maximum difference to avoid confusing negative "consumption" values in the summary.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The parsing of `avail` values assumes a strictly numeric prefix (`sed -n 's/.*avail=\([0-9]*\).*/\1/p'`), but `df -BGB` typically emits values with unit suffixes (e.g. `13G`/`13GB`); consider normalizing by stripping all non-digits (e.g. `tr -dc '0-9'`) once and reusing that to avoid brittle parsing.
- When `first_avail < min_avail` (e.g. if disk is freed over time), `peak_consumed=$((first_avail - min_avail))` becomes negative; it may be clearer to clamp this to zero or take the absolute/maximum difference to avoid confusing negative "consumption" values in the summary.
## Individual Comments
### Comment 1
<location path=".github/actions/job-preamble/action.yaml" line_range="146-148" />
<code_context>
+ with:
+ shell: bash
+ run: |
+ LOGFILE="/tmp/disk-usage-monitor.log"
+ PIDFILE="/tmp/disk-usage-monitor.pid"
+
+ # Record initial snapshot
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use per-job unique paths for LOGFILE/PIDFILE to avoid cross-job interference on shared runners
On self-hosted or reused runners, multiple jobs can share /tmp and overlap in time. Fixed LOGFILE/PIDFILE names can cause jobs to interfere with each other (e.g., killing another job’s monitor or overwriting its log). Please derive these paths from something unique like $GITHUB_RUN_ID, $GITHUB_JOB, and/or $$ (e.g., /tmp/disk-usage-monitor-${GITHUB_RUN_ID}-${GITHUB_JOB}.log).
```suggestion
run: |
# Use per-job unique paths to avoid cross-job interference on shared runners
LOGFILE="/tmp/disk-usage-monitor-${GITHUB_RUN_ID:-unknown-run}-${GITHUB_JOB:-unknown-job}-$$.log"
PIDFILE="/tmp/disk-usage-monitor-${GITHUB_RUN_ID:-unknown-run}-${GITHUB_JOB:-unknown-job}-$$.pid"
```
</issue_to_address>
### Comment 2
<location path=".github/actions/job-preamble/action.yaml" line_range="203-210" />
<code_context>
+ min_ts=""
+ first_avail=""
+ last_avail=""
+ for entry in "${entries[@]}"; do
+ avail_val=$(echo "$entry" | sed -n 's/.*avail=\([0-9]*\).*/\1/p')
+ entry_ts=$(echo "$entry" | cut -d' ' -f1)
+ if [[ -z "$first_avail" ]]; then
+ first_avail="$avail_val"
+ fi
+ last_avail="$avail_val"
+ if [[ "$avail_val" -lt "$min_avail" ]]; then
+ min_avail="$avail_val"
+ min_ts="$entry_ts"
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against empty or unparsable avail values before doing integer comparisons
If df or the log format ever produce a non-integer or missing avail value, sed will leave avail_val empty and [[ "$avail_val" -lt "$min_avail" ]] will raise a non-integer operand error, which can break this step. Consider validating avail_val with something like [[ "$avail_val" =~ ^[0-9]+$ ]] and skipping or handling entries that don’t match before assigning first_avail/last_avail or doing -lt comparisons.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
911a901 to
9f87317
Compare
|
Images are ready for the commit at dad2a76. To use with deploy scripts, first |
e42a308 to
8f13dab
Compare
Adds a background df poll (every 30s) that logs available disk space to /dev/shm (RAM-backed tmpfs, survives disk full). The existing record_job_info post-step kills the monitor and dumps the log. Uses a plain subshell & in a regular step (no gacts/run-and-post-run needed since regular composite action steps don't wait for background children). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
8f13dab to
71e7ef3
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #19397 +/- ##
=======================================
Coverage 49.66% 49.66%
=======================================
Files 2748 2748
Lines 207354 207354
=======================================
+ Hits 102987 102990 +3
+ Misses 96711 96709 -2
+ Partials 7656 7655 -1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
📝 WalkthroughSummary by CodeRabbit
WalkthroughA GitHub Actions composite action is modified to add disk usage monitoring that periodically captures free-disk metrics throughout job execution, logs the collected data, and cleans up the monitoring process upon completion. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/actions/job-preamble/action.yaml:
- Around line 175-176: Post-cleanup currently assumes /dev/shm/disk-monitor.log
and /dev/shm/disk-monitor.pid always exist and that the PID is valid; make the
steps defensive by first checking that /dev/shm/disk-monitor.log exists before
attempting to cat it, and for the PID check that /dev/shm/disk-monitor.pid
exists and contains a readable PID and that the process is alive (e.g., kill -0
or equivalent) before calling kill, and ensure any failure paths are swallowed
(|| true) so post-run cannot fail if the files or process are absent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: d145e9ee-5c0b-4be8-bc17-014d878aecd0
📒 Files selected for processing (1)
.github/actions/job-preamble/action.yaml
Description
Adds a background disk usage monitor to the job-preamble action. A subshell polls
dfevery 30 seconds and appends available disk space to/dev/shm/disk-monitor.log(RAM-backed tmpfs, survives disk full). The existingrecord_job_infopost-step dumps the log and kills the monitor.Example output:
No
nohuporgacts/run-and-post-runneeded — regular composite action steps don't wait for background children.User-facing documentation
Testing and quality
Automated testing
How I validated my change
Generated with Claude Code