
ROX-32459: Bump claircore to v1.5.50 to fix node/vm indexing#19422

Merged
vikin91 merged 3 commits into master from piotr/ROX-32459-agent-avoids-proc
Mar 24, 2026

@vikin91
Contributor

@vikin91 vikin91 commented Mar 13, 2026

⚠️ Reviewers, please take a look at the performance degradation observations at the bottom of the PR description.

Description

This PR updates node indexing to work with newer ClairCore filesystem URI handling and improves resilience/correctness for the ROX-32459 scenario.

  • Bump github.com/quay/claircore from v1.5.44 to v1.5.50.
  • Convert node indexer filesystem layer input to normalized absolute file:// URIs before claircore.Layer.Init.
  • Add/adjust node indexer tests to validate URI behavior and updated error handling for empty host paths.
  • Update JSONFormat to FormatJSON to make the code compile (introduced here)
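
The URI conversion described in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `hostPathToFileURI` and the error text are hypothetical, the `claircore.Layer.Init` call itself is omitted, and a Unix-like host is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"path/filepath"
)

// hostPathToFileURI is a hypothetical helper (not taken from the PR).
// It rejects an empty host path, makes the path absolute and clean,
// and renders it as a file:// URI.
func hostPathToFileURI(hostPath string) (string, error) {
	if hostPath == "" {
		return "", errors.New("host path is empty")
	}
	// filepath.Abs also cleans the result (removes "..", trailing slashes).
	abs, err := filepath.Abs(hostPath)
	if err != nil {
		return "", fmt.Errorf("resolving %q: %w", hostPath, err)
	}
	u := url.URL{Scheme: "file", Path: filepath.ToSlash(abs)}
	return u.String(), nil
}

func main() {
	for _, p := range []string{"/tmp/roxroot", "/tmp/roxroot/../roxroot/", ""} {
		uri, err := hostPathToFileURI(p)
		if err != nil {
			fmt.Printf("%q: error: %v\n", p, err)
			continue
		}
		fmt.Printf("%q -> %s\n", p, uri)
	}
}
```

Building the URI through `url.URL` rather than string concatenation means characters that are not valid in a URI path (for example spaces) are percent-encoded.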

Why:

  • Newer ClairCore versions require file:// URI semantics for filesystem layers.
  • Versions after v1.5.44 include improvements relevant to ROX-32459 context (including filtering problematic filesystem access and package-scanner error propagation behavior).
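
The description does not show how ClairCore filters filesystem access. Purely as an illustration of the general idea (pruning a subtree such as /proc during a directory walk), here is a minimal sketch using only the Go standard library. The names `walkSkipping` and `sampleFS` are hypothetical, and the in-memory filesystem stands in for a real host root.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// walkSkipping is a hypothetical helper: it walks fsys from its root and
// returns the paths of regular files, without descending into any
// directory whose path is listed in skip.
func walkSkipping(fsys fs.FS, skip map[string]bool) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() && skip[p] {
			// Returning fs.SkipDir prunes the whole subtree.
			return fs.SkipDir
		}
		if d.Type().IsRegular() {
			files = append(files, p)
		}
		return nil
	})
	return files, err
}

// sampleFS builds a tiny in-memory tree that includes a proc directory.
func sampleFS() fs.FS {
	return fstest.MapFS{
		"etc/os-release": {Data: []byte("ID=rhel\n")},
		"proc/1/status":  {Data: []byte("Name:\tsystemd\n")},
		"usr/bin/sleep":  {Data: []byte{}},
	}
}

func main() {
	files, err := walkSkipping(sampleFS(), map[string]bool{"proc": true})
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```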

Acknowledgments:

  • The issue documentation and discussion in ROX-32459 provided clear problem framing and validation targets.
  • This implementation was inspired by prior work and rationale in stackrox/stackrox#18341.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • CI
  • By running the reproduction steps from the Jira ticket on a VM

Confirming the bug is fixed

# Building smaller root to save some time
sudo mkdir -p /tmp/roxroot/{etc,var,usr,root,proc}
sudo mount --bind /etc /tmp/roxroot/etc
sudo mount --bind /var /tmp/roxroot/var
sudo mount --bind /usr /tmp/roxroot/usr
sudo mount --bind /root /tmp/roxroot/root
sudo mount --bind /proc /tmp/roxroot/proc  # mounting on purpose to reproduce the issue

# starting zombie
$ (sleep 1 & exec /bin/sleep 180) &

$ time sudo /home/cloud-user/vm-agent-amd64 --port 818 --host-path /tmp/roxroot
2026/03/13 18:52:20 WARN rpm source packages always record 0 epoch; this may cause incorrect matching see-also="https://github.com/rpm-software-management/rpm/issues/2796 https://github.com/rpm-software-management/rpm/discussions/3703 https://github.com/rpm-software-management/rpm/pull/3755"
virtualmachines/roxagent/vsock: 2026/03/13 18:52:24.781263 client.go:79: Info: Sent message with index report containing 517 packages to host

real    1m22.010s
user    0m22.752s
sys     0m54.774s

# some time later the zombie finishes
[1]+  Done                    ( sleep 1 & exec /bin/sleep 180 )

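✅ The fix resolves the issue: with /proc mounted and the zombie process present, indexing completes and reports 517 packages. However, it makes indexing slow (1m22s wall-clock time in this run).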

Observing performance

The change seems to significantly increase indexing times. Here are a few observations (more data is needed for a conclusion).

  1. Scanning a limited root, so that /proc is not searched:
sudo mkdir -p /tmp/roxroot/{etc,var,usr,root}
sudo mount --bind /etc /tmp/roxroot/etc
sudo mount --bind /var /tmp/roxroot/var
sudo mount --bind /usr /tmp/roxroot/usr
sudo mount --bind /root /tmp/roxroot/root


time sudo /home/cloud-user/vm-agent-amd64 --port 818 --host-path /tmp/roxroot

2026/03/13 18:32:17 WARN rpm source packages always record 0 epoch; this may cause incorrect matching see-also="https://github.com/rpm-software-management/rpm/issues/2796 https://github.com/rpm-software-management/rpm/discussions/3703 https://github.com/rpm-software-management/rpm/pull/3755"
virtualmachines/roxagent/vsock: 2026/03/13 18:32:21.564264 client.go:79: Info: Sent message with index report containing 517 packages to host

real    0m22.315s
user    0m9.513s
sys     0m11.275s
  2. Scanning the full root (including /proc):
[cloud-user@rhel9-1 ~]$ time sudo /home/cloud-user/vm-agent-amd64 --port 818 --host-path /
2026/03/13 18:49:31 WARN rpm source packages always record 0 epoch; this may cause incorrect matching see-also="https://github.com/rpm-software-management/rpm/issues/2796 https://github.com/rpm-software-management/rpm/discussions/3703 https://github.com/rpm-software-management/rpm/pull/3755"
virtualmachines/roxagent/vsock: 2026/03/13 18:49:35.245293 client.go:79: Info: Sent message with index report containing 517 packages to host

real    1m45.496s
user    0m31.055s
sys     1m7.528s
  3. Additional observation:

When multiple agents run in parallel (e.g., one as a daemon and another as a one-shot run), runtimes become very long and CPU load spikes. I killed one run after waiting 10 minutes without any result.

Suggestion: add a lock file to prevent more than one agent from running in parallel.
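
The lock-file suggestion could be sketched as follows. This is only an illustration under stated assumptions, not part of this PR: it uses an advisory `flock` lock via the standard `syscall` package (Linux/Unix only), and the names `acquireLock` and `errAlreadyRunning` as well as the lock path `/tmp/roxagent.lock` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// errAlreadyRunning is a hypothetical sentinel error for a held lock.
var errAlreadyRunning = errors.New("another agent instance holds the lock")

// acquireLock takes an exclusive, non-blocking advisory lock on path.
// On success it returns a function that releases the lock.
func acquireLock(path string) (release func(), err error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, fmt.Errorf("opening lock file: %w", err)
	}
	// LOCK_NB makes the call fail immediately instead of waiting.
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		_ = f.Close()
		if errors.Is(err, syscall.EWOULDBLOCK) {
			return nil, errAlreadyRunning
		}
		return nil, fmt.Errorf("locking %s: %w", path, err)
	}
	return func() {
		_ = syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
		_ = f.Close()
	}, nil
}

func main() {
	// Hypothetical lock path; a real agent would pick a suitable location.
	release, err := acquireLock("/tmp/roxagent.lock")
	if err != nil {
		fmt.Println("not starting:", err)
		os.Exit(1)
	}
	defer release()
	fmt.Println("lock acquired, indexing may start")
}
```

An advisory lock held on an open file is dropped by the kernel when the process exits, so a crashed agent does not leave a stale lock behind; the file itself may remain on disk.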

@vikin91
Contributor Author

vikin91 commented Mar 13, 2026

This change is part of the following stack:

Change managed by git-spice.

@openshift-ci

openshift-ci bot commented Mar 13, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@rhacs-bot
Contributor

rhacs-bot commented Mar 13, 2026

Images are ready for the commit at 2996c00.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-328-g2996c00c85.

vikin91 added 3 commits March 16, 2026 10:51
Bump ClairCore to v1.5.50 and switch node index layer paths to normalized file:// URIs so VM/node indexing remains compatible with newer ClairCore URI handling and benefits from /proc access robustness plus correct package-scan error propagation.

User request: "update to 1.5.50 and implement the full ACS change for nodeIndexer/ROX-32459 context."

AI generated the dependency bump, URI conversion, and tests; user validated scope, selected version direction, and reviewed/corrected requirements during implementation.
@vikin91 vikin91 force-pushed the piotr/ROX-32459-agent-avoids-proc branch from f710db5 to 2996c00 March 16, 2026 09:58
@codecov

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.70%. Comparing base (846febd) to head (2996c00).
⚠️ Report is 99 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| compliance/node/index/indexer.go | 78.57% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #19422   +/-   ##
=======================================
  Coverage   49.69%   49.70%           
=======================================
  Files        2702     2702           
  Lines      203538   203550   +12     
=======================================
+ Hits       101155   101166   +11     
+ Misses      94856    94855    -1     
- Partials     7527     7529    +2     
| Flag | Coverage Δ |
|---|---|
| go-unit-tests | 49.70% <80.00%> (+<0.01%) ⬆️ |


@vikin91 vikin91 marked this pull request as ready for review March 18, 2026 16:11
@vikin91 vikin91 requested review from a team as code owners March 18, 2026 16:11
@vikin91 vikin91 requested review from guzalv and hdonnay March 18, 2026 16:12
@vikin91 vikin91 mentioned this pull request Mar 20, 2026
9 tasks
@BradLugo
Contributor

BradLugo commented Mar 23, 2026

> This PR updates node indexing to work with newer ClairCore filesystem URI handling and improves resilience/correctness for the ROX-32459 scenario.

AFAICT, these changes don't affect the scenario described in ROX-32459. In any case, I'd like to get this PR merged since I have some dependent work. FWIW, I'm happy to help out with ROX-32459. I agree that we need 1) a way to filter file paths (namely /proc) and 2) a lock file for coordinating parallel roxagents.

Please let me know how I can help move this PR forward 🙂

@vikin91
Contributor Author

vikin91 commented Mar 24, 2026

> Please let me know how I can help move this PR forward 🙂

@BradLugo I will merge it now.

> AFAICT, these changes don't affect the scenario described in ROX-32459.

I think they do, because with this version I no longer see the bug that was easy to reproduce before it. But I agree that we can merge these changes, and any leftover problem can be handled in a follow-up.

I am happy to accept your support on the issue of walking over too many directories. Let's look at that in a separate PR or in a Slack discussion. I will make the next move; I just wanted to keep someone from the Scanner team informed of what I found.

@vikin91 vikin91 merged commit 16815a1 into master Mar 24, 2026
107 checks passed
@vikin91 vikin91 deleted the piotr/ROX-32459-agent-avoids-proc branch March 24, 2026 13:48

3 participants