ROX-34942: increase collect-service-logs timeout to 30m#21045

Merged
davdhacs merged 9 commits into master from davdhacs/collect-logs-timeout
Jun 10, 2026
@davdhacs

@davdhacs davdhacs commented Jun 9, 2026

Contributor

Description

EKS nightly e2e jobs (ROX-34942) fail with "Post failed: exit 1" when collect-service-logs.sh exceeds the 900s (15-minute) COLLECT_TIMEOUT on namespaces with many resources. On the last passing run (May 31), post-test log collection took ~40 minutes per part. On June 7, all 850+ tests passed, but the job still failed because log collection for the stackrox namespace timed out at 15 minutes.

Doubles COLLECT_TIMEOUT from 15 to 30 minutes.
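
For readers unfamiliar with this kind of limit, here is a minimal sketch of how a per-namespace timeout could be enforced, assuming each collection is wrapped in GNU coreutils `timeout`. Only `COLLECT_TIMEOUT` comes from this PR. The helper name `collect_with_timeout` and the default shown are hypothetical, and the real collect-service-logs.sh may apply the limit differently.

```shell
# Sketch only: COLLECT_TIMEOUT is the variable named in this PR. The default
# below (1800s = 30 minutes) mirrors the new value, but how the real script
# defines it is an assumption.
COLLECT_TIMEOUT="${COLLECT_TIMEOUT:-1800}"

# Hypothetical helper: run one namespace's collection command under the limit.
# GNU coreutils `timeout` exits with status 124 when the limit is reached.
collect_with_timeout() {
    local namespace="$1"
    shift
    local rc=0
    timeout "${COLLECT_TIMEOUT}" "$@" || rc=$?
    if [ "${rc}" -eq 124 ]; then
        echo "log collection for ${namespace} timed out after ${COLLECT_TIMEOUT}s" >&2
    fi
    return "${rc}"
}
```

Called as `collect_with_timeout stackrox <collection command>`, the helper returns the command's own exit status, or 124 if the command ran past the limit. A non-zero status at that point is the kind of result that can fail a post step even when every test passed.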

Stacked on #20429 (Gradle 9 test discovery fix).

Testing and quality

How I validated my change

Analyzed EKS nightly CI runs from May 28 to June 8:

The 900s timeout is too tight for the stackrox namespace on EKS. 30 minutes provides headroom without being so long that a genuinely stuck collection blocks the job indefinitely.

davdhacs and others added 6 commits May 8, 2026 07:50
Reverts the Gradle 8.13 downgrade (PR #20384) and fixes the root cause:

1. junit-platform-launcher runtimeOnly dependency — Gradle 9 removed
   auto-provisioning. Without this, test tasks fail with "Failed to
   load JUnit Platform."

2. testClassesDirs + classpath wiring in configureEach — Gradle 9's
   register<Test> tasks don't inherit the test source set. Without
   this, custom test tasks (testBAT, testSMOKE, etc.) report NO-SOURCE
   even after compilation succeeds.

Root cause verified locally and on CI:
- compileTestGroovy produces class files
- testSMOKE/testBAT find and execute tests
- 78+ BAT tests pass on KinD

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…radle 9

The protobuf plugin 0.8.19 uses the deprecated 'convention' API
which was removed in Gradle 9. Upgrading to protobuf plugin 0.10+
requires API migration in both build.gradle.kts (sourceSet.java
→ sourceSet.extensions) and protobuf.gradle (DSL changes).

Keep the JUnit Platform and testClassesDirs fixes — they're needed
for both Gradle 8 and 9. The Gradle 9 upgrade needs to be paired
with the protobuf plugin migration as a separate effort.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EKS nightly e2e jobs fail with "Post failed: exit 1" when
collect-service-logs.sh exceeds 900s on namespaces with many resources.
On the last passing run (May 31) post-test took ~40 minutes; on June 7
all tests passed but the job failed due to this timeout.

Double the per-namespace COLLECT_TIMEOUT to 30 minutes.

Partially generated by AI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@github-actions

github-actions Bot commented Jun 9, 2026

Contributor

🚀 Build Images Ready

Images are ready for commit c447aa9. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.12.x-138-gc447aa95cf

@davdhacs

davdhacs commented Jun 9, 2026

Contributor Author

/test eks-qa-e2e-tests

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

@davdhacs: No presubmit jobs available for stackrox/stackrox@davdhacs/gradle9-fix

@davdhacs davdhacs marked this pull request as ready for review June 9, 2026 19:42
Base automatically changed from davdhacs/gradle9-fix to master June 9, 2026 23:25
@davdhacs davdhacs requested a review from janisz as a code owner June 9, 2026 23:25
@davdhacs

davdhacs commented Jun 9, 2026

Contributor Author

/test eks-qa-e2e-tests

@davdhacs

davdhacs commented Jun 9, 2026

Contributor Author

/test eks-qa-e2e-tests

@davdhacs davdhacs requested review from msugakov and tommartensen June 9, 2026 23:33
davdhacs and others added 2 commits June 9, 2026 17:33
@davdhacs

Contributor Author

/test eks-qa-e2e-tests

@davdhacs davdhacs added the auto-retest label (PRs with this label will be automatically retested if prow checks fail) Jun 10, 2026
@msugakov msugakov changed the title from "fix(ci): increase collect-service-logs timeout to 30m" to "ROX-34942: increase collect-service-logs timeout to 30m" Jun 10, 2026
@rhacs-bot

Contributor

/retest

@davdhacs davdhacs merged commit c447aa9 into master Jun 10, 2026
105 checks passed
@davdhacs davdhacs deleted the davdhacs/collect-logs-timeout branch June 10, 2026 17:17
Labels

ai-assisted, ai-review, area/ci, auto-retest, coderabbit-review

3 participants