@umohnani8
Contributor

- What I did
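Replace require.NoError() with assert.NoError() in the background goroutines that stream pod logs. require.NoError() calls t.FailNow(), which marks the test as failed and then calls runtime.Goexit(). When this happens in a goroutine other than the one running the test, only that goroutine is terminated and the test function keeps running. The testing package does not support calling FailNow() this way, and it can lead to hangs and misleading test failures.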

assert.NoError() marks the test as failed without calling Goexit(), making it safe to use in goroutines.

This fixes the test failure where log streaming goroutines would silently fail without properly reporting errors.
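
To illustrate the behavior described above, here is a minimal standard-library sketch of what runtime.Goexit() does when it runs in a background goroutine. It does not use testify or the actual test code. The function name runWorker is hypothetical and only serves the illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runWorker (hypothetical name) starts a goroutine that calls
// runtime.Goexit() partway through, which is what t.FailNow() does after
// marking the test as failed. It reports whether the worker reached the
// code after the Goexit call and whether the caller kept running.
func runWorker() (reachedEnd bool, callerContinued bool) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done() // deferred calls still run during Goexit
		runtime.Goexit()
		reachedEnd = true // never executed: the goroutine has already exited
	}()
	wg.Wait()

	// The caller is unaffected by the Goexit in the worker goroutine.
	callerContinued = true
	return reachedEnd, callerContinued
}

func main() {
	reached, continued := runWorker()
	fmt.Printf("worker reached end: %v, caller continued: %v\n", reached, continued)
	// prints "worker reached end: false, caller continued: true"
}
```

Only the worker stops. The caller, which plays the role of the test function here, continues as if nothing happened.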

- How to verify it
Failures in TestMissingImageIsRebuilt should now be reported in the test logs.

- Description for the changelog
Fix logs for TestMissingImageIsRebuilt

@umohnani8
Contributor Author

/test e2e-gcp-op-ocl

Signed-off-by: Urvashi <umohnani@redhat.com>

The previous commit (48d070c) fixed the goroutine call sites by changing
require.NoError() to assert.NoError(), but missed a require.NoError() call
INSIDE streamMachineOSBuilderPodLogsToFile() that is still executed in the
goroutine's call stack.

When streamMachineOSBuilderPodLogsToFile() is called from a goroutine and
encounters an error listing pods, it calls require.NoError(t, err), which
internally calls t.FailNow() -> runtime.Goexit(). This terminates only the
current goroutine and does not stop the test function, which can leave the
test hanging until it times out.

This commit:
1. Replaces require.NoError() with proper error returns in
   streamMachineOSBuilderPodLogsToFile()
2. Adds bounds checking for the pods.Items slice
3. Adds debug logging to waitForBuildToComplete() that logs build status
   every 30 seconds to help diagnose future timeout issues
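
Items 1 and 2 could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: the pod type and the functions firstBuilderPod and streamInBackground are hypothetical stand-ins, and the real helper works with Kubernetes client types.

```go
package main

import (
	"errors"
	"fmt"
)

// pod is a hypothetical stand-in for a Kubernetes pod object.
type pod struct{ Name string }

// firstBuilderPod (hypothetical) returns the first pod from a list result.
// It returns an error instead of calling require.NoError(), and it checks
// the slice length before indexing so an empty list cannot panic.
func firstBuilderPod(pods []pod, listErr error) (pod, error) {
	if listErr != nil {
		return pod{}, fmt.Errorf("listing builder pods: %w", listErr)
	}
	if len(pods) == 0 {
		return pod{}, errors.New("no builder pods found")
	}
	return pods[0], nil
}

// streamInBackground (hypothetical) runs the helper in a goroutine and
// hands any error back over a channel, so the goroutine that owns the
// test can decide how to report it.
func streamInBackground(pods []pod, listErr error) <-chan error {
	errCh := make(chan error, 1) // buffered so the goroutine never blocks
	go func() {
		_, err := firstBuilderPod(pods, listErr)
		errCh <- err
	}()
	return errCh
}

func main() {
	if err := <-streamInBackground(nil, nil); err != nil {
		fmt.Println("stream error:", err)
		// prints "stream error: no builder pods found"
	}
}
```

Returning the error keeps the decision about failing the test with the caller. A goroutine can then report it with assert.NoError(), or pass it back to the test goroutine as shown here.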

This should resolve the persistent test timeouts that appeared after the
original goroutine safety fix.
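
The periodic status logging mentioned in item 3 can be sketched with two tickers, one for polling and one for logging. This is an illustration only: waitWithStatusLogs and demoWait are hypothetical names, the intervals are shortened so the example finishes quickly, and the real waitForBuildToComplete() checks build objects rather than a callback.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitWithStatusLogs (hypothetical) polls done() every pollEvery until it
// returns true or ctx expires. While waiting, it calls logf every logEvery
// so a later timeout can be diagnosed from the log.
func waitWithStatusLogs(ctx context.Context, pollEvery, logEvery time.Duration,
	done func() bool, logf func(string)) error {
	poll := time.NewTicker(pollEvery)
	defer poll.Stop()
	status := time.NewTicker(logEvery)
	defer status.Stop()

	start := time.Now()
	for {
		select {
		case <-ctx.Done():
			return fmt.Errorf("build did not complete: %w", ctx.Err())
		case <-status.C:
			logf(fmt.Sprintf("still waiting for build after %s",
				time.Since(start).Round(time.Millisecond)))
		case <-poll.C:
			if done() {
				return nil
			}
		}
	}
}

// demoWait (hypothetical) exercises both outcomes with short intervals.
func demoWait() (completed bool, timedOut bool) {
	// Case 1: the condition becomes true on the third poll.
	calls := 0
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	err := waitWithStatusLogs(ctx, 5*time.Millisecond, 10*time.Millisecond,
		func() bool { calls++; return calls >= 3 },
		func(msg string) { fmt.Println(msg) })
	completed = err == nil

	// Case 2: the condition never becomes true, so the context deadline wins.
	shortCtx, cancelShort := context.WithTimeout(context.Background(), 30*time.Millisecond)
	defer cancelShort()
	err = waitWithStatusLogs(shortCtx, 5*time.Millisecond, 10*time.Millisecond,
		func() bool { return false }, func(string) {})
	timedOut = errors.Is(err, context.DeadlineExceeded)
	return completed, timedOut
}

func main() {
	completed, timedOut := demoWait()
	fmt.Printf("completed: %v, timed out: %v\n", completed, timedOut)
}
```

In the PR the logging interval is 30 seconds. The sketch takes the interval as a parameter so that it can be exercised quickly.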

@openshift-ci
Contributor

openshift-ci bot commented Dec 18, 2025

@umohnani8: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/bootstrap-unit 2af73a9 link false /test bootstrap-unit
ci/prow/e2e-gcp-op-ocl 2af73a9 link false /test e2e-gcp-op-ocl
