CI fast fail on job failure #5759
67e0a3b to 12a497a
mythical-fred left a comment
Solid design. The two-step fallback (consolidate-outputs always runs, defaults to empty run_id) means the job truly never fails and never blocks. The test-integration-runtime.yml gap is intentional — that job runs against deployed infra, not build artifacts.
One soft note: the artifact list fetch uses per_page=100. If a future run ever exceeds 100 artifacts, the check silently falls back to rebuilding (not a correctness bug, just a missed optimization). Worth a comment or bumping to 200.
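On the artifact-count concern: as far as I know, the REST endpoint that lists a run's artifacts caps `per_page` at 100, so asking for a larger page would not return more items per request. Following pagination is one way to cover larger runs. A minimal sketch, assuming the check can shell out to the `gh` CLI (which may not be installed on every runner); the step ids, the `PRIOR_RUN_ID` variable, and the `names` output are hypothetical and not taken from this PR:

```yaml
# Hypothetical step: ids, env var names and the output key are illustrative only.
- name: List unexpired artifacts of the prior run
  id: list-artifacts
  env:
    GH_TOKEN: ${{ github.token }}
    PRIOR_RUN_ID: ${{ steps.find-run.outputs.run_id }}  # assumed earlier step
  run: |
    # --paginate keeps requesting pages until none are left,
    # so runs with more than 100 artifacts are still fully listed.
    names=$(gh api --paginate \
      "repos/${GITHUB_REPOSITORY}/actions/runs/${PRIOR_RUN_ID}/artifacts?per_page=100" \
      --jq '.artifacts[] | select(.expired == false) | .name')
    {
      echo "names<<EOF"
      echo "$names"
      echo "EOF"
    } >> "$GITHUB_OUTPUT"
```

A later step could then compare `steps.list-artifacts.outputs.names` against the required artifact names before deciding to skip a build.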
- needs: [invoke-build-rust, invoke-build-java]
+ needs: [check-prior-build, invoke-build-rust, invoke-build-java]
if: |
always() &&
Is there some special semantics associated with `always()` in GHA? Otherwise this is just `if true && a && b` and can be rewritten as `if a && b`.
`always()` is needed here. Without it, GHA skips dependent jobs when a `needs:` job is skipped, so the `result == 'skipped'` checks would never be evaluated and the test jobs would be silently skipped whenever the build phase is skipped.
`always()` is not just a literal `true`: including a status check function replaces the implicit `success()` check that GHA otherwise applies, and that implicit check is what would skip the job before the rest of the condition is considered.
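To illustrate the point about `always()`, here is a sketch of the pattern under discussion. The job id, the reusable-workflow path, and the exact result checks are assumptions for illustration; only the `needs` list, the `always() &&` prefix, and the `artifacts_run_id` input name come from this PR:

```yaml
jobs:
  invoke-tests-unit:   # hypothetical job id
    needs: [check-prior-build, invoke-build-rust, invoke-build-java]
    # always() disables the implicit success() check, so a skipped build job
    # no longer skips this job; the explicit clauses below decide instead.
    if: |
      always() &&
      (needs.invoke-build-rust.result == 'success' || needs.invoke-build-rust.result == 'skipped') &&
      (needs.invoke-build-java.result == 'success' || needs.invoke-build-java.result == 'skipped')
    uses: ./.github/workflows/test-unit.yml   # assumed path
    with:
      artifacts_run_id: ${{ needs.check-prior-build.outputs.run_id }}
```

One design consideration: `always()` also lets the job start after a cancellation, so `!cancelled()` is a commonly used alternative when that matters.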
- needs: [invoke-build-rust]
+ needs: [check-prior-build, invoke-build-rust]
if: |
always() &&

- needs: [invoke-build-rust, invoke-build-java]
+ needs: [check-prior-build, invoke-build-rust, invoke-build-java]
if: |
always() &&

invoke-generate-sbom:
name: Generate SBOMs
needs: [invoke-build-docker]
if: |

name: Integration Tests
- needs: [invoke-build-docker]
+ needs: [check-prior-build, invoke-build-docker]
if: |

name: Integration Tests
- needs: [invoke-build-java]
+ needs: [check-prior-build, invoke-build-java]
if: |

name: Java Tests
- needs: [invoke-build-java]
+ needs: [check-prior-build, invoke-build-java]
if: |

name: Publish Crates (Dry Run)
- needs: [invoke-build-rust]
+ needs: [check-prior-build, invoke-build-rust]
if: |

type: string
required: false
default: ""
workflow_dispatch:
does this also need a run id similar to test-integration-platform.yaml?
yes, added in the latest commit.
note that github won't allow this to merge when you have merge commits in the branch since we require a linear history on main |
Add a check-prior-build job to ci.yml that queries the GitHub API for a prior run on the same commit SHA. If all required binary and Docker digest artifacts are present and unexpired, the Rust, Java, and Docker build jobs are skipped and the test workflows download artifacts from that prior run instead of rebuilding from scratch.

Each test workflow now accepts an artifacts_run_id input passed through from ci.yml and uses it as the run-id in actions/download-artifact, so cross-run artifact fetches work with the built-in action and need no third-party dependencies.

Skipped build jobs report result=skipped (not failure), so the sentinel cancel jobs and the final main job are unaffected.
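The ci.yml side of this description could look roughly like the sketch below. Only `check-prior-build`, `consolidate-outputs`, `run_id`, and `invoke-build-rust` appear in this PR's discussion; the `skip_rust` output, the `lookup` step, and the workflow path are assumptions, and the API query itself is elided:

```yaml
jobs:
  check-prior-build:
    runs-on: ubuntu-latest
    permissions:
      actions: read
    outputs:
      run_id: ${{ steps.consolidate-outputs.outputs.run_id }}
      skip_rust: ${{ steps.consolidate-outputs.outputs.skip_rust }}   # assumed output name
    steps:
      - name: Look for a prior run on this SHA
        id: lookup
        continue-on-error: true   # a failed lookup must not fail the job
        run: |
          # The API query is elided; it is assumed to write run_id and skip_rust
          # to "$GITHUB_OUTPUT" only when every required artifact is unexpired.
          echo "no prior run detected in this sketch"
      - name: Consolidate outputs
        id: consolidate-outputs
        if: always()
        env:
          RUN_ID: ${{ steps.lookup.outputs.run_id }}
          SKIP_RUST: ${{ steps.lookup.outputs.skip_rust }}
        run: |
          # Fall back to "rebuild everything" when the lookup produced nothing.
          echo "run_id=${RUN_ID}" >> "$GITHUB_OUTPUT"
          echo "skip_rust=${SKIP_RUST:-false}" >> "$GITHUB_OUTPUT"

  invoke-build-rust:
    needs: [check-prior-build]
    # Skipping here yields result == 'skipped', not 'failure'.
    if: needs.check-prior-build.outputs.skip_rust != 'true'
    uses: ./.github/workflows/build-rust.yml   # assumed path
```

With an empty `run_id` and `skip_rust=false` as the defaults, a failed or inconclusive lookup degrades to the normal full build.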
- Replace digests-linux-{amd64,arm64} check with docker-image-ready
artifact, which is only uploaded after the sha-{sha} tag is verified
in GHCR; this prevents skip_docker=true when the tag doesn't exist yet
- Bump artifact list fetch from per_page=100 to per_page=200
- Add actions: read + contents: read permissions to test-unit and
test-adapters workflows for cross-run artifact downloads
- Add actions: read + contents: read + packages: read permissions to
oss-platform-tests job for cross-run artifact downloads and private
GHCR image pulls
- Remove --cap-add=PERFMON from oss-platform-tests service options
(not available in the k8s runner environment)
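For the permissions and `artifacts_run_id` items above, a called test workflow might be wired as in this sketch. The job id and artifact name are hypothetical; the input definition mirrors the `type`/`required`/`default` lines shown in the diff, and `run-id` and `github-token` are inputs of `actions/download-artifact@v4`:

```yaml
on:
  workflow_call:
    inputs:
      artifacts_run_id:
        type: string
        required: false
        default: ""
  workflow_dispatch:
    inputs:
      artifacts_run_id:
        type: string
        required: false
        default: ""

jobs:
  unit-tests:   # hypothetical job id
    runs-on: ubuntu-latest
    permissions:
      actions: read    # read artifacts that belong to another run
      contents: read
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: rust-binaries   # hypothetical artifact name
          # An empty input falls back to the current run.
          run-id: ${{ inputs.artifacts_run_id || github.run_id }}
          # A token is needed when the artifact comes from a different run.
          github-token: ${{ github.token }}
```

Falling back to `github.run_id` keeps the same step usable whether the artifacts were built in this run or reused from a prior one.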
c5f6349 to 0b0244e
Add a 'check-prior-build' job that queries the GitHub API for a prior run on the same commit SHA. If all required binary and Docker digest artifacts are present and unexpired, the Rust, Java, and Docker build jobs are skipped and the test workflows download artifacts from that prior run instead of rebuilding.