fix(ci): increase GKE disk to 120GB #19218
The default image GC expiration is 2 minutes; we're exceeding 85% disk usage, so GC was removing prefetched images.
/test gke-latest-qa-e2e-tests
Images are ready for the commit at 8961197. To use with deploy scripts, first
Node-accessible disk is total disk minus OS/etc. (~42GB on gke-latest). GC kicks in at 85% usage and evicts images not used in the last 2 minutes. Neither the 85% nor the 2 minutes can be increased.
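To make the numbers concrete, here is a rough sketch of where the 85% threshold lands for a given amount of node-accessible disk. The function name and the sample sizes are illustrative assumptions, not values measured on the cluster:

```shell
# Hypothetical helper: usage level (in whole GB) at which image GC would
# start, given the node-accessible disk size and a high threshold percent.
gc_threshold_gb() {
    local usable_gb="$1"
    local high_pct="${2:-85}"   # 85% is the threshold quoted in this thread
    echo $(( usable_gb * high_pct / 100 ))
}

gc_threshold_gb 100      # prints 85
gc_threshold_gb 42       # prints 35 (integer division)
```

The point is only that a larger disk raises the absolute threshold; it does not change the 2-minute age rule.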
/test gke-latest-qa-e2e-tests
This reverts commit 5aab5eb.
/test gke-latest-qa-e2e-tests
The prefetcher pulls images by tag via the CRI API, which stores them indexed by tag name. When tests reference the image as tag@sha256:<manifest-list-digest>, containerd 2.x cannot resolve it with imagePullPolicy: Never because the manifest list digest is not indexed as a named image by the CRI pull-by-tag path. This caused ErrImageNeverPull on every node regardless of disk size, as the image was present on disk but not findable by digest. Images referenced by tag only (busybox-1-33-1, nginx-1-12-1, etc.) worked fine with the same Never pull policy. Remove the @sha256: digest from TEST_IMAGE so it matches how the prefetcher stores the image. Keep TEST_IMAGE_SHA available for API queries that need the digest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
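For illustration, a minimal shell sketch of splitting a `tag@sha256:` reference into the tag-only form and the digest. The real change is to the TEST_IMAGE constant in the test code; the helper names and the sample reference here are hypothetical:

```shell
# Hypothetical helpers; the sample reference below is made up for illustration.

# Print the reference without any @<digest> suffix (repo:tag stays intact).
strip_digest() {
    printf '%s\n' "${1%%@*}"
}

# Print only the digest part, or nothing if the reference has no digest.
image_digest() {
    case "$1" in
        *@*) printf '%s\n' "${1#*@}" ;;
    esac
}

ref="quay.io/example/qa:nginx-1-12-1@sha256:0123abcd"
strip_digest "$ref"   # prints quay.io/example/qa:nginx-1-12-1
image_digest "$ref"   # prints sha256:0123abcd
```

The tag-only form is what would be used for the pod spec with the Never pull policy, while the digest stays available separately for API queries.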
/test gke-latest-qa-e2e-tests
After the prefetcher completes, deploy a short-lived DaemonSet that runs ctr on each node to label all prefetched images with io.cri-containerd.pinned=pinned. This tells kubelet's image GC to skip these images regardless of disk pressure. The DaemonSet uses an init container for the actual work and a main container that exits immediately. The DaemonSet and its ConfigMap are deleted after completion to avoid leaving pods running. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##           master   #19218   +/-   ##
=======================================
  Coverage   49.61%   49.62%
=======================================
  Files        2680     2680
  Lines      202195   202195
=======================================
+ Hits       100327   100332    +5
+ Misses      94390    94387    -3
+ Partials     7478     7476    -2
The previous commit used `apk add containerd-ctr` but the package is in alpine's community repo which isn't enabled by default. The install silently failed (stderr redirected to /dev/null), so ctr was never installed and images were never actually pinned. Add the community repo URL explicitly via -X flag and remove the stderr suppression so failures are visible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
failed=0
while IFS= read -r img || [ -n "$img" ]; do
    case "$img" in "#"*|"") continue ;; esac
    if ctr -a "$socket" -n k8s.io images label "$img" "io.cri-containerd.pinned=pinned" >/dev/null 2>&1; then
Why can the image-prefetcher not do this?
I'm changing it here so I can test quickly without altering the prefetcher and then using a dev build of it.
If it fixes the problem, I imagine we'd move this into the prefetcher.
I forgot to mark this PR as a work-in-progress/draft. (It's set as a draft now.)
/test gke-latest-qa-e2e-tests
…time

Installing containerd-ctr via apk at runtime is too slow (pulls full containerd package + deps from community repo), causing the 5-minute rollout timeout to be exceeded. Use ghcr.io/containerd/containerd:2.0 which ships with ctr already installed, eliminating the package install step entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
Previous attempts to install/provide ctr in the pinning DaemonSet failed: apk add was too slow, and ghcr.io/containerd/containerd:2.0 was too large to pull within the 5-minute timeout. Instead, use the image-prefetcher image (already cached on every node from the prefetch step) with hostPID and nsenter to execute the host's own ctr binary. This requires no image pull and no package install. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
/test gke-latest-qa-e2e-tests
Previous approaches failed because:
- apk add containerd-ctr: too slow (>5min timeout)
- ghcr.io/containerd/containerd:2.0: too large to pull in time
- nsenter via image-prefetcher image: no nsenter/sh available

Use kubectl debug node/ which mounts the host filesystem at /host, giving access to the host's ctr binary via chroot. No image pull delays since busybox:1.36 is tiny, and no DaemonSet rollout needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
The previous approach tried to pin images by the tag names from the prefetch list, but containerd stores multi-arch images under different references (manifest list digests, platform digests). Only 15-19 of 72 images were found by tag name. Instead, list ALL images in containerd's k8s.io namespace via `ctr images list -q` and pin every one. This catches all references regardless of how containerd indexed them. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
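A minimal sketch of the "list everything, then label each reference" loop this commit describes, assuming `ctr` is reachable on the node and using the same `images list -q` and `images label` subcommands quoted in this thread. The function name is hypothetical, and the thread later reports the pinned label itself as unreliable, so treat this as an illustration of the loop rather than a fix:

```shell
# Hypothetical helper: label every image reference in the k8s.io namespace.
# $1 is the containerd socket path. Prints "failed=<n>" and returns non-zero
# if any reference could not be labeled.
pin_all_images() {
    local socket="$1"
    local failed=0
    local images img
    images=$(ctr -a "$socket" -n k8s.io images list -q) || return 1
    while IFS= read -r img || [ -n "$img" ]; do
        [ -n "$img" ] || continue
        # stdin is redirected so ctr cannot consume the list being read.
        if ! ctr -a "$socket" -n k8s.io images label "$img" \
                "io.cri-containerd.pinned=pinned" >/dev/null </dev/null; then
            echo "failed to pin: $img" >&2
            failed=$((failed + 1))
        fi
    done <<EOF
$images
EOF
    echo "failed=$failed"
    [ "$failed" -eq 0 ]
}

# Example call on a node (socket path is containerd's usual default):
# pin_all_images /run/containerd/containerd.sock
```

Unlike the earlier snippet, errors from `ctr` are left on stderr so a failed label is visible in the CI log.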
/test gke-latest-qa-e2e-tests
kubectl debug with -it requires a TTY which is not available in CI, causing output capture to silently fail. Remove -it so the command runs non-interactively and its output is properly captured. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
kubectl debug node/ without -it doesn't execute the command (just creates the pod and returns). With -it it needs a TTY unavailable in CI.

Instead, use kubectl run with --overrides to create a pod per node with:
- nodeName: targets specific node
- hostPID: true: enables nsenter to enter host namespaces
- nsenter -t 1 -m -u -n -p: runs the host's ctr directly
- busybox:1.36: tiny image (~4MB), has nsenter built in

Pods are launched in parallel, then we kubectl wait for completion and collect logs. This gives proper output capture and error handling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
kubectl run --overrides had shell quoting issues: $img, $p, $f in the pin_cmd variable were expanded as empty strings when embedded in the JSON. Also errors were suppressed with >/dev/null 2>&1. Switch to kubectl apply with a heredoc YAML manifest which avoids all quoting issues. Shell variables in the script body are escaped with \$ so they're interpreted by the container, not the CI shell. Also add kubectl describe on failure for better diagnostics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
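The escaping rule in this commit can be sketched as follows: in an unquoted heredoc, `${node}` is expanded by the CI shell while `\$img` reaches the container as a literal `$img`. The function name, pod fields beyond those mentioned in this thread, and the `privileged` setting are assumptions for illustration:

```shell
# Hypothetical renderer: prints a pod manifest for one node to stdout.
# ${node} and ${pod_name} are expanded here, by the CI shell.
# \$img and \$(...) are passed through literally for the container's shell.
render_pin_pod() {
    local node="$1"
    local pod_name="$2"
    cat <<PINEOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
spec:
  nodeName: ${node}
  hostPID: true
  restartPolicy: Never
  containers:
  - name: pin
    image: busybox:1.36
    securityContext:
      privileged: true   # assumption: needed for nsenter into host namespaces
    command:
    - sh
    - -c
    - |
      for img in \$(nsenter -t 1 -m -u -n -p ctr -n k8s.io images list -q); do
        nsenter -t 1 -m -u -n -p ctr -n k8s.io images label "\$img" io.cri-containerd.pinned=pinned
      done
PINEOF
}

# Print the manifest; in CI this would be piped to: kubectl apply -n "$ns" -f -
render_pin_pod "gke-example-node-abc" "pin-images-example"
```

Rendering to stdout first also makes it easy to inspect the manifest in the CI log before applying it.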
/test gke-latest-qa-e2e-tests
Hey - I've found 1 issue, and left some high level feedback:
- _image_prefetcher_pin_images currently pins all images in the containerd k8s.io namespace on every node; consider scoping this to just the prefetched images (e.g., via labels or a known list) to avoid unbounded pinning and unexpected interference with kubelet GC.
- In _image_prefetcher_pin_images, the second parameter (name) is unused and pod names are derived only from the node name suffix (${node##*-}), which can lead to confusion or collisions; consider either using the full node name or including the prefetch set name to make pod names unique and the signature meaningful.
- In BaseSpecification.groovy, TEST_IMAGE was changed to a tag-only reference while TEST_IMAGE_NAME_WITH_SHA still points to it and TEST_IMAGE_SHA remains separate, which makes the constant names misleading; consider renaming or restructuring these constants so that the "*_WITH_SHA" variant actually includes the digest and usages remain clear.
## Individual Comments
### Comment 1
<location path="scripts/ci/lib.sh" line_range="820" />
<code_context>
+
+ # Launch pin pods on all nodes in parallel
+ for node in $nodes; do
+ local pod_name="pin-images-${node##*-}"
+ cat <<PINEOF | kubectl apply -n "$ns" -f -
+apiVersion: v1
</code_context>
<issue_to_address>
**issue (bug_risk):** Using only the node name suffix for pod_name risks collisions across nodes with similar suffixes.
With `${node##*-}`, different nodes that share a suffix (e.g., `gke-cluster-a-pool-1-abc` and `gke-cluster-b-pool-2-abc`) will generate the same `pod_name` (`pin-images-abc`). In that case `kubectl apply` will update a single pod instead of one per node. Consider using the full node name or a stable hash of it in `pod_name` to ensure uniqueness while keeping names deterministic.
</issue_to_address>
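One way to act on this comment, as a sketch: keep the readable suffix but append a checksum of the full node name, so nodes that share a suffix still get distinct, deterministic pod names. The function name is hypothetical, and `cksum` is used only as a convenient stable hash:

```shell
# Hypothetical helper: derive a deterministic pod name from the full node name.
# The checksum covers the whole name, so a shared suffix alone does not collide.
pin_pod_name() {
    local node="$1"
    local hash
    hash=$(printf '%s' "$node" | cksum | cut -d' ' -f1)
    printf 'pin-images-%s-%s\n' "${node##*-}" "$hash"
}

pin_pod_name "gke-cluster-a-pool-1-abc"
pin_pod_name "gke-cluster-b-pool-2-abc"
```

Both example names start with `pin-images-abc-` but end in different checksums, so `kubectl apply` would create one pod per node.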
/test gke-latest-qa-e2e-tests
Force-pushed from 66d3fd4 to b634840.
/test gke-latest-qa-e2e-tests
Force-pushed from b634840 to 332545a.
/test gke-latest-qa-e2e-tests
Force-pushed from 332545a to 6559da1.
/test gke-latest-qa-e2e-tests
io.cri-containerd.pinned has known bugs (containerd#9328, #10270) that make it unreliable for preventing image GC. Replace the pinning approach with a re-pull step: after the prefetcher completes, run ctr images pull on each node for any images whose tag reference was lost to GC. Since layers are still cached, re-pulls are near-instant. Uses the same image list configmap as the prefetcher. Only re-pulls images that are missing (ctr images check); skips images that are still present. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 6559da1 to 3092616.
/test gke-latest-qa-e2e-tests
Point to the branch-rox-33305-prevent-gc image of image-prefetcher which pins images via the containerd native API immediately after each CRI pull, preventing kubelet GC from evicting them. This replaces the post-hoc repull/pin approaches which all failed due to various containerd/CRI issues. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/test gke-latest-qa-e2e-tests
/test gke-latest-qa-e2e-tests
@davdhacs: The following tests failed, say
The default image garbage-collection expiration is 2 minutes (and cannot be increased). We're exceeding 85% disk usage, so GC was removing prefetched images.
WIP: The latest commit tests setting a containerd "don't delete me" label on the images after the prefetcher pulls them.
I removed the SHAs from the image refs for the test pulls because the prefetcher fetches by the multi-arch SHA, which doesn't match the arch-specific image SHA used in the test(s).
I increased the node instance's disk from 80GB to 120GB, but some tests still failed with images not found:
logs showing image delete: https://console.cloud.google.com/logs/query;query=resource.labels.cluster_name%3D%22rox-ci-qa-e2e-test-2027140122893357056%22%0ASEARCH%2528%22'qa-image-management'%22%2529;cursorTimestamp=2026-02-26T22:29:38.105656477Z;duration=PT12H?authuser=0&project=acs-san-stackroxci
metrics showing used_bytes: https://console.cloud.google.com/monitoring/metrics-explorer;duration=PT12H?project=acs-san-stackroxci&pageState=%7B%22xyChart%22:%7B%22constantLines%22:%5B%5D,%22dataSets%22:%5B%7B%22plotType%22:%22LINE%22,%22pointConnectionMethod%22:%22GAP_DETECTION%22,%22prometheusQuery%22:%22max%20by%20(%5C%22node_name%5C%22)(max_over_time(%7B%5C%22__name__%5C%22%3D%5C%22kubernetes.io%2Fnode%2Fephemeral_storage%2Fused_bytes%5C%22,%5C%22monitored_resource%5C%22%3D%5C%22k8s_node%5C%22,%5C%22cluster_name%5C%22%3D~%5C%22rox-ci-qa-e2e-test-2027140122893357056%5C%22%7D%5B$%7B__interval%7D%5D))%22,%22targetAxis%22:%22Y1%22,%22unitOverride%22:%22%22%7D%5D,%22options%22:%7B%22mode%22:%22COLOR%22%7D,%22y1Axis%22:%7B%22label%22:%22%22,%22scale%22:%22LINEAR%22%7D%7D%7D