chore(helm): disable liveness probes by default, allow all probe settings #21789

Merged: dannykopping merged 2 commits into main from dk/liveness on Feb 2, 2026

@dannykopping

Contributor

Liveness checks are currently causing pods to be killed during long-running migrations.

They are generally not advisable for our workloads; if a pod becomes unresponsive (due to a deadlock, for example), we need to know about it rather than paper over the issue by killing the pod.

I've also made all probe settings configurable.
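
As a rough illustration of what a values override might look like, the fragment below sketches the defaults described here: liveness off, readiness on. The field names (`enabled`, `periodSeconds`, `timeoutSeconds`, `successThreshold`, `failureThreshold`) are the ones listed in the documentation check on this PR; the `coder.livenessProbe` and `coder.readinessProbe` key paths and all numeric values are assumptions for illustration, so check the chart's values.yaml for the actual layout.

```yaml
# Hypothetical values override. Key paths and numbers are assumptions,
# not copied from the chart.
coder:
  livenessProbe:
    # Disabled by default after this change.
    enabled: false
  readinessProbe:
    # Readiness stays enabled by default.
    enabled: true
    periodSeconds: 10
    timeoutSeconds: 3
    successThreshold: 1
    failureThreshold: 6
```

With values like these, a pod would be marked unready after roughly `periodSeconds` × `failureThreshold` = 60 seconds of consecutive failed checks, but it would not be restarted, since only a liveness probe triggers a restart.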

Signed-off-by: Danny Kopping <danny@coder.com>
dannykopping self-assigned this on Jan 30, 2026
dannykopping added the helm (Area: helm chart) label on Jan 30, 2026
@coder-tasks

coder-tasks Bot commented Jan 30, 2026

Contributor

Documentation Check

Updates Needed

  • docs/install/kubernetes.md - Update section 7 "All Kubernetes objects must define liveness and readiness probes" (lines 261-268) to reflect that:
    • Liveness probes are now disabled by default (explain the rationale from the PR description: they can cause pods to be killed during long-running migrations and can hide issues like deadlocks)
    • Readiness probes remain enabled by default
    • Document the new configuration options available for both probes (enabled, periodSeconds, timeoutSeconds, successThreshold, failureThreshold)
    • Add a link to the helm chart values.yaml or provide an example of how to customize these settings

Recommendation

Add a new subsection explaining probe configuration best practices, including:

  • When to enable/disable liveness probes
  • How to configure probe timing for production deployments
  • Link to the Kubernetes probe documentation

Automated review via Coder Tasks

Signed-off-by: Danny Kopping <danny@coder.com>
- Both the control plane and workspaces set resource request/limits by
default.

7. **All Kubernetes objects must define liveness and readiness probes**

Contributor Author

It was pretty odd to include this under "Security" recommendations in the first place; removing.

Contributor

@ericpaulsen I know we're going deep into the archives on this one, but do you remember why this was in the list of security requirements?

ericpaulsen commented on Feb 2, 2026

Member

@dannykopping @spikecurtis I wrote this K8s security reference directly in line with the K8s requirements customers were providing us; i.e., #7 was a specific ask about whether our helm chart provided liveness & readiness probe config. Though this may be more relevant for a K8s resiliency doc than a security one.

Either way, that's the context.

Contributor Author

Cool, thanks @ericpaulsen 👍 Since they're still provided (but liveness is disabled by default), I think that still satisfies the ask.

dannykopping enabled auto-merge (squash) on Feb 2, 2026, 13:24
dannykopping merged commit d0c67cc into main on Feb 2, 2026
47 of 49 checks passed
dannykopping deleted the dk/liveness branch on Feb 2, 2026, 13:33
github-actions Bot locked and limited conversation to collaborators on Feb 2, 2026
Labels

helm Area: helm chart

3 participants