Description
With our SDP 25.7 release we rolled out support for our Listener operator across the entire platform. This issue collects feedback we have received since then, all of which aims to make the Listener easier to use and more stable.
In particular our users might be facing the issue that their deployments stop working (unexpectedly) if their nodes receive new addresses. This is - in theory - documented but not in a way that makes it obvious what's going to happen. We could improve the documentation but instead we propose some changes that make it harder to do the wrong thing in the first place and to make the defaults "safer".
Value
- We want these changes so users are not surprised by broken clusters when they have not made any obvious changes themselves
- We want this so people can decide for themselves how much movement of services they want to allow, or whether they favor stability over availability
- We want this because we expect more people to use the Listener going forward, and we anticipate support requests due to this, which we want to preempt
Dependencies
- This will require CRD changes for the ListenerClass
- Optional changes to stackablectl
- Demos & Getting Started guides need to be changed
Detailed description
Currently we default to using the listener class external-unstable in various places (demos & getting started). Together with the preset stable-nodes which we roll out this translates to services being created as NodePort and these are pinned to specific Kubernetes nodes by default for some/all (TODO: clarify) products & roles.
When a node "rotates", as happens frequently in cloud scenarios, the Listener PVs can no longer find "their" node and can't be mounted to the Pods. In effect, the affected service is unavailable until the PV is deleted manually.
Tasks
- Think again about feat: Propagate externalTrafficPolicy from ListenerClass to Services listener-operator#196, as we might want to use something like user_provided.unwrap_or(None), so that it should work out of the box on IONOS => fix!: Default ListenerClass externalTrafficPolicy to null operator-rs#1107
- Change demos & getting started to use external-stable according to https://github.com/stackabletech/decisions/issues/7
- Extend ListenerClass CRD with an extra field that allows configuring whether pinning should or should not be used for NodePorts
- Change the Helm default preset from stable-nodes to ephemeral-nodes
- Adapt the documentation for these new changes and potentially add a section on why you would want to mark a class as "pinned" or not
- Test the changes on clusters with changing nodes
- OPTIONAL: Change stackablectl to allow configuring the ListenerClass presets via the CLI
Acceptance Criteria
- Clusters using ephemeral nodes (changing IPs) work out of the box with all Stackable commands and docs
- Clusters without LoadBalancers might fail some of these and that's fine, at least the failure happens right at deployment time and not at a random time later
- I can read documentation that guides me in picking the right settings which doesn't assume knowledge about the Listener or other complex concepts. It should only talk about things like "cloud", "ips", "kubernetes nodes" etc.
PRs / Issues
- feat!: Add new ListenerClass.pinnedNodePorts field operator-rs#1105
- feat!: Make NodePort pinning configurable listener-operator#340
- chore: Update used ListenerClasses according to decision demos#312
- fix!: Default ListenerClass externalTrafficPolicy to null operator-rs#1107
- feat: Support configuring ListenerClass preset (with sensible defaults) stackable-cockpit#414
- fix!: Default ListenerClass externalTrafficPolicy to null listener-operator#347
- docs: Mention ListenerClass presets in installation guide listener-operator#348
- This also closes Application pods depending on listener get stuck in pending when GKE nodes restart listener-operator#342 (comment)
Release Notes
- The listener-operator default preset changed from stable-nodes to ephemeral-nodes, so that it no longer deploys NodePorts that pin Pods to Kubernetes nodes. Previously, your external-stable NodePorts pinned the Pod to a specific node, which caused problems with node rotation.
- You can configure the preset explicitly using helm --set preset=stable-nodes/ephemeral-nodes/none or stackablectl --listener-class-preset stable-nodes/ephemeral-nodes/none. stackablectl automatically detects k3s and kind clusters and uses stable-nodes for them (as LoadBalancers aren't available).
- When using NodePorts you can now configure whether Pods should be pinned to specific nodes using .spec.pinnedNodePorts on the ListenerClass.
- You can read up on the details in this issue.
Other parts of them are already on stackabletech/listener-operator#347
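As an illustration of the .spec.pinnedNodePorts setting mentioned in the release notes, a ListenerClass that opts out of pinning might look roughly like the sketch below. The class name is hypothetical, and the apiVersion, the serviceType value, and the boolean type of pinnedNodePorts are assumptions; check the ListenerClass CRD shipped with your listener-operator version before relying on them.

```yaml
# Sketch only, not taken from the operator's documentation.
apiVersion: listeners.stackable.tech/v1alpha1  # assumed API group/version
kind: ListenerClass
metadata:
  name: my-unpinned-nodeport  # hypothetical name
spec:
  serviceType: NodePort       # assumed value; pinning only applies to NodePorts
  # Field added in operator-rs#1105. Assumed here to be a boolean where
  # false means Pods are not pinned to a specific Kubernetes node.
  pinnedNodePorts: false
```

Leaving Pods unpinned favors availability on clusters whose nodes rotate, while pinning keeps the node address stable for clients. Which one is appropriate depends on the environment.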