Listener improvements - Usability & Stability #770

@lfrancke

Description

With our SDP 25.7 release we rolled out support for our Listener operator across the entire platform. This issue collects feedback we have received since then, all of which aims to make the Listener easier to use and more stable.

In particular, our users might find that their deployments unexpectedly stop working when their nodes receive new addresses. This is, in theory, documented, but not in a way that makes it obvious what is going to happen. We could improve the documentation, but instead we propose changes that make it harder to do the wrong thing in the first place and that make the defaults "safer".

Value

  • We want these changes so users are not surprised by broken clusters without any obvious changes on their part
  • We want this so people can decide for themselves how much movement of services they want to allow, or whether they favor stability over availability
  • We want this because we expect more people to use the Listener going forward, and we anticipate support requests due to this, which we want to preempt

Dependencies

  • This will require CRD changes for the ListenerClass
  • Optional changes to stackablectl
  • Demos & Getting Started guides need to be changed

Detailed description

Currently we default to using the listener class external-unstable in various places (demos & getting started). Together with the preset stable-nodes, which we roll out, this translates to services being created as NodePort, and these are pinned to specific Kubernetes nodes by default for some/all (TODO: clarify) products & roles.

When a node "rotates", as happens frequently in cloud scenarios, these Listener PVs can no longer find "their" node and can't be mounted into the Pods. In effect, the affected service is unavailable until the PV is deleted manually.
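The failure mode described above can be illustrated with a small toy model. This is only a sketch and does not reflect how the listener-operator is implemented: `ListenerVolume`, `can_mount` and the node names are hypothetical, and the only behavior modelled is "a pinned volume needs its original node to still exist".

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ListenerVolume:
    """Hypothetical stand-in for a Listener PV (not the real data model)."""
    name: str
    pinned_node: Optional[str]  # None means the volume is not tied to a node


def can_mount(volume: ListenerVolume, live_nodes: Set[str]) -> bool:
    """A pinned volume can only be mounted while its node still exists."""
    return volume.pinned_node is None or volume.pinned_node in live_nodes


nodes = {"node-a", "node-b"}
pinned = ListenerVolume("listener-pinned", pinned_node="node-a")
unpinned = ListenerVolume("listener-unpinned", pinned_node=None)

before = (can_mount(pinned, nodes), can_mount(unpinned, nodes))

# Simulate a node rotation: node-a disappears and a fresh node-c joins.
nodes = (nodes - {"node-a"}) | {"node-c"}
after = (can_mount(pinned, nodes), can_mount(unpinned, nodes))

print(before)  # (True, True)
print(after)   # (False, True)
```

In this model the pinned volume stays unmountable after the rotation until it is removed and recreated, which mirrors the manual PV deletion mentioned above.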

Tasks

Acceptance Criteria

  • Clusters using ephemeral nodes (changing IPs) work out of the box with all Stackable commands and docs
    • Clusters without LoadBalancers might fail some of these, and that's fine: at least the failure happens right at deployment time and not at a random time later
  • I can read documentation that guides me in picking the right settings and doesn't assume knowledge of the Listener or other complex concepts. It should only talk about things like "cloud", "IPs", "Kubernetes nodes" etc.

PRs / Issues

Release Notes

  • The listener-operator default preset changed from stable-nodes to ephemeral-nodes, so that it no longer deploys NodePorts that pin Pods to Kubernetes nodes. Previously, your external-stable NodePorts pinned the Pod to a specific node, which caused problems with node rotation.
  • You can configure the preset explicitly using helm --set preset=stable-nodes/ephemeral-nodes/none or stackablectl --listener-class-preset stable-nodes/ephemeral-nodes/none. stackablectl automatically detects k3s and kind clusters and uses stable-nodes for them (as LoadBalancers aren't available).
  • When using NodePorts you can now configure whether Pods should be pinned to specific nodes using .spec.pinnedNodePorts on the ListenerClass
  • You can read up on the details in this issue

Other parts of them are already on stackabletech/listener-operator#347
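The preset selection described in the release notes can be sketched as a small decision function. This is an illustration under assumptions, not stackablectl's actual code: `choose_preset` and its `distro` argument are hypothetical, how the cluster type is detected is not covered here, and treating an explicit preset as taking precedence over detection is an assumption.

```python
from typing import Optional

# Preset names as listed in the release notes.
VALID_PRESETS = ("stable-nodes", "ephemeral-nodes", "none")

# Cluster types the release notes name as lacking LoadBalancers.
NO_LOAD_BALANCER_DISTROS = {"k3s", "kind"}


def choose_preset(explicit: Optional[str], distro: Optional[str]) -> str:
    """Pick a listener class preset (hypothetical helper).

    explicit: value the user passed, or None if nothing was passed.
    distro:   detected cluster type such as "k3s" or "kind", or None.
    """
    if explicit is not None:
        # Assumption: an explicit choice always wins over detection.
        if explicit not in VALID_PRESETS:
            raise ValueError(f"unknown preset: {explicit!r}")
        return explicit
    if distro in NO_LOAD_BALANCER_DISTROS:
        # No LoadBalancers available, so fall back to NodePort-based classes.
        return "stable-nodes"
    # New default described in the release notes.
    return "ephemeral-nodes"


print(choose_preset(None, "kind"))            # stable-nodes
print(choose_preset(None, None))              # ephemeral-nodes
print(choose_preset("none", "k3s"))           # none
```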

Metadata

Labels

epic, release-note, release/25.11.0

Status

Done
