Nodes that are NotReady (and which might have a NodeNetworkUnavailable condition) are added to load balancer services #133052

@rybnico

What happened?

A node that was NotReady was added to a LoadBalancer pool (using the OpenStack Cloud Controller Manager).
As a result, a LoadBalancer Service (exposed on a NodePort) was not operational: the node's IP and port were reachable, so the external load balancer enabled the node as a backend, but the node's networking was not yet fully operational.

What did you expect to happen?

I would have expected nodes to only be added to a load balancer pool when they are Ready (or when the NodeNetworkUnavailable condition is false).

How can we reproduce it (as minimally and precisely as possible)?

  • Add a LoadBalancer Service
  • Add a new Node that takes some time to become Ready (e.g. by disabling the CNI DaemonSet on that node)
  • Observe that the Node is added to the LoadBalancer pool while it is still NotReady

Anything else we need to know?

Although the service controller code contains a `nodeReadyPredicate`, it appears to be unused on this code path. The `ensureLoadBalancer` function uses `stableNodeSetPredicates`, which does not include `nodeReadyPredicate`.
Consequently, NotReady nodes are added to a load balancer pool even when their networking is not yet fully operational.
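
To illustrate the effect described above, here is a minimal, self-contained sketch of predicate-based node filtering. It does not use the real controller code: `Node`, `Condition`, `Predicate`, `filterNodes`, `alwaysInclude`, and `isReady` are hypothetical stand-ins, and the actual service controller types and predicate lists may differ.

```go
package main

import "fmt"

// Condition and Node are simplified stand-ins for the Kubernetes API
// types (assumption: not the real k8s.io/api/core/v1 definitions).
type Condition struct {
	Type   string
	Status string
}

type Node struct {
	Name       string
	Conditions []Condition
}

// Predicate reports whether a node may be used as a load balancer backend.
type Predicate func(n *Node) bool

// filterNodes keeps only the nodes accepted by every predicate.
func filterNodes(nodes []*Node, preds ...Predicate) []*Node {
	var out []*Node
	for _, n := range nodes {
		accepted := true
		for _, p := range preds {
			if !p(n) {
				accepted = false
				break
			}
		}
		if accepted {
			out = append(out, n)
		}
	}
	return out
}

// alwaysInclude stands in for predicates that do not look at readiness
// (hypothetical; used here to model a predicate list without a Ready check).
func alwaysInclude(n *Node) bool { return true }

// isReady is a hypothetical readiness check: the node must carry a
// "Ready" condition whose status is "True".
func isReady(n *Node) bool {
	for _, c := range n.Conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

func main() {
	nodes := []*Node{
		{Name: "node-ready", Conditions: []Condition{{Type: "Ready", Status: "True"}}},
		{Name: "node-starting", Conditions: []Condition{{Type: "Ready", Status: "False"}}},
	}
	for _, n := range filterNodes(nodes, alwaysInclude) {
		fmt.Println("without readiness check:", n.Name)
	}
	for _, n := range filterNodes(nodes, alwaysInclude, isReady) {
		fmt.Println("with readiness check:", n.Name)
	}
}
```

In this sketch, a predicate list that omits the readiness check passes both nodes through, which mirrors the behavior reported here.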

I suggest either adding `nodeReadyPredicate` to `stableNodeSetPredicates` or adding a `nodeNetworkNotUnavailablePredicate` that checks whether the `v1.NodeNetworkUnavailable` condition is false.
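
A rough sketch of what the second option could look like follows. It is not a patch against the controller: `NodeCondition` and `Node` are simplified stand-ins for the `v1` API types, the function name is only illustrative, and treating a missing condition as "network available" is an assumption that would need to be discussed (not every CNI or cloud provider sets this condition).

```go
package main

import "fmt"

// NodeCondition and Node are simplified stand-ins for the Kubernetes
// API types (assumption: the real code would use *v1.Node).
type NodeCondition struct {
	Type   string
	Status string
}

type Node struct {
	Name       string
	Conditions []NodeCondition
}

const (
	// String value of v1.NodeNetworkUnavailable in the Kubernetes API.
	networkUnavailable = "NetworkUnavailable"
	// String value of v1.ConditionFalse in the Kubernetes API.
	conditionFalse = "False"
)

// nodeNetworkNotUnavailable is an illustrative predicate: it accepts a
// node only if its NetworkUnavailable condition is explicitly "False".
// Assumption: a node that does not report the condition at all is
// accepted, so clusters that never set it keep their current behavior.
func nodeNetworkNotUnavailable(n *Node) bool {
	for _, c := range n.Conditions {
		if c.Type == networkUnavailable {
			return c.Status == conditionFalse
		}
	}
	return true
}

func main() {
	nodes := []*Node{
		{Name: "network-up", Conditions: []NodeCondition{{Type: networkUnavailable, Status: "False"}}},
		{Name: "network-down", Conditions: []NodeCondition{{Type: networkUnavailable, Status: "True"}}},
		{Name: "no-condition"},
	}
	for _, n := range nodes {
		fmt.Printf("%s eligible: %v\n", n.Name, nodeNetworkNotUnavailable(n))
	}
}
```

Whether an "Unknown" status or an absent condition should exclude a node is a design choice; the sketch excludes "Unknown" and admits nodes without the condition.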

I could create a pull request for this.

Kubernetes version

Server Version: v1.29.12

Cloud provider

`openstack-cloud-controller-manager:v1.29.0`

OS version

$ cat /etc/os-release
NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=3815.2.5
VERSION_ID=3815.2.5
BUILD_ID=2024-07-01-2356
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 3815.2.5 (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:3815.2.5:*:*:*:*:*:*:*"
$ uname -a
Linux xxx 6.1.96-flatcar #1 SMP PREEMPT_DYNAMIC Mon Jul  1 23:29:55 -00 2024 x86_64 AMD EPYC-Milan Processor AuthenticAMD GNU/Linux

Related plugins (CNI, CSI, ...) and versions (if applicable)

Calico v3.27.3

Labels: kind/bug, needs-triage, sig/cloud-provider, sig/network
