Skip to content

Conversation

@ddelnano
Copy link
Member

@ddelnano ddelnano commented Dec 23, 2024

Summary: Ensure etcd stateful set has required capabilities to run on OpenShift

When using the etcd metadata store on an Openshift cluster, the container gets stuck in its start up script and continuously prints the following error.

Waiting for pl-etcd-1.pl-etcd.pl to come up
ping: permission denied (are you root?)

Waiting for pl-etcd-1.pl-etcd.pl to come up
ping: permission denied (are you root?)

The etcd stateful set requires an additional capability, which was missed when the other services had stricter security context settings added. This change also requires the following SecurityContextConstraints changes (pixie-io/docs.px.dev#292)

Relevant Issues: N/A

Type of change: /kind bug

Test Plan: Deployed the non-operator version of Pixie to an Openshift cluster and verified etcd is scheduled now

  • Verified etcd metadata deployment with these changes works on GKE cluster

Changelog Message: Fixed an issue where the etcd metadata store wouldn't schedule on Openshift clusters

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
@ddelnano ddelnano requested a review from a team as a code owner December 23, 2024 20:27
@aimichelle aimichelle merged commit 73c8340 into pixie-io:main Dec 23, 2024
22 checks passed
aimichelle pushed a commit that referenced this pull request Jan 1, 2025
…2070)

Summary: Ensure securityContext key is inside StatefulSet's containers
block

While testing the [0.14.14-pre-r1.0 Vizier
release](https://github.com/pixie-io/pixie/releases/tag/release%2Fvizier%2Fv0.14.14-pre-r1.0),
I realized that the fix from #2069 wasn't applied properly. The
`securityContext` block was indented too far and results in an invalid
manifest.

Relevant Issues: N/A

Type of change: /kind bug

Test Plan: Verified the following
- [x] Existing yamls produce error if applied as is
```
$ kubectl -n pl apply -k k8s/vizier_deps/base/etcd
The request is invalid: patch: Invalid value: "map[metadata:map[annotations:map[kubectl.kubernetes.io/last-applied-configuration:{\"apiVersion\":\"apps/v1\",\"kind\":\"StatefulSet\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\"},\"name\":\"pl-etcd\",\"namespace\":\"pl\"},\"spec\":{\"podManagementPolicy\":\"Parallel\",\"replicas\":3,\"selector\":{\"matchLabels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\"}},\"serviceName\":\"pl-etcd\",\"template\":{\"metadata\":{\"labels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\",\"plane\":\"control\"},\"name\":\"pl-etcd\"},\"spec\":{\"containers\":[{\"env\":[{\"name\":\"INITIAL_CLUSTER_SIZE\",\"value\":\"3\"},{\"name\":\"CLUSTER_NAME\",\"value\":\"pl-etcd\"},{\"name\":\"ETCDCTL_API\",\"value\":\"3\"},{\"name\":\"POD_NAMESPACE\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"metadata.namespace\"}}},{\"name\":\"DATA_DIR\",\"value\":\"/var/run/etcd\"},{\"name\":\"ETCD_AUTO_COMPACTION_RETENTION\",\"value\":\"5\"},{\"name\":\"ETCD_AUTO_COMPACTION_MODE\",\"value\":\"revision\"}],\"image\":\"gcr.io/pixie-oss/pixie-dev-public/etcd:3.5.9@sha256:e18afc6dda592b426834342393c4c4bd076cb46fa7e10fa7818952cae3047ca9\",\"lifecycle\":{\"preStop\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/prestop.sh\"]}}},\"livenessProbe\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/healthcheck.sh\"]},\"failureThreshold\":5,\"initialDelaySeconds\":60,\"periodSeconds\":10,\"securityContext\":{\"capabilities\":{\"add\":[\"NET_RAW\"]},\"seccompProfile\":{\"type\":\"RuntimeDefault\"}},\"successThreshold\":1,\"timeoutSeconds\":5},\"name\":\"etcd\",\"ports\":[{\"containerPort\":2379,\"name\":\"client\"},{\"containerPort\":2380,\"name\":\"server\"}],\"readinessProbe\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/healthcheck.sh\"]},\"failureThreshold\":3,\"initialDelaySeconds\":1,\"periodSeconds\":5,\"successThreshold\":1,\"timeoutSeconds\":5},\"volumeMounts\":[{\"mountPath\":\"/var/run/etcd\",\"name\":\"etcd-data\"},{\"mountPath\":\"/etc/etcdtls/member/peer-tls\",\"name\":\"member-peer-tls\"},{\"mountPath\":\"/etc/etcdtls/member/server-tls\",\"name\":\"member-server-tls\"},{\"mountPath\":\"/etc/etcdtls/client/etcd-tls\",\"name\":\"etcd-client-tls\"}]}],\"securityContext\":{\"seccompProfile\":{\"type\":\"RuntimeDefault\"}},\"tolerations\":[{\"effect\":\"NoSchedule\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"amd64\"},{\"effect\":\"NoExecute\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"amd64\"},{\"effect\":\"NoSchedule\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"arm64\"},{\"effect\":\"NoExecute\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"arm64\"}],\"volumes\":[{\"name\":\"member-peer-tls\",\"secret\":{\"secretName\":\"etcd-peer-tls-certs\"}},{\"name\":\"member-server-tls\",\"secret\":{\"secretName\":\"etcd-server-tls-certs\"}},{\"name\":\"etcd-client-tls\",\"secret\":{\"secretName\":\"etcd-client-tls-certs\"}},{\"emptyDir\":{},\"name\":\"etcd-data\"}]}}}}\n]] spec:map[template:map[spec:map[tolerations:[map[effect:NoSchedule key:kubernetes.io/arch operator:Equal value:amd64] map[effect:NoExecute key:kubernetes.io/arch operator:Equal value:amd64] map[effect:NoSchedule key:kubernetes.io/arch operator:Equal value:arm64] map[effect:NoExecute key:kubernetes.io/arch operator:Equal value:arm64]]]]]]": strict decoding error: unknown field "spec.template.spec.containers[0].livenessProbe.securityContext"
```
- [x] Manifests from this PR deploy etcd properly
```
$ git diff
diff --git a/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml b/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
index 01d6a6c..0f4452c 100644
--- a/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
+++ b/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
@@ -106,12 +106,12 @@ spec:
           periodSeconds: 10
           successThreshold: 1
           timeoutSeconds: 5
-          securityContext:
-            capabilities:
-              add:
-              - NET_RAW
-            seccompProfile:
-              type: RuntimeDefault
+        securityContext:
+          capabilities:
+            add:
+            - NET_RAW
+          seccompProfile:
+            type: RuntimeDefault
         volumeMounts:
         - mountPath: /var/run/etcd
           name: etcd-data

$ kubectl -n pl apply -k k8s/vizier_deps/base/etcd
service/pl-etcd unchanged
service/pl-etcd-client unchanged
Warning: resource statefulsets/pl-etcd is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
statefulset.apps/pl-etcd configured
poddisruptionbudget.policy/pl-etcd-pdb configured
```

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
ddelnano added a commit to ddelnano/pixie that referenced this pull request Aug 6, 2025
pixie-io#2069)

Summary: Ensure etcd stateful set has required capabilities to run on
OpenShift

When using the etcd metadata store on an Openshift cluster, the
container gets stuck in its start up script and continuously prints the
following error.
```
Waiting for pl-etcd-1.pl-etcd.pl to come up
ping: permission denied (are you root?)

Waiting for pl-etcd-1.pl-etcd.pl to come up
ping: permission denied (are you root?)
```

The etcd stateful set requires an additional capability, which was
missed when the other services had stricter security context settings
added. This change also requires the following
`SecurityContextConstraints` changes
(pixie-io/docs.px.dev#292)

Relevant Issues: N/A

Type of change: /kind bug

Test Plan: Deployed the non-operator version of Pixie to an Openshift
cluster and verified etcd is scheduled now
- [x] Verified etcd metadata deployment with these changes works on GKE
cluster

Changelog Message: Fixed an issue where the etcd metadata store wouldn't
schedule on Openshift clusters

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
GitOrigin-RevId: 73c8340
ddelnano added a commit to ddelnano/pixie that referenced this pull request Aug 6, 2025
…ixie-io#2070)

Summary: Ensure securityContext key is inside StatefulSet's containers
block

While testing the [0.14.14-pre-r1.0 Vizier
release](https://github.com/pixie-io/pixie/releases/tag/release%2Fvizier%2Fv0.14.14-pre-r1.0),
I realized that the fix from pixie-io#2069 wasn't applied properly. The
`securityContext` block was indented too far and results in an invalid
manifest.

Relevant Issues: N/A

Type of change: /kind bug

Test Plan: Verified the following
- [x] Existing yamls produce error if applied as is
```
$ kubectl -n pl apply -k k8s/vizier_deps/base/etcd
The request is invalid: patch: Invalid value: "map[metadata:map[annotations:map[kubectl.kubernetes.io/last-applied-configuration:{\"apiVersion\":\"apps/v1\",\"kind\":\"StatefulSet\",\"metadata\":{\"annotations\":{},\"labels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\"},\"name\":\"pl-etcd\",\"namespace\":\"pl\"},\"spec\":{\"podManagementPolicy\":\"Parallel\",\"replicas\":3,\"selector\":{\"matchLabels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\"}},\"serviceName\":\"pl-etcd\",\"template\":{\"metadata\":{\"labels\":{\"app\":\"pl-monitoring\",\"etcd_cluster\":\"pl-etcd\",\"plane\":\"control\"},\"name\":\"pl-etcd\"},\"spec\":{\"containers\":[{\"env\":[{\"name\":\"INITIAL_CLUSTER_SIZE\",\"value\":\"3\"},{\"name\":\"CLUSTER_NAME\",\"value\":\"pl-etcd\"},{\"name\":\"ETCDCTL_API\",\"value\":\"3\"},{\"name\":\"POD_NAMESPACE\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"metadata.namespace\"}}},{\"name\":\"DATA_DIR\",\"value\":\"/var/run/etcd\"},{\"name\":\"ETCD_AUTO_COMPACTION_RETENTION\",\"value\":\"5\"},{\"name\":\"ETCD_AUTO_COMPACTION_MODE\",\"value\":\"revision\"}],\"image\":\"gcr.io/pixie-oss/pixie-dev-public/etcd:3.5.9@sha256:e18afc6dda592b426834342393c4c4bd076cb46fa7e10fa7818952cae3047ca9\",\"lifecycle\":{\"preStop\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/prestop.sh\"]}}},\"livenessProbe\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/healthcheck.sh\"]},\"failureThreshold\":5,\"initialDelaySeconds\":60,\"periodSeconds\":10,\"securityContext\":{\"capabilities\":{\"add\":[\"NET_RAW\"]},\"seccompProfile\":{\"type\":\"RuntimeDefault\"}},\"successThreshold\":1,\"timeoutSeconds\":5},\"name\":\"etcd\",\"ports\":[{\"containerPort\":2379,\"name\":\"client\"},{\"containerPort\":2380,\"name\":\"server\"}],\"readinessProbe\":{\"exec\":{\"command\":[\"/etc/etcd/scripts/healthcheck.sh\"]},\"failureThreshold\":3,\"initialDelaySeconds\":1,\"periodSeconds\":5,\"successThreshold\":1,\"timeoutSeconds\":5},\"volumeMounts\":[{\"mountPath\":\"/var/run/etcd\",\"name\":\"etcd-data\"},{\"mountPath\":\"/etc/etcdtls/member/peer-tls\",\"name\":\"member-peer-tls\"},{\"mountPath\":\"/etc/etcdtls/member/server-tls\",\"name\":\"member-server-tls\"},{\"mountPath\":\"/etc/etcdtls/client/etcd-tls\",\"name\":\"etcd-client-tls\"}]}],\"securityContext\":{\"seccompProfile\":{\"type\":\"RuntimeDefault\"}},\"tolerations\":[{\"effect\":\"NoSchedule\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"amd64\"},{\"effect\":\"NoExecute\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"amd64\"},{\"effect\":\"NoSchedule\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"arm64\"},{\"effect\":\"NoExecute\",\"key\":\"kubernetes.io/arch\",\"operator\":\"Equal\",\"value\":\"arm64\"}],\"volumes\":[{\"name\":\"member-peer-tls\",\"secret\":{\"secretName\":\"etcd-peer-tls-certs\"}},{\"name\":\"member-server-tls\",\"secret\":{\"secretName\":\"etcd-server-tls-certs\"}},{\"name\":\"etcd-client-tls\",\"secret\":{\"secretName\":\"etcd-client-tls-certs\"}},{\"emptyDir\":{},\"name\":\"etcd-data\"}]}}}}\n]] spec:map[template:map[spec:map[tolerations:[map[effect:NoSchedule key:kubernetes.io/arch operator:Equal value:amd64] map[effect:NoExecute key:kubernetes.io/arch operator:Equal value:amd64] map[effect:NoSchedule key:kubernetes.io/arch operator:Equal value:arm64] map[effect:NoExecute key:kubernetes.io/arch operator:Equal value:arm64]]]]]]": strict decoding error: unknown field "spec.template.spec.containers[0].livenessProbe.securityContext"
```
- [x] Manifests from this PR deploy etcd properly
```
$ git diff
diff --git a/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml b/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
index 01d6a6c..0f4452c 100644
--- a/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
+++ b/k8s/vizier_deps/base/etcd/etcd_statefulset.yaml
@@ -106,12 +106,12 @@ spec:
           periodSeconds: 10
           successThreshold: 1
           timeoutSeconds: 5
-          securityContext:
-            capabilities:
-              add:
-              - NET_RAW
-            seccompProfile:
-              type: RuntimeDefault
+        securityContext:
+          capabilities:
+            add:
+            - NET_RAW
+          seccompProfile:
+            type: RuntimeDefault
         volumeMounts:
         - mountPath: /var/run/etcd
           name: etcd-data

$ kubectl -n pl apply -k k8s/vizier_deps/base/etcd
service/pl-etcd unchanged
service/pl-etcd-client unchanged
Warning: resource statefulsets/pl-etcd is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by kubectl apply. kubectl apply should only be used on resources created declaratively by either kubectl create --save-config or kubectl apply. The missing annotation will be patched automatically.
statefulset.apps/pl-etcd configured
poddisruptionbudget.policy/pl-etcd-pdb configured
```

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
GitOrigin-RevId: 06e8911
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants