[WIP] DON'T REVIEW Jv run perf scale using comment get diagnostic bundle #19907

Draft
JoukoVirtanen wants to merge 16 commits into master from
jv-run-perf-scale-using-comment-get-diagnostic-bundle

Conversation

@JoukoVirtanen
Contributor

@JoukoVirtanen JoukoVirtanen commented Apr 8, 2026

Description

Mostly for testing

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

change me!

@JoukoVirtanen JoukoVirtanen added the do-not-merge A change which is not meant to be merged label Apr 8, 2026
@openshift-ci

openshift-ci bot commented Apr 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@github-actions
Contributor

github-actions bot commented Apr 8, 2026

🚀 Build Images Ready

Images are ready for commit a071937. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.11.x-576-ga071937036

@github-actions
Contributor

github-actions bot commented Apr 8, 2026

🚀 Build Images Ready

Images are ready for commit e2a34b2. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.11.x-566-ge2a34b221e

@codecov

codecov bot commented Apr 8, 2026

Codecov Report

❌ Patch coverage is 52.88889% with 106 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.58%. Comparing base (2d5d7a2) to head (a071937).
⚠️ Report is 67 commits behind head on master.
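(These figures imply roughly 225 coverable changed lines in the patch: 119 hit and 106 missed, since 119/225 ≈ 52.889%.)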

| Files with missing lines | Patch % | Lines |
|---|---|---|
| central/graphql/resolvers/generated.go | 13.04% | 40 Missing ⚠️ |
| operator/api/v1alpha1/zz_generated.deepcopy.go | 0.00% | 29 Missing ⚠️ |
| central/cluster/datastore/datastore_impl.go | 60.71% | 9 Missing and 2 partials ⚠️ |
| ...l/securedcluster/values/translation/translation.go | 65.00% | 5 Missing and 2 partials ⚠️ |
| central/processindicator/datastore/metrics.go | 86.36% | 6 Missing ⚠️ |
| central/detection/lifecycle/manager_impl.go | 55.55% | 2 Missing and 2 partials ⚠️ |
| ...ntral/processindicator/datastore/datastore_impl.go | 82.35% | 2 Missing and 1 partial ⚠️ |
| pkg/cluster/filtering.go | 85.71% | 1 Missing and 2 partials ⚠️ |
| operator/api/v1alpha1/securedcluster_types.go | 0.00% | 2 Missing ⚠️ |
| sensor/common/sensor/central_communication_impl.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19907      +/-   ##
==========================================
- Coverage   49.60%   49.58%   -0.03%     
==========================================
  Files        2763     2765       +2     
  Lines      208339   208563     +224     
==========================================
+ Hits       103341   103406      +65     
- Misses      97331    97490     +159     
  Partials     7667     7667              
| Flag | Coverage Δ |
|---|---|
| go-unit-tests | 49.58% <52.88%> (-0.03%) ⬇️ |

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


8c24cc0 X-Smart-Branch-Parent: master
fa2b9d3 Added metrics for histogram of argument length, total bytes of arguments stored, and number of process indicators inserted
8a3b701 Removed a redundant metric
7057af7 Removed recordProcessIndicatorAdded
06dcc77 Added a metric for the number of characters in arguments
df35517 Removed processIndicatorsAddedCounter
2d9b9c0 Removed process_args_size_bytes metric
9c5e5bc Renamed process_args_size_chars to process_upserted_args_size
2e2e1bf The metric takes in the cluster id as a parameter
99b09d1 Process arguments lengths are now specified by cluster and namespace
3ba4841 There is just one histogram for central. Argument lengths are still broken down by cluster and namespace
ac515c4 Update central/processindicator/datastore/metrics.go
0653d26 Added comment explaining usage of RuneCountInString
32f80fa Added processUpsertedCount by cluster id and namespace
48eeb8a X-Smart-Branch-Parent: jv-ROX-32873-metrics-for-process-arguments
61e8bdb Added a metric for lineage info
d9c13c2 There is just one histogram for central. Process lineage lengths are still broken down by cluster and namespace
c386499 Added word upserted to metric description
998eb1c X-Smart-Branch-Parent: jv-ROX-32873-metrics-for-process-arguments
2500d11 Added metrics for different types of process indicator pruning, total pruned, and net processes
7669364 Fixed unit test
4ac0e98 Not doing two queries to delete process indicators
79f5280 Removed metric for net process indicators
30ca3db Fixed style
a527a8b Fixed error from rebase
d1c57c7 Merge branch 'jv-ROX-32873-metrics-for-process-arguments' into jv-all-process-indicator-metrics
ee228f9 Merge branch 'jv-ROX-33352-add-histogram-of-lineage-size' into jv-all-process-indicator-metrics
fba1716 Merge jv-ROX-33267-metrics-to-keep-track-of-pruned-process-indicators
882f345 Added buckets with smaller sizes
91b207c Changed process_indicators_lineage_size to process_upserted_lineage_size
0a22fc8 Added buckets with smaller sizes
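Several commits above concern measuring argument length in characters (e.g., "Added comment explaining usage of RuneCountInString"). As a standalone illustration, not code from this PR: `utf8.RuneCountInString` counts Unicode code points while `len` counts bytes, and the two diverge for non-ASCII arguments.

```golang
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	arg := "--name=Ω" // "Ω" is a single character but two UTF-8 bytes
	fmt.Println(len(arg))                    // 9 (bytes)
	fmt.Println(utf8.RuneCountInString(arg)) // 8 (characters)
}
```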
erthalion and others added 4 commits April 10, 2026 15:14
Allow configuring per-namespace persistence for process indicators, so
that Central doesn't need to store information that will never be used.

It can be configured via the DynamicConfig of the cluster configuration in
the form:

```
message RuntimeDataControl {
  string namespace_filter = 1;
  bool exclude_openshift = 2;
  bool persistence = 3;
}
```

Here `namespace_filter` allows specifying a custom regex to filter out
processes by matching namespace, `exclude_openshift` instructs Central to
exclude anything from `openshift-*` namespaces, and `persistence` can be used to
disable storing process indicators entirely.
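For illustration, a minimal standalone sketch of how such a configuration could be evaluated, assuming the regex is applied to an indicator's namespace with "exclude on match" semantics (the Go type and the `shouldDrop` helper are placeholders for this sketch, not this PR's code):

```golang
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// runtimeDataControl mirrors the proto message above; the Go field names
// here are assumptions for this sketch.
type runtimeDataControl struct {
	namespaceFilter  string
	excludeOpenshift bool
	persistence      bool
}

// shouldDrop reports whether an indicator from the given namespace would be
// discarded under this configuration.
func shouldDrop(cfg runtimeDataControl, namespace string) (bool, error) {
	if !cfg.persistence {
		return true, nil // persistence disabled: store nothing
	}
	if cfg.excludeOpenshift && strings.HasPrefix(namespace, "openshift-") {
		return true, nil
	}
	if cfg.namespaceFilter == "" {
		return false, nil // no filter configured: keep everything
	}
	re, err := regexp.Compile(cfg.namespaceFilter)
	if err != nil {
		return false, err
	}
	return re.MatchString(namespace), nil // exclude on match
}

func main() {
	cfg := runtimeDataControl{namespaceFilter: "test-.*", persistence: true}
	drop, _ := shouldDrop(cfg, "test-namespace")
	fmt.Println(drop) // true: the matching namespace is filtered out
}
```

In this sketch the zero value of `persistence` drops everything; how the real code defaults when RuntimeDataControl is unset is the kind of question the review comments below raise.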
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 8 issues, and left some high-level feedback:

  • GetNamespaceFilter assumes DynamicConfig/HelmConfig/RuntimeDataControl are always non-nil; consider adding nil checks or early returns to avoid potential panics when these fields are unset or partially configured.
  • The new process metrics (especially processUpserted* with cluster/namespace labels) can introduce high-cardinality Prometheus series; you may want to bound the number of distinct label values or gate these metrics behind a feature flag/config option (see the sketch after this list).
  • Several new log messages (e.g., in datastoreImpl.buildCache, MatchProcessIndicator, sensor service) use Infof and could be quite noisy in large clusters; consider downgrading some of these to debug level or adding rate limiting once debugging is complete.
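
As a hedged illustration of the high-cardinality point above (the metric name, buckets, and the `boundedNamespace` helper are placeholders, not this PR's code), one way to cap the number of distinct namespace label values with the Prometheus Go client:

```golang
package metrics

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// processUpsertedArgsSize is modeled loosely on the histogram named in the
// commit list; the exact name, help text, and buckets are assumptions.
var processUpsertedArgsSize = prometheus.NewHistogramVec(prometheus.HistogramOpts{
	Name:    "process_upserted_args_size",
	Help:    "Length of upserted process arguments, by cluster and namespace.",
	Buckets: prometheus.ExponentialBuckets(1, 4, 8),
}, []string{"cluster", "namespace"})

func init() {
	prometheus.MustRegister(processUpsertedArgsSize)
}

const maxNamespaceLabels = 100

var (
	seenNamespacesMu sync.Mutex
	seenNamespaces   = make(map[string]struct{})
)

// boundedNamespace returns the namespace itself until a fixed budget of
// distinct values is exhausted, then collapses new namespaces into "other"
// so the label cardinality stays bounded.
func boundedNamespace(ns string) string {
	seenNamespacesMu.Lock()
	defer seenNamespacesMu.Unlock()
	if _, ok := seenNamespaces[ns]; ok {
		return ns
	}
	if len(seenNamespaces) >= maxNamespaceLabels {
		return "other"
	}
	seenNamespaces[ns] = struct{}{}
	return ns
}

// ObserveArgsSize records one argument-length observation with a bounded
// namespace label.
func ObserveArgsSize(cluster, namespace string, argLen int) {
	processUpsertedArgsSize.WithLabelValues(cluster, boundedNamespace(namespace)).Observe(float64(argLen))
}
```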
## Individual Comments

### Comment 1
<location path="central/cluster/datastore/datastore_impl.go" line_range="184-186" />
<code_context>

 	for _, c := range clusters {
 		ds.idToNameCache.Add(c.GetId(), c.GetName())
+		namespaceFilter := clusterPkg.GetNamespaceFilter(c)
+		log.Infof("Setting namespace filter for cluster %s (%s): %q", c.GetName(), c.GetId(), namespaceFilter)
+		ds.idToRegexCache.Add(c.GetId(),
+			regexp.MustCompile(namespaceFilter))
 		ds.nameToIDCache.Add(c.GetName(), c.GetId())
</code_context>
<issue_to_address>
**issue (bug_risk):** Consider what regex is stored when no runtime data control is configured to avoid over-filtering

`GetNamespaceFilter`’s return value is compiled and cached here. If it returns an empty string, `regexp.MustCompile("")` matches all namespaces, and with `MatchProcessIndicator`’s “exclude on match” behavior this would suppress all indicators when runtime data control is unset/defaulted.

This logic therefore depends on `GetNamespaceFilter` returning a “match nothing” pattern (or similar) when no filtering is desired; otherwise the cache causes unintended global suppression tied to the default behavior of `GetNamespaceFilter`.
</issue_to_address>
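
A standalone Go demonstration of the pitfall (not PR code); the `[^\s\S]` character class is one conventional "match nothing" pattern, since no character is both non-whitespace and non-non-whitespace:

```golang
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// An empty pattern matches every string, so caching it as the
	// "no filter configured" value would, under exclude-on-match
	// semantics, suppress indicators from all namespaces.
	matchAll := regexp.MustCompile("")
	fmt.Println(matchAll.MatchString("kube-system")) // true

	// A pattern that cannot match anything is a safer "no filtering"
	// default: the character class below excludes every character.
	matchNothing := regexp.MustCompile(`[^\s\S]`)
	fmt.Println(matchNothing.MatchString("kube-system")) // false
	fmt.Println(matchNothing.MatchString(""))            // false
}
```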

### Comment 2
<location path="central/detection/lifecycle/manager_impl.go" line_range="257-262" />
<code_context>
 		}
-		indicatorSlice = append(indicatorSlice, indicator)
+
+		match, err := m.clusterDataStore.MatchProcessIndicator(lifecycleMgrCtx, indicator)
+		if err != nil {
+			log.Errorf("Cannot match indicator %+v: %v", indicator, err)
+		} else if !match {
+			indicatorSlice = append(indicatorSlice, indicator)
+		} else {
+			log.Infof("Process Indicator doesn't match %+v", indicator)
+		}
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Reduce verbose logging of full process indicators in the hot path

This branch logs the full `indicator` struct at info level when filtered out. Given the high volume of process indicators, this can both inflate log size and leak sensitive runtime details. Please reduce this to debug level and/or log only key fields (e.g., cluster, namespace, deployment, exec path) instead of the full struct.

Suggested implementation:

```golang
		} else {
			log.Debugf("Process indicator filtered out (cluster=%s, namespace=%s, deployment=%s, execPath=%s)",
				indicator.GetClusterId(),
				indicator.GetNamespace(),
				indicator.GetDeploymentId(),
				indicator.GetSignal().GetExecFilePath())
		}

```

You may need to adjust the field accessors used in the debug log depending on the actual `ProcessIndicator` API in this codebase. For example:
- If there is a `GetDeploymentName()` instead of `GetDeploymentId()`, switch to that.
- If namespace or cluster are nested (e.g., `indicator.GetPod().GetNamespace()`), update the call chain accordingly.
- If `GetSignal().GetExecFilePath()` is not available, use the appropriate field that represents the executed path or process name.
</issue_to_address>

### Comment 3
<location path="central/sensor/service/service_impl.go" line_range="275-276" />
<code_context>

 func (s *serviceImpl) getClusterForConnection(sensorHello *central.SensorHello, serviceID *storage.ServiceIdentity) (*storage.Cluster, error) {
 	helmConfigInit := sensorHello.GetHelmManagedConfigInit()
+	log.Infof("HelmConfigInit %+v", helmConfigInit)

 	clusterIDFromCert := serviceID.GetId()
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Avoid logging full Helm-managed config structures at info level

Logging `HelmManagedConfigInit` with `%+v` at info level can unintentionally expose sensitive or environment-specific settings (e.g., registry overrides) in Central logs. Since this runs on the sensor connection hot path, consider removing the log, lowering it to debug, or logging only a minimal, non-sensitive subset of fields.
</issue_to_address>
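
One possible shape for such a minimal log line, sketched with a stand-in struct because the real `HelmManagedConfigInit` getters come from generated proto code (the field names below are assumptions):

```golang
package main

import "log"

// helmManagedConfigInit stands in for the proto message; only clearly
// non-sensitive, identifying fields are logged.
type helmManagedConfigInit struct {
	ClusterName string
	ClusterID   string
	// ...registry overrides and other fields deliberately not logged
}

// logHelmConfigSummary replaces a %+v dump with a short summary.
func logHelmConfigSummary(cfg *helmManagedConfigInit) {
	if cfg == nil {
		return
	}
	log.Printf("helm-managed config init: cluster=%q id=%q", cfg.ClusterName, cfg.ClusterID)
}

func main() {
	logHelmConfigSummary(&helmManagedConfigInit{ClusterName: "prod", ClusterID: "abc-123"})
}
```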

### Comment 4
<location path="sensor/common/sensor/central_communication_impl.go" line_range="172" />
<code_context>
 			log.Infof("Re-using cluster ID %s of previous run. If you see the connection to central failing, re-apply a new Helm configuration via 'helm upgrade', or delete the sensor pod.", cachedClusterID)
 		}

+		log.Infof("Set HelmManagedConfigInit %+v", helmManagedCfg)
 		sensorHello.HelmManagedConfigInit = helmManagedCfg
 	}
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Similarly reduce verbosity/sensitivity of Helm config logging on the sensor side

Dumping the full `helmManagedCfg` at info level can expose unnecessary configuration details and add noise. Consider moving this to debug level or logging only a minimal summary (e.g., an ID or checksum).

```suggestion
		log.Debugf("Setting HelmManagedConfigInit in sensor hello")
```
</issue_to_address>

### Comment 5
<location path="pkg/cluster/filtering_test.go" line_range="25" />
<code_context>
+	},
+}
+
+func TestNamespaceFilter(t *testing.T) {
+	cases := map[string]struct {
+		configureClusterFn func(*storage.Cluster)
</code_context>
<issue_to_address>
**suggestion (testing):** Add test coverage for MANUAL/UNKNOWN-managed clusters and nil runtime data control configuration

`GetNamespaceFilter` behaves differently based on `cluster.ManagedBy` and assumes dynamic/Helm runtime data control, but current tests only cover the Helm-managed path with a non-nil `RuntimeDataControl`. Please add tests that:

- Use a `MANAGER_TYPE_MANUAL` (and possibly `UNKNOWN`) cluster to cover the `cluster.GetDynamicConfig().GetRuntimeDataControl()` path.
- Exercise cases where `DynamicConfig`, `HelmConfig.DynamicConfig`, or `RuntimeDataControl` are nil to confirm the function does not panic and applies the expected default filter.

This will increase confidence that the filtering logic works across all cluster configurations and nil-config scenarios.

Suggested implementation:

```golang
func TestNamespaceFilter(t *testing.T) {
	cases := map[string]struct {
		configureClusterFn func(*storage.Cluster)
		expectedFilter     string
	}{
		"Empty filter configuration": {
			configureClusterFn: func(*storage.Cluster) {},
			expectedFilter:     "",
		},
		"Custom filter configuration": {
			configureClusterFn: func(cluster *storage.Cluster) {
				cluster.HelmConfig.DynamicConfig.RuntimeDataControl.NamespaceFilter = "test-.*"
			},
			expectedFilter: "test-.*",
		},
		"Helm-managed cluster with nil dynamic config": {
			configureClusterFn: func(cluster *storage.Cluster) {
				// Simulate a Helm-managed cluster with no DynamicConfig.
				if cluster.HelmConfig != nil {
					cluster.HelmConfig.DynamicConfig = nil
				}
			},
			// With nil dynamic config, GetNamespaceFilter should not panic and should
			// fall back to the default (no filter).
			expectedFilter: "",
		},
		"Helm-managed cluster with nil runtime data control": {
			configureClusterFn: func(cluster *storage.Cluster) {
				// Simulate a Helm-managed cluster with DynamicConfig but nil RuntimeDataControl.
				if cluster.HelmConfig != nil && cluster.HelmConfig.DynamicConfig != nil {
					cluster.HelmConfig.DynamicConfig.RuntimeDataControl = nil
				}
			},
			expectedFilter: "",
		},
		"Manual-managed cluster with runtime data control filter": {
			configureClusterFn: func(cluster *storage.Cluster) {
				// Switch to MANUAL to exercise the cluster.GetDynamicConfig().GetRuntimeDataControl() path.
				cluster.ManagedBy = storage.ManagerType_MANAGER_TYPE_MANUAL
				cluster.HelmConfig = nil
				cluster.DynamicConfig = &storage.DynamicClusterConfig{
					RuntimeDataControl: &storage.DynamicClusterConfig_RuntimeDataControl{
						NamespaceFilter: "manual-.*",
						Persistence:     true,
					},
				}
			},
			expectedFilter: "manual-.*",
		},
		"Manual-managed cluster with nil dynamic config": {
			configureClusterFn: func(cluster *storage.Cluster) {
				cluster.ManagedBy = storage.ManagerType_MANAGER_TYPE_MANUAL
				cluster.HelmConfig = nil
				// Explicitly nil DynamicConfig to ensure GetNamespaceFilter handles it gracefully.
				cluster.DynamicConfig = nil
			},
			expectedFilter: "",
		},
		"Manual-managed cluster with nil runtime data control": {
			configureClusterFn: func(cluster *storage.Cluster) {
				cluster.ManagedBy = storage.ManagerType_MANAGER_TYPE_MANUAL
				cluster.HelmConfig = nil
				cluster.DynamicConfig = &storage.DynamicClusterConfig{
					// RuntimeDataControl is intentionally left nil.
					RuntimeDataControl: nil,
				}
			},
			expectedFilter: "",
		},
		"Unknown-managed cluster with runtime data control filter": {
			configureClusterFn: func(cluster *storage.Cluster) {
				// UNKNOWN should also go through the dynamic config path.
				cluster.ManagedBy = storage.ManagerType_MANAGER_TYPE_UNKNOWN
				cluster.HelmConfig = nil
				cluster.DynamicConfig = &storage.DynamicClusterConfig{
					RuntimeDataControl: &storage.DynamicClusterConfig_RuntimeDataControl{
						NamespaceFilter: "unknown-.*",
						Persistence:     true,
					},
				}
			},
			expectedFilter: "unknown-.*",
		},
		"Unknown-managed cluster with nil dynamic config": {
			configureClusterFn: func(cluster *storage.Cluster) {
				cluster.ManagedBy = storage.ManagerType_MANAGER_TYPE_UNKNOWN
				cluster.HelmConfig = nil
				cluster.DynamicConfig = nil
			},
			expectedFilter: "",
		},

```

1. The replacement assumes the existing `"Custom filter configuration"` case includes an `expectedFilter: "test-.*",` field as shown. If your current code differs (for example, if `expectedFilter` is declared on a different line or is missing), adjust the SEARCH pattern to match your actual block and keep the same set of test cases in the REPLACE section.
2. This change reuses whatever test harness you have below the `cases` map (likely a `for name, tc := range cases` loop that creates a base cluster and calls `GetNamespaceFilter`). No changes to that loop should be necessary as long as it:
   - Starts from a Helm-managed base cluster similar to the one in your snippet.
   - Applies `configureClusterFn` to a fresh `*storage.Cluster` per test case before calling `GetNamespaceFilter`.
3. If your base test cluster is not Helm-managed by default or is not initialized with `HelmConfig.DynamicConfig.RuntimeDataControl`, you may want to ensure the base cluster reflects the initial Helm-managed configuration you expect, so the new Helm-nil tests are meaningful.
</issue_to_address>

### Comment 6
<location path="operator/internal/securedcluster/values/translation/translation_test.go" line_range="1044" />
<code_context>
 				},
 			},
 		},
+		"runtime data control persistence enabled": {
+			args: args{
+				client: newDefaultFakeClient(t),
</code_context>
<issue_to_address>
**suggestion (testing):** Extend runtimeDataControl translation tests to cover excludeOpenshift and namespaceFilter

The new case under `"runtime data control persistence enabled"` only exercises the `persistence` field. Since `RuntimeDataControlSpec` adds three fields and `getRuntimeDataControlValues` branches on each, please add at least:

- A case with `ExcludeOpenshift` = `RuntimeConfigEnabled`, asserting `runtimeDataControl.excludeOpenshift` is `true` in the Helm values.
- A case with a non-empty `NamespaceFilter`, asserting `runtimeDataControl.namespaceFilter` is rendered as expected.

This ensures the CRD → Helm values translation is covered for all new fields, not just persistence.

Suggested implementation:

```golang
		"runtime data control persistence enabled": {
			args: args{
				client: newDefaultFakeClient(t),
				sc: platform.SecuredCluster{
					ObjectMeta: metav1.ObjectMeta{Namespace: "stackrox"},
					Spec: platform.SecuredClusterSpec{
						ClusterName: ptr.To("test-cluster"),
						RuntimeDataControl: &platform.RuntimeDataControlSpec{
							Persistence: platform.RuntimeConfigEnabled.Pointer(),
						},
					},
				},
			},
		},
		"runtime data control exclude openshift enabled": {
			args: args{
				client: newDefaultFakeClient(t),
				sc: platform.SecuredCluster{
					ObjectMeta: metav1.ObjectMeta{Namespace: "stackrox"},
					Spec: platform.SecuredClusterSpec{
						ClusterName: ptr.To("test-cluster"),
						RuntimeDataControl: &platform.RuntimeDataControlSpec{
							ExcludeOpenshift: platform.RuntimeConfigEnabled.Pointer(),
						},
					},
				},
			},
		},
		"runtime data control namespace filter set": {
			args: args{
				client: newDefaultFakeClient(t),
				sc: platform.SecuredCluster{
					ObjectMeta: metav1.ObjectMeta{Namespace: "stackrox"},
					Spec: platform.SecuredClusterSpec{
						ClusterName: ptr.To("test-cluster"),
						RuntimeDataControl: &platform.RuntimeDataControlSpec{
							// Use a non-empty namespace filter to ensure it is rendered into Helm values.
							NamespaceFilter: &platform.NamespaceFilterSpec{
								Namespaces: []string{"team-a", "team-b"},
							},
						},
					},
				},
			},

```

I only see the table entry definition, not the full test harness or the exact types in `platform.RuntimeDataControlSpec` and related namespace filter structs. To complete the implementation and satisfy your comment, you should:

1. Adjust the `ExcludeOpenshift` field assignment if its type differs from `platform.RuntimeConfigEnabled.Pointer()` (e.g., if it uses a different enum or a plain bool).
2. Replace `platform.NamespaceFilterSpec` and its `Namespaces` field with the actual type and field names used by `RuntimeDataControlSpec.NamespaceFilter`. The key requirement is that the filter is non-empty so the Helm translation code exercises its branch.
3. In the assertion section of the test (where the Helm values are inspected), add expectations for these new cases:
   - For `"runtime data control exclude openshift enabled"`, assert that `runtimeDataControl.excludeOpenshift` is `true` in the rendered values.
   - For `"runtime data control namespace filter set"`, assert that `runtimeDataControl.namespaceFilter` matches the structure you expect (e.g., list of namespaces, mode, etc.).
4. If the table-driven test has an `expectedValues` or similar field instead of per-case assertions, add the appropriate expected `runtimeDataControl` sub-map to each new case instead.
</issue_to_address>

### Comment 7
<location path="central/detection/lifecycle/manager_impl_test.go" line_range="347-356" />
<code_context>
+func (suite *ManagerTestSuite) TestFlushIndicators() {
+	_, indicator1 := makeIndicator()
+	_, indicator2 := makeIndicator()
+
+	// Make first indicator to match and be filtered out
+	suite.cluster.EXPECT().MatchProcessIndicator(gomock.Any(), indicator1).
+		Return(true, nil)
+
+	// The second indicator should pass through
+	suite.cluster.EXPECT().MatchProcessIndicator(gomock.Any(), indicator2).
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for error and edge paths in flushIndicatorQueue (MatchProcessIndicator errors and empty queue)

Right now this only validates the happy path. Please also add tests for:

- `MatchProcessIndicator` returning an error (e.g., mock returning `(false, err)`) to document whether that indicator is dropped or still persisted.
- An empty `queuedIndicators` map (or all indicators filtered out) to confirm no `AddProcessIndicators` calls are made and no panics occur.

These will better cover the new filtering behavior in the lifecycle manager.

Suggested implementation:

```golang
func (suite *ManagerTestSuite) TestFlushIndicators() {
	deploymentID1, indicator1 := makeIndicator()
	deploymentID2, indicator2 := makeIndicator()

	// Queue both indicators.
	suite.manager.queuedIndicators[deploymentID1] = indicator1
	suite.manager.queuedIndicators[deploymentID2] = indicator2

	// Make first indicator to match and be filtered out.
	suite.cluster.EXPECT().MatchProcessIndicator(gomock.Any(), indicator1).
		Return(true, nil)

	// The second indicator should pass through.
	suite.cluster.EXPECT().MatchProcessIndicator(gomock.Any(), indicator2).
		Return(false, nil)

	// Only the second indicator should be persisted.
	suite.indicators.EXPECT().
		AddProcessIndicators(gomock.Any(), []*storage.ProcessIndicator{indicator2}).
		Return(nil)

	suite.manager.flushIndicatorQueue(context.Background())
}

func (suite *ManagerTestSuite) TestFlushIndicators_MatchError() {
	deploymentID, indicator := makeIndicator()

	// Queue the indicator.
	suite.manager.queuedIndicators[deploymentID] = indicator

	// Simulate a matching error for this indicator.
	suite.cluster.EXPECT().
		MatchProcessIndicator(gomock.Any(), indicator).
		Return(false, errors.New("match error"))

	// On error, the indicator should still be persisted so it isn't lost.
	suite.indicators.EXPECT().
		AddProcessIndicators(gomock.Any(), []*storage.ProcessIndicator{indicator}).
		Return(nil)

	suite.manager.flushIndicatorQueue(context.Background())
}

func (suite *ManagerTestSuite) TestFlushIndicators_EmptyQueue() {
	// Start with an empty queue to exercise the edge case.
	suite.manager.queuedIndicators = make(map[string]*storage.ProcessIndicator)

	// No indicators should be written when the queue is empty.
	suite.indicators.EXPECT().
		AddProcessIndicators(gomock.Any(), gomock.Any()).
		Times(0)

	// Should not panic and should not attempt to persist indicators.
	suite.manager.flushIndicatorQueue(context.Background())
}

```

1. Ensure the test file imports the `context` and `errors` packages (if they are not already imported):
   - `import "context"`
   - `import "errors"`
2. If the `queuedIndicators` field on `suite.manager` can be `nil` before usage in some tests, make sure it is initialized in the test suite setup (e.g., in `SetupTest` / `SetupSuite`) to avoid nil map assignments.
3. Adjust the expectation in `TestFlushIndicators_MatchError` if the actual, desired behavior for a `MatchProcessIndicator` error is to *drop* the indicator instead of persisting it:
   - In that case, change the `AddProcessIndicators` expectation to `.Times(0)` and update the test name/comment to reflect dropping-on-error semantics.
</issue_to_address>

### Comment 8
<location path="central/cluster/datastore/datastore_impl_test.go" line_range="833-842" />
<code_context>
+func (s *clusterDataStoreTestSuite) TestProcessMatching() {
+	clusterID := fixtureconsts.Cluster1
+	testCluster := &storage.Cluster{
+		Id:        clusterID,
+		Name:      "test",
+		ManagedBy: storage.ManagerType_MANAGER_TYPE_HELM_CHART,
+		HelmConfig: &storage.CompleteClusterConfig{
+			DynamicConfig: &storage.DynamicClusterConfig{
+				RuntimeDataControl: &storage.DynamicClusterConfig_RuntimeDataControl{
+					NamespaceFilter: "test-.*",
</code_context>
<issue_to_address>
**suggestion (testing):** Consider broadening TestProcessMatching to cover clusters without runtimeDataControl and non-Helm-managed clusters

Currently this only exercises regex matching for a Helm-managed cluster with a non-nil `RuntimeDataControl`. To better cover `MatchProcessIndicator` and namespace filtering, please also add:

- A case where `DynamicConfig`/`RuntimeDataControl` is nil, asserting the expected default behavior (e.g., `false` without error).
- A case for a MANUAL-managed cluster to verify it uses the expected namespace filter source and that the cache path behaves correctly.

This will validate matching behavior across the main cluster management/configuration variants.

Suggested implementation:

```golang
func (s *clusterDataStoreTestSuite) TestProcessMatching() {
	ctx := context.Background()

	helmClusterID := fixtureconsts.Cluster1
	manualClusterID := fixtureconsts.Cluster2

	// Helm-managed cluster with non-nil RuntimeDataControl; Regex should match "test-.*" namespaces.
	helmCluster := &storage.Cluster{
		Id:        helmClusterID,
		Name:      "helm-cluster",
		ManagedBy: storage.ManagerType_MANAGER_TYPE_HELM_CHART,
		HelmConfig: &storage.CompleteClusterConfig{
			DynamicConfig: &storage.DynamicClusterConfig{
				RuntimeDataControl: &storage.DynamicClusterConfig_RuntimeDataControl{
					NamespaceFilter: "test-.*",
					Persistence:     true,
				},
			},
		},
	}

	// Helm-managed cluster with nil DynamicConfig/RuntimeDataControl; should take the default path
	// and return "not matched" without error.
	nilRuntimeControlCluster := &storage.Cluster{
		Id:        uuid.NewV4().String(),
		Name:      "no-runtime-control",
		ManagedBy: storage.ManagerType_MANAGER_TYPE_HELM_CHART,
		HelmConfig: &storage.CompleteClusterConfig{
			// DynamicConfig intentionally nil.
		},
	}

	// MANUAL-managed cluster to exercise non-Helm cache path and namespace filter source.
	// For these tests we rely on the default behavior (no RuntimeDataControl) and verify that
	// we still get a deterministic non-error result.
	manualCluster := &storage.Cluster{
		Id:        manualClusterID,
		Name:      "manual-cluster",
		ManagedBy: storage.ManagerType_MANAGER_TYPE_MANUAL,
		// No HelmConfig – this should force the datastore down the MANUAL-management path.
	}

	// Upsert clusters so MatchProcessIndicator can look them up.
	require.NoError(s.T(), s.datastore.UpsertCluster(ctx, helmCluster))
	require.NoError(s.T(), s.datastore.UpsertCluster(ctx, nilRuntimeControlCluster))
	require.NoError(s.T(), s.datastore.UpsertCluster(ctx, manualCluster))

	// Process indicators to exercise regex matching and default behavior.
	matchingIndicator := &storage.ProcessIndicator{
		ClusterId: helmClusterID,
		Namespace: "test-namespace",
		PodId:     "pod-1",
		Signal: &storage.ProcessSignal{
			ContainerId: "container-1",
			Name:        "nginx",
		},
	}

	nonMatchingIndicator := &storage.ProcessIndicator{
		ClusterId: helmClusterID,
		Namespace: "prod-namespace",
		PodId:     "pod-2",
		Signal: &storage.ProcessSignal{
			ContainerId: "container-2",
			Name:        "nginx",
		},
	}

	nilRuntimeControlIndicator := &storage.ProcessIndicator{
		ClusterId: nilRuntimeControlCluster.Id,
		Namespace: "any-namespace",
		PodId:     "pod-3",
		Signal: &storage.ProcessSignal{
			ContainerId: "container-3",
			Name:        "redis",
		},
	}

	manualClusterIndicator := &storage.ProcessIndicator{
		ClusterId: manualClusterID,
		Namespace: "manual-namespace",
		PodId:     "pod-4",
		Signal: &storage.ProcessSignal{
			ContainerId: "container-4",
			Name:        "busybox",
		},
	}

	s.Run("helm cluster with matching namespace filter", func() {
		matched, err := s.datastore.MatchProcessIndicator(ctx, matchingIndicator)
		require.NoError(s.T(), err)
		assert.True(s.T(), matched, "expected helm cluster regex to match namespace")
	})

	s.Run("helm cluster with non-matching namespace filter", func() {
		matched, err := s.datastore.MatchProcessIndicator(ctx, nonMatchingIndicator)
		require.NoError(s.T(), err)
		assert.False(s.T(), matched, "expected helm cluster regex NOT to match namespace")
	})

	s.Run("helm cluster with nil RuntimeDataControl", func() {
		matched, err := s.datastore.MatchProcessIndicator(ctx, nilRuntimeControlIndicator)
		require.NoError(s.T(), err, "expected default behavior without error when RuntimeDataControl is nil")
		assert.False(s.T(), matched, "expected default behavior to return not-matched when RuntimeDataControl is nil")
	})

	s.Run("manual-managed cluster", func() {
		// First call exercises initial cache population / lookup path.
		matched1, err := s.datastore.MatchProcessIndicator(ctx, manualClusterIndicator)
		require.NoError(s.T(), err)
		// We don't assert a specific match policy here, only that the behavior is consistent and non-erroring.
		// For the current implementation we expect "false" when no RuntimeDataControl is present.
		assert.False(s.T(), matched1, "expected manual-managed cluster to return not-matched by default")

		// Second call should hit any internal cache path, and must behave identically.
		matched2, err := s.datastore.MatchProcessIndicator(ctx, manualClusterIndicator)
		require.NoError(s.T(), err)
		assert.Equal(s.T(), matched1, matched2, "manual-managed cluster matching should be stable across cache hits")
	})

```

The edit assumes the following, which you may need to adjust to your existing code:

1. The test suite already imports:
   - `context` (`context.Background()`),
   - `github.com/stretchr/testify/require`,
   - `github.com/stretchr/testify/assert`,
   - `github.com/stackrox/rox/generated/storage`,
   - `github.com/stackrox/rox/pkg/fixtures/fixtureconsts`,
   - and a UUID helper (`github.com/stackrox/rox/pkg/uuid` or similar) for `uuid.NewV4().String()`.  
   If any of these are missing or named differently, update the imports or replace the UUID generation with any existing helper.

2. The datastore under test exposes:
   `MatchProcessIndicator(ctx context.Context, pi *storage.ProcessIndicator) (bool, error)`  
   and `UpsertCluster(ctx context.Context, c *storage.Cluster) error`.  
   If the signatures differ (e.g., additional parameters, cluster ID instead of looking it up from the indicator), adapt the calls accordingly.

3. If your test suite uses a `sac.WithAllAccess` context instead of `context.Background()`, replace the context creation line with the appropriate helper to maintain consistency with other tests.

4. Ensure that no remaining fragments of the original `TestProcessMatching` function remain below this replacement; the full body of the function should match the replacement shown above, terminated by a single closing brace `}` for the function.
</issue_to_address>

