fix(usage): reject namespaced usages of cluster-scoped resources#7253
fix(usage): reject namespaced usages of cluster-scoped resources#7253officialasishkumar wants to merge 4 commits into
Conversation
📝 WalkthroughWalkthroughAdds reconciler-side detection and rejection of namespaced Changes
Sequence Diagram(s)mermaid mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Would you like a short changelog entry or release-note wording for this change? 🚥 Pre-merge checks | ✅ 6✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/controller/protection/usage/reconciler_test.go (1)
281-349: Could we add one companion case forlen(usages) > 0?Nice coverage here. To fully protect this branch’s intent, would you add a case where
FindUsageOfreturns at least one usage and assert thecrossplane.io/in-uselabel is not removed while the namespacedUsageis still rejected? That would prevent regressions in the “remove only when no valid usages require it” behavior.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/protection/usage/reconciler_test.go` around lines 281 - 349, Add a new test case mirroring "RejectNamespacedUsageOfClusterScopedResource" but have FinderFn return a non-empty slice (e.g., []protection.Usage{&protection.InternalUsage{}}) so len(usages) > 0; keep the same WithClientApplicator setup but change MockUpdate (for composed.Unstructured) to assert the inUseLabelKey is still present (fail if it's missing) and keep MockStatusUpdate asserting the Usage is rejected with the same ReconcileError; ensure the test uses the same Reconciler options (WithSelectorResolver, WithFinalizer) and expects the same err value as the original case.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/controller/protection/usage/reconciler_test.go`:
- Around line 281-349: Add a new test case mirroring
"RejectNamespacedUsageOfClusterScopedResource" but have FinderFn return a
non-empty slice (e.g., []protection.Usage{&protection.InternalUsage{}}) so
len(usages) > 0; keep the same WithClientApplicator setup but change MockUpdate
(for composed.Unstructured) to assert the inUseLabelKey is still present (fail
if it's missing) and keep MockStatusUpdate asserting the Usage is rejected with
the same ReconcileError; ensure the test uses the same Reconciler options
(WithSelectorResolver, WithFinalizer) and expects the same err value as the
original case.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 48170176-5516-40a3-ad42-760ec20b5213
📒 Files selected for processing (2)
internal/controller/protection/usage/reconciler.gointernal/controller/protection/usage/reconciler_test.go
|
from the original issue #7249:
I think we should also address the webhook. Can we include that in this PR too, or would you be interested in contributing it in a separate PR, @officialasishkumar? |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
internal/webhook/protection/usage/validating_test.go (1)
345-354: Test execution looks good!The test runner follows the standard table-driven pattern. One small thought: the
DenyBadObjecttest case (lines 89-91) compares the exact error message text including"invalid character 'o' in literal null (expecting 'u')". This could potentially be fragile if the JSON library's error messages change across Go versions, though in practice this tends to be stable. Just something to be aware of if this test ever becomes flaky!🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/webhook/protection/usage/validating_test.go` around lines 345 - 354, The test is fragile because the DenyBadObject case asserts an exact JSON parser error string; update the assertion in the table-driven test that constructs NewValidatingHandler and calls Handle so it checks for the presence of a stable substring or error type instead of matching the full message (e.g., use strings.Contains on got.Error() or compare against a sentinel error) to make the DenyBadObject expectation resilient to Go/JSON error text changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/webhook/protection/usage/validating_test.go`:
- Around line 345-354: The test is fragile because the DenyBadObject case
asserts an exact JSON parser error string; update the assertion in the
table-driven test that constructs NewValidatingHandler and calls Handle so it
checks for the presence of a stable substring or error type instead of matching
the full message (e.g., use strings.Contains on got.Error() or compare against a
sentinel error) to make the DenyBadObject expectation resilient to Go/JSON error
text changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 56cb31e0-555c-4998-9b70-c262f30496e2
📒 Files selected for processing (4)
cluster/webhookconfigurations/usage.yamlinternal/webhook/protection/usage/handler.gointernal/webhook/protection/usage/validating.gointernal/webhook/protection/usage/validating_test.go
✅ Files skipped from review due to trivial changes (1)
- cluster/webhookconfigurations/usage.yaml
|
@phisco added |
Stop namespaced Usage objects from silently pretending to protect cluster-scoped resources. The controller now surfaces a reconcile error, removes a misleading in-use label when no valid usages remain, and adds regression coverage for the unsupported configuration discussed in crossplane#7249. Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
…esources Add a validating admission webhook that rejects namespaced Usage objects referencing cluster-scoped resources at creation and update time. This complements the existing reconciler check by providing faster feedback through admission rejection rather than waiting for the reconciler to surface a condition error. The webhook resolves the target GVK scope via the REST mapper. If the referenced GVK cannot be resolved (e.g. CRD not yet installed), the request is allowed and the reconciler handles it later. Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
4616be5 to
6127fc9
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/controller/protection/usage/reconciler.go`:
- Around line 488-491: The delete branch in Reconcile assumes the current Usage
is included in FindUsageOf results (it removes the crossplane.io/in-use label
when len(usages) < 2), but FindUsageOf does not return invalid namespaced Usages
for cluster-scoped targets; update the delete path to mirror the finder
assumption by either (a) special-casing deletion to treat usages as if the
current Usage may be excluded (i.e., call FindUsageOf and if it returns 1 ensure
that usage is a valid ClusterUsage before removing the label), or (b) extract
the label-retention logic into a shared helper used by both the non-delete and
delete branches so both use the same check; reference the Reconcile method, the
usages variable returned by FindUsageOf, and the crossplane.io/in-use label when
implementing this change.
- Around line 477-515: The unsupported-Usage condition must be set and persisted
even if cleanup (r.resource.FindUsageOf or r.client.Update) fails: ensure you
call status.MarkConditions(xpv2.ReconcileError(errUnsupportedUsage...)) and
attempt r.client.Status().Update(ctx, uu) and r.record.Event(uu,
event.Warning(reasonUnsupportedUsage, err)) before returning for any early
errors in the block that check/clean the inUse label (i.e. inside the
r.resource.FindUsageOf error branch and the r.client.Update error branch).
Concretely, in the error paths for r.resource.FindUsageOf and r.client.Update
(and when r.client.Status().Update returns an error), set the same
ReconcileError condition on uu via status.MarkConditions, call
r.client.Status().Update (log/record any Status update error but do not suppress
the unsupported-usage condition), record the unsupported-usage event, then
return the original cleanup error (wrap/log as currently done) so the actionable
condition is always present for users.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0197eabc-8911-4195-9fd2-e56ae48615a7
📒 Files selected for processing (6)
cluster/webhookconfigurations/usage.yamlinternal/controller/protection/usage/reconciler.gointernal/controller/protection/usage/reconciler_test.gointernal/webhook/protection/usage/handler.gointernal/webhook/protection/usage/validating.gointernal/webhook/protection/usage/validating_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
- internal/webhook/protection/usage/handler.go
- cluster/webhookconfigurations/usage.yaml
- internal/webhook/protection/usage/validating_test.go
- internal/webhook/protection/usage/validating.go
| // Finder does not return this invalid namespaced Usage for a | ||
| // cluster-scoped resource, so zero matches means the label is only | ||
| // misleading and should be removed. | ||
| if len(usages) == 0 { |
There was a problem hiding this comment.
Please mirror this finder assumption in the delete path.
Thanks for calling out that FindUsageOf won't return the invalid namespaced Usage here. The delete branch earlier in Reconcile still removes the label when len(usages) < 2, which assumes the current Usage is part of the result. If one invalid namespaced Usage and one valid ClusterUsage reference the same cluster-scoped object, deleting the invalid one will see len(usages) == 1 and strip crossplane.io/in-use from a still-protected resource. Could we special-case that path too, or share the label-retention check between both branches?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/protection/usage/reconciler.go` around lines 488 - 491,
The delete branch in Reconcile assumes the current Usage is included in
FindUsageOf results (it removes the crossplane.io/in-use label when len(usages)
< 2), but FindUsageOf does not return invalid namespaced Usages for
cluster-scoped targets; update the delete path to mirror the finder assumption
by either (a) special-casing deletion to treat usages as if the current Usage
may be excluded (i.e., call FindUsageOf and if it returns 1 ensure that usage is
a valid ClusterUsage before removing the label), or (b) extract the
label-retention logic into a shared helper used by both the non-delete and
delete branches so both use the same check; reference the Reconcile method, the
usages variable returned by FindUsageOf, and the crossplane.io/in-use label when
implementing this change.
There was a problem hiding this comment.
Not addressing this. The delete path at line 380 (len(usages) < 2) is pre-existing code not modified by this PR. With the new webhook rejecting creation of invalid namespaced Usages and the reconciler also rejecting them, this edge case only applies to pre-existing bad data and is increasingly unlikely. It would be best handled in a separate change if needed.
There was a problem hiding this comment.
@officialasishkumar, that's a fair call — the delete path is pre-existing code outside this PR's scope, and the double layer of prevention (webhook + reconciler rejection) makes the edge case increasingly unlikely for new data. I'll defer to your judgment here.
Would you like me to open a follow-up GitHub issue to track the delete-path label-retention fix (adjusting the len(usages) < 2 check to account for FindUsageOf not returning invalid namespaced Usages) so it isn't lost?
…cluster-scope Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
Added a test case that verifies the in-use label is preserved on the target resource when other valid usages still reference it, while the namespaced Usage is still correctly rejected with a ReconcileError. Signed-off-by: Asish Kumar <officialasishkumar@gmail.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
internal/controller/protection/usage/reconciler_test.go (2)
319-333: Consider usingcmpopts.IgnoreFieldsor consolidating the condition verification.Both test cases have identical status verification logic in their
MockStatusUpdatecallbacks. This duplication could be extracted into a helper function for clarity. However, given that this follows the existing patterns in the file (e.g., similar verbose checks at lines 545-548), I understand if you prefer to keep it consistent with the established style.This is a minor suggestion—the current implementation is correct and readable.
♻️ Optional: Extract status verification helper
func verifyReconcileErrorCondition(t *testing.T, o *v1beta1.Usage, wantErr error) { t.Helper() got := o.Status.GetCondition(xpv2.TypeSynced) want := xpv2.ReconcileError(wantErr) if diff := cmp.Diff(want.Status, got.Status); diff != "" { t.Fatalf("unexpected synced status: %s", diff) } if diff := cmp.Diff(want.Reason, got.Reason); diff != "" { t.Fatalf("unexpected synced reason: %s", diff) } if diff := cmp.Diff(want.Message, got.Message); diff != "" { t.Fatalf("unexpected synced message: %s", diff) } }Then in MockStatusUpdate:
MockStatusUpdate: test.NewMockSubResourceUpdateFn(nil, func(obj client.Object) error { verifyReconcileErrorCondition(t, obj.(*v1beta1.Usage), errors.Errorf("%s: Usage %q in namespace %q references cluster-scoped %s %q; use ClusterUsage instead", errUnsupportedUsage, "protect-cluster", "default", "ClusterThing", "cluster-resource")) return nil }),Also applies to: 386-400
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/protection/usage/reconciler_test.go` around lines 319 - 333, Tests duplicate the detailed condition verification in multiple MockStatusUpdate callbacks; extract that logic into a helper and call it from each callback to remove duplication. Add a helper function (e.g., verifyReconcileErrorCondition(t *testing.T, o *v1beta1.Usage, wantErr error)) that calls o.Status.GetCondition(xpv2.TypeSynced), builds the expected xpv2.ReconcileError(wantErr) and runs the three cmp.Diff checks, then replace the inline checks inside test.NewMockSubResourceUpdateFn callbacks (the MockStatusUpdate blocks) to call this helper with the constructed errors (using errUnsupportedUsage and the same Usage name/namespace/cluster resource strings). This keeps behavior identical while removing duplicated code.
281-349: Resolve MockUpdate inconsistency between the two test cases.The annotation update logic at lines 449-455 in the reconciler is conditional—it only calls
client.Update()if the computed annotation value differs from the current one. Since both test cases pre-set the annotation to"undefined", the reconciler will attempt to update it with the computed value. When the Update call fails (as mocked in the first test), the error is logged but doesn't stop execution, allowing the reconciliation to continue and reach the cluster-scope check.However, there's an inconsistency: the first test's
MockUpdatereturns an error for non-Unstructured objects (lines 309-317), while the second test returnsnil(lines 378-385). Both tests reach the same endpoint (setting the ReconcileError condition), so clarify why this differs between them. Either both should return an error to match the actual behavior, or both should returnnilif the annotation update is expected to succeed. This will improve test clarity and ensure the mocking is intentional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/protection/usage/reconciler_test.go` around lines 281 - 349, The MockUpdate behavior in the two related tests is inconsistent: the first test's test.MockClient.MockUpdate returns an error for non-*composed.Unstructured objects while the second returns nil, but both exercises hit the reconciler's annotation-update path (which only calls client.Update when the computed annotation differs) and then proceed to the same cluster-scope rejection logic; make them consistent by updating the first test's MockUpdate in the RejectNamespacedUsageOfClusterScopedResource case (or the second test) so both MockUpdate handlers either return nil for the expected object types or both simulate the same update error, referencing the test.MockClient.MockUpdate, WithClientApplicator/xpresource.ClientApplicator, and detailsAnnotationKey/*composed.Unstructured handling to locate the mocks to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/controller/protection/usage/reconciler_test.go`:
- Around line 319-333: Tests duplicate the detailed condition verification in
multiple MockStatusUpdate callbacks; extract that logic into a helper and call
it from each callback to remove duplication. Add a helper function (e.g.,
verifyReconcileErrorCondition(t *testing.T, o *v1beta1.Usage, wantErr error))
that calls o.Status.GetCondition(xpv2.TypeSynced), builds the expected
xpv2.ReconcileError(wantErr) and runs the three cmp.Diff checks, then replace
the inline checks inside test.NewMockSubResourceUpdateFn callbacks (the
MockStatusUpdate blocks) to call this helper with the constructed errors (using
errUnsupportedUsage and the same Usage name/namespace/cluster resource strings).
This keeps behavior identical while removing duplicated code.
- Around line 281-349: The MockUpdate behavior in the two related tests is
inconsistent: the first test's test.MockClient.MockUpdate returns an error for
non-*composed.Unstructured objects while the second returns nil, but both
exercises hit the reconciler's annotation-update path (which only calls
client.Update when the computed annotation differs) and then proceed to the same
cluster-scope rejection logic; make them consistent by updating the first test's
MockUpdate in the RejectNamespacedUsageOfClusterScopedResource case (or the
second test) so both MockUpdate handlers either return nil for the expected
object types or both simulate the same update error, referencing the
test.MockClient.MockUpdate, WithClientApplicator/xpresource.ClientApplicator,
and detailsAnnotationKey/*composed.Unstructured handling to locate the mocks to
change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d086abb1-b8ff-40fe-8a4d-d312f9b53853
📒 Files selected for processing (1)
internal/controller/protection/usage/reconciler_test.go
There was a problem hiding this comment.
Pull request overview
This PR addresses a protection gap where a namespaced Usage could appear to protect a cluster-scoped target (via the crossplane.io/in-use label) but still allow deletion because the protection webhook can’t find that Usage. The change enforces that this unsupported configuration is rejected and avoids leaving misleading “in-use” labels behind.
Changes:
- Add a validating admission webhook to reject namespaced
Usageobjects whosespec.ofrefers to a cluster-scoped GVK. - Update the Usage reconciler to error/reject when it resolves a cluster-scoped target from a namespaced
Usage, and to remove a misleadingcrossplane.io/in-uselabel when appropriate. - Add unit tests for both the new validating webhook and the controller-side rejection / label removal behavior.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/webhook/protection/usage/validating.go | New validating webhook handler that rejects namespaced Usage referencing cluster-scoped GVKs. |
| internal/webhook/protection/usage/validating_test.go | Unit tests covering allow/deny behavior of the new validating handler. |
| internal/webhook/protection/usage/handler.go | Registers the new validating webhook endpoint with the manager. |
| internal/controller/protection/usage/reconciler.go | Controller-side guardrail: reject unsupported namespaced Usage → cluster-scoped target, and remove misleading in-use label when no valid usages exist. |
| internal/controller/protection/usage/reconciler_test.go | Tests for rejecting unsupported configuration and correct label removal/retention behavior. |
| cluster/webhookconfigurations/usage.yaml | Deploys an additional ValidatingWebhookConfiguration for the new validation endpoint. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| err := errors.Errorf("%s: Usage %q in namespace %q references cluster-scoped %s %q; use ClusterUsage instead", errUnsupportedUsage, uu.GetName(), uu.GetNamespace(), used.GetKind(), used.GetName()) | ||
| status.MarkConditions(xpv2.ReconcileError(err)) | ||
| _ = r.client.Status().Update(ctx, uu) | ||
| r.record.Event(uu, event.Warning(reasonUnsupportedUsage, err)) | ||
|
|
||
| return reconcile.Result{}, err |
There was a problem hiding this comment.
The reconcile error returned here omits the target resource APIVersion/GVK (it only includes kind and name). Including used.GetAPIVersion() (and ideally kind/name) would make the error more actionable, especially when the same Kind can exist across multiple API groups/versions.
| func SetupWebhookWithManager(mgr ctrl.Manager, f Finder, options controller.Options) { | ||
| h := NewHandler(xpunstructured.NewClient(mgr.GetClient()), f, WithLogger(options.Logger.WithValues("webhook", "no-usages"))) | ||
| mgr.GetWebhookServer().Register("/validate-no-usages", &webhook.Admission{Handler: h}) | ||
|
|
||
| vh := NewValidatingHandler(mgr.GetRESTMapper(), WithValidatingLogger(options.Logger.WithValues("webhook", "validate-usage-scope"))) | ||
| mgr.GetWebhookServer().Register("/validate-usage-scope", &webhook.Admission{Handler: vh}) |
There was a problem hiding this comment.
This change introduces a new validating admission webhook endpoint (/validate-usage-scope) in addition to the controller-side rejection described in the PR summary. Because the configuration uses failurePolicy: Fail, any webhook outage/misconfiguration will block CREATE/UPDATE of Usage objects; please call this out explicitly in the PR description (and/or justify the chosen failure policy) so operators understand the operational impact.
Description of your changes
Namespaced
Usageobjects can currently resolve cluster-scopedspec.oftargets, add thecrossplane.io/in-uselabel, and still fail to block deletion because the webhook lookup cannot find those usages. That leaves the target looking protected when it is not.This PR rejects that unsupported configuration in the usage controller. When a namespaced
Usageresolves a cluster-scoped target it now:Usagewith a reconcile errorcrossplane.io/in-uselabel if no valid usages still require itClusterUsageinsteadFixes #7249
I have:
./nix.sh flake checkto ensure this PR is ready for review.Added or updated e2e tests.(unit coverage exercises the controller behavior directly)Linked a PR or a docs tracking issue to document this change.(bug fix for unsupported existing behavior)Added(not applicable for this fix)backport release-x.ylabels to auto-backport this PR.Followed the API promotion workflow if this PR introduces, removes, or promotes an API.(no API changes)Need help with this checklist? See the cheat sheet.