ROX-33555: Forward Central VM ACK/NACK to compliance#19322
ROX-33555: Forward Central VM ACK/NACK to compliance#19322
Conversation
|
This change is part of the following stack: Change managed by git-spice. |
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Consider whether
toComplianceshould be buffered, sinceforwardToComplianceis called inline fromProcessMessageand an unbuffered channel here could back up Central’s ACK processing if the compliance multiplexer is slow or temporarily blocked. - In
forwardToCompliance, you setHostnameandBroadcastbased solely onresourceID; double-check that using the VM resource ID asHostnameand broadcasting only whenresourceID == ""matches the expectations of the compliance side consumers (if not, it may be worth adding a small helper or comment clarifying the mapping).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider whether `toCompliance` should be buffered, since `forwardToCompliance` is called inline from `ProcessMessage` and an unbuffered channel here could back up Central’s ACK processing if the compliance multiplexer is slow or temporarily blocked.
- In `forwardToCompliance`, you set `Hostname` and `Broadcast` based solely on `resourceID`; double-check that using the VM resource ID as `Hostname` and broadcasting only when `resourceID == ""` matches the expectations of the compliance side consumers (if not, it may be worth adding a small helper or comment clarifying the mapping).
## Individual Comments
### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="167" />
<code_context>
return errStartMoreThanOnce
}
h.indexReports = make(chan *v1.IndexReport, env.VirtualMachinesIndexReportsBufferSize.IntegerSetting())
+ h.toCompliance = make(chan common.MessageToComplianceWithAddress)
h.toCentral = h.run(h.indexReports)
return nil
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider buffering the compliance channel to avoid blocking ACK processing when Compliance is slow or temporarily unavailable.
Because `toCompliance` is unbuffered, `forwardToCompliance` will block until the consumer reads, and `ProcessMessage` is on the hot ACK/NACK path. If Compliance is slow or backpressured, this can stall ACK handling. A small buffer (as used for other internal channels) would mitigate head-of-line blocking while still preserving ordering.
Suggested implementation:
```golang
if h.toCentral != nil || h.indexReports != nil || h.toCompliance != nil {
return errStartMoreThanOnce
}
h.indexReports = make(chan *v1.IndexReport, env.VirtualMachinesIndexReportsBufferSize.IntegerSetting())
h.toCompliance = make(chan common.MessageToComplianceWithAddress, env.VirtualMachinesToComplianceBufferSize.IntegerSetting())
```
1. Define `VirtualMachinesToComplianceBufferSize` in the `env` package (or equivalent config) following the same pattern used for `VirtualMachinesIndexReportsBufferSize` (type, default, description, etc.).
2. If there is an existing common buffer size setting for internal VM handler channels (e.g., a shared constant/env var), you may want to reuse that instead of introducing a new one, for consistency.
3. Ensure any unit tests that construct `handlerImpl` or validate channel behavior are updated to account for the buffered `toCompliance` channel (e.g., tests that might have assumed it was unbuffered).
</issue_to_address>
### Comment 2
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="240-242" />
<code_context>
+ return
+ }
+
+ select {
+ case <-h.stopper.Flow().StopRequested():
+ case h.toCompliance <- common.MessageToComplianceWithAddress{
+ Msg: &sensor.MsgToCompliance{
+ Msg: &sensor.MsgToCompliance_ComplianceAck{
</code_context>
<issue_to_address>
**suggestion:** Consider handling or at least logging when messages are dropped due to sensor shutdown.
Here, when `h.stopper.Flow().StopRequested()` fires, the select exits and the ACK/NACK is silently dropped. If these messages matter for Compliance accounting, this makes shutdown issues hard to trace. Please add at least a debug log or a counter for messages dropped due to shutdown to make this behavior observable.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Images are ready for the commit at c47fcd6. To use with deploy scripts, first |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #19322 +/- ##
==========================================
- Coverage 49.61% 49.57% -0.04%
==========================================
Files 2765 2764 -1
Lines 208541 208390 -151
==========================================
- Hits 103459 103313 -146
- Misses 97405 97426 +21
+ Partials 7677 7651 -26
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0ad5d24 to
c795476
Compare
99d3e6c to
fa6c362
Compare
c795476 to
cbee5a2
Compare
fa6c362 to
54bb9f9
Compare
cbee5a2 to
283b771
Compare
54bb9f9 to
c47fcd6
Compare
283b771 to
a9270b7
Compare
c47fcd6 to
08f7beb
Compare
Sensor's VM index handler now translates Central's SensorACK into ComplianceACK and forwards it to compliance through a dedicated channel, closing the ACK loop so the relay's UMH can track confirmation state. Declares SensorACKSupport capability and registers the handler as a ComplianceComponent on the multiplexer. AI-assisted: code was extracted from another feature branch by AI, with test fix for updated Capabilities. Reviewed and verified by the author.
08f7beb to
80bc69e
Compare
🚀 Build Images ReadyImages are ready for commit b0e1c76. To use with deploy scripts: export MAIN_IMAGE_TAG=4.11.x-611-gb0e1c76f28 |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
1 similar comment
✅ Actions performedReview triggered.
|
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new
toCompliancechannel is unbuffered and used in the ACK hot path; consider adding a small buffer or a clearly non-blocking send strategy so slow compliance handling cannot block processing of Central messages. - In
forwardToCompliance, the VMresourceIDis used both asHostnameand to decideBroadcast; it would be good to double-check that these semantics match how the compliance multiplexer routes messages (e.g., whether VM IDs should be treated as hostnames or if a more explicit addressing field is needed).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new `toCompliance` channel is unbuffered and used in the ACK hot path; consider adding a small buffer or a clearly non-blocking send strategy so slow compliance handling cannot block processing of Central messages.
- In `forwardToCompliance`, the VM `resourceID` is used both as `Hostname` and to decide `Broadcast`; it would be good to double-check that these semantics match how the compliance multiplexer routes messages (e.g., whether VM IDs should be treated as hostnames or if a more explicit addressing field is needed).
## Individual Comments
### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl_test.go" line_range="230-231" />
<code_context>
func (s *virtualMachineHandlerSuite) TestCapabilities() {
caps := s.handler.Capabilities()
- s.Require().Empty(caps)
+ s.Require().Len(caps, 1)
+ s.Contains(caps, centralsensor.SensorACKSupport)
}
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for the new compliance-forwarding behavior and lifecycle methods (forwardToCompliance, ComplianceC, Stopped).
The capabilities test now checks that `SensorACKSupport` is declared, but it doesn’t exercise the new behavior it enables. Please add tests that cover:
1. **forwardToCompliance happy paths**
- After `Start()` (so `toCompliance` is initialized), processing a `central.SensorACK` with `ACK` produces a message on `ComplianceC()` with:
- `Msg.ComplianceAck.Action == sensor.MsgToCompliance_ComplianceACK_ACK`
- `MessageType == VM_INDEX_REPORT`
- `ResourceId` and `Reason` correctly propagated.
- Same for `NACK`, including forwarding `Reason`.
2. **Forwarding edge cases**
- `toCompliance == nil` (no `Start()`): `ProcessMessage`/`forwardToCompliance` should not panic or block.
- Unknown `SensorACK_Action`: should be logged and dropped, with `ComplianceC()` remaining empty.
3. **Stopped()/stopper behavior**
- When `stopper.Flow().Stop()` is triggered, `forwardToCompliance` should not block on a full `toCompliance` channel. You can set up a non-drained channel, call `Stop()`, then invoke `ProcessMessage`.
4. **ComplianceC wiring**
- Verify that `Start()` initializes `ComplianceC()` (non-nil, readable) and that it’s the same channel used internally.
These will demonstrate the ACK/NACK path to compliance and ensure no-Start, unknown actions, and stop conditions are handled safely.
Suggested implementation:
```golang
func (s *virtualMachineHandlerSuite) TestCapabilities() {
caps := s.handler.Capabilities()
s.Require().Len(caps, 1)
s.Contains(caps, centralsensor.SensorACKSupport)
}
// forwardToCompliance happy-path: ACK
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_ACK() {
// Ensure Start wires up the compliance channel
err := s.handler.Start(s.ctx, nil)
s.Require().NoError(err)
defer s.handler.Stop(s.ctx)
complianceC := s.handler.ComplianceC()
s.Require().NotNil(complianceC)
const (
resourceID = "vm-123"
reason = "all good"
)
ack := ¢ral.SensorACK{
Action: central.SensorACK_ACK,
ResourceId: resourceID,
Reason: reason,
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: ack,
},
}
err = s.handler.ProcessMessage(msg)
s.Require().NoError(err)
select {
case toComplianceMsg := <-complianceC:
s.Require().NotNil(toComplianceMsg)
s.Require().NotNil(toComplianceMsg.Msg)
complianceAck := toComplianceMsg.Msg.GetComplianceAck()
s.Require().NotNil(complianceAck)
s.Equal(sensor.MsgToCompliance_ComplianceACK_ACK, complianceAck.Action)
s.Equal(sensor.MessageType_VM_INDEX_REPORT, complianceAck.MessageType)
s.Equal(resourceID, complianceAck.ResourceId)
s.Equal(reason, complianceAck.Reason)
case <-time.After(2 * time.Second):
s.Fail("timed out waiting for compliance ACK message")
}
}
// forwardToCompliance happy-path: NACK
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_NACK() {
err := s.handler.Start(s.ctx, nil)
s.Require().NoError(err)
defer s.handler.Stop(s.ctx)
complianceC := s.handler.ComplianceC()
s.Require().NotNil(complianceC)
const (
resourceID = "vm-456"
reason = "validation failed"
)
nack := ¢ral.SensorACK{
Action: central.SensorACK_NACK,
ResourceId: resourceID,
Reason: reason,
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: nack,
},
}
err = s.handler.ProcessMessage(msg)
s.Require().NoError(err)
select {
case toComplianceMsg := <-complianceC:
s.Require().NotNil(toComplianceMsg)
s.Require().NotNil(toComplianceMsg.Msg)
complianceAck := toComplianceMsg.Msg.GetComplianceAck()
s.Require().NotNil(complianceAck)
s.Equal(sensor.MsgToCompliance_ComplianceACK_NACK, complianceAck.Action)
s.Equal(sensor.MessageType_VM_INDEX_REPORT, complianceAck.MessageType)
s.Equal(resourceID, complianceAck.ResourceId)
s.Equal(reason, complianceAck.Reason)
case <-time.After(2 * time.Second):
s.Fail("timed out waiting for compliance NACK message")
}
}
// Edge case: no Start() called, toCompliance is nil – should not panic or block
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_NoStartDoesNotPanicOrBlock() {
// Do NOT call Start()
ack := ¢ral.SensorACK{
Action: central.SensorACK_ACK,
ResourceId: "vm-no-start",
Reason: "ignored",
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: ack,
},
}
done := make(chan struct{})
go func() {
defer close(done)
_ = s.handler.ProcessMessage(msg)
}()
select {
case <-done:
// ok, returned without panic/block
case <-time.After(2 * time.Second):
s.Fail("ProcessMessage should not block when Start() has not been called")
}
// ComplianceC() may be nil or empty; in either case we must not block.
if ch := s.handler.ComplianceC(); ch != nil {
select {
case <-ch:
s.Fail("no message should be forwarded to compliance when Start() was not called")
default:
}
}
}
// Edge case: unknown SensorACK_Action should be logged and dropped
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_UnknownActionDropped() {
err := s.handler.Start(s.ctx, nil)
s.Require().NoError(err)
defer s.handler.Stop(s.ctx)
complianceC := s.handler.ComplianceC()
s.Require().NotNil(complianceC)
unknownAck := ¢ral.SensorACK{
Action: central.SensorACK_Action(999),
ResourceId: "vm-unknown",
Reason: "unknown",
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: unknownAck,
},
}
err = s.handler.ProcessMessage(msg)
s.Require().NoError(err)
// Channel should remain empty (message dropped)
select {
case <-complianceC:
s.Fail("unexpected message on ComplianceC for unknown SensorACK action")
default:
}
}
// Stopped()/stopper behavior: forwardToCompliance must not block on full channel after Stop
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_DoesNotBlockWhenStopped() {
err := s.handler.Start(s.ctx, nil)
s.Require().NoError(err)
defer s.handler.Stop(s.ctx)
complianceC := s.handler.ComplianceC()
s.Require().NotNil(complianceC)
// Fill the channel to capacity. This assumes a small buffer; if capacity is unknown,
// we conservatively send until a non-blocking send fails.
for i := 0; i < cap(complianceC); i++ {
select {
case complianceC <- &sensor.MsgToCompliance{}:
default:
// Channel is already full.
break
}
}
// Trigger stopper/Stopped behavior
s.handler.Stop(s.ctx)
nack := ¢ral.SensorACK{
Action: central.SensorACK_NACK,
ResourceId: "vm-stopped",
Reason: "after stop",
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: nack,
},
}
done := make(chan struct{})
go func() {
defer close(done)
_ = s.handler.ProcessMessage(msg)
}()
select {
case <-done:
// ok, no blocking on full compliance channel after stop
case <-time.After(2 * time.Second):
s.Fail("ProcessMessage should not block when handler has been stopped, even if ComplianceC is full")
}
}
// ComplianceC wiring: Start initializes non-nil channel used internally
func (s *virtualMachineHandlerSuite) TestStartInitializesComplianceC() {
// Before Start, ComplianceC may be nil or undefined; we only assert post-Start behavior.
err := s.handler.Start(s.ctx, nil)
s.Require().NoError(err)
defer s.handler.Stop(s.ctx)
complianceC := s.handler.ComplianceC()
s.Require().NotNil(complianceC)
ack := ¢ral.SensorACK{
Action: central.SensorACK_ACK,
ResourceId: "vm-wiring",
Reason: "wiring-test",
}
msg := ¢ral.MsgFromSensor{
Msg: ¢ral.MsgFromSensor_SensorAck{
SensorAck: ack,
},
}
err = s.handler.ProcessMessage(msg)
s.Require().NoError(err)
select {
case <-complianceC:
// message observed on same channel returned by ComplianceC()
case <-time.After(2 * time.Second):
s.Fail("expected a message on ComplianceC after processing SensorACK")
}
}
func (s *virtualMachineHandlerSuite) TestAccepts() {
```
To make these tests compile and integrate cleanly, you will also need to:
1. Ensure the following imports exist in `handler_impl_test.go` (adjust paths to match your existing imports):
- `time`
- `github.com/stackrox/rox/generated/internalapi/central`
- `github.com/stackrox/rox/generated/internalapi/sensor`
- If `centralsensor.SensorACKSupport` is defined in a different package, keep the existing import for `centralsensor`.
2. Confirm the handler API:
- `Start` / `Stop` signatures `(ctx context.Context, _ any)` or similar may differ; update the test calls accordingly.
- `ComplianceC()` should return `chan *sensor.MsgToCompliance`; if it returns a receive-only channel, remove the direct sends in `TestForwardToCompliance_DoesNotBlockWhenStopped` and instead pre-fill the internal channel via the handler (or expose capacity via a helper).
- If `Stop` is not a method, but the suite exposes a stopper (`s.stopper.Flow().Stop()`), invoke that instead of `s.handler.Stop(s.ctx)`.
3. Adjust enum names if they differ slightly from these guesses:
- `central.SensorACK_ACK`, `central.SensorACK_NACK`, `central.SensorACK_Action`
- `sensor.MsgToCompliance_ComplianceACK_ACK`, `sensor.MsgToCompliance_ComplianceACK_NACK`
- `sensor.MessageType_VM_INDEX_REPORT`
4. If the suite uses a different context field than `s.ctx`, or uses `SetupTest`/`TearDownTest` to manage lifecycle, align the test setup/teardown with that pattern (e.g. omit explicit `Start`/`Stop` if already done in `SetupTest`).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
📝 WalkthroughSummary by CodeRabbit
WalkthroughVM index handler now exposes a compliance channel and stopped signal, advertises SensorACKSupport, increments ACK metrics, and forwards Central SensorACK ACK/NACK messages to Compliance; the handler is registered with the compliance multiplexer when VMs are enabled. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Central as Central
participant Handler as VM Index Handler
participant Compliance as ComplianceMultiplexer
participant Stopper as Stopper
Central->>Handler: Send MsgToSensor_SensorAck (ACK/NACK)
Handler->>Handler: metrics.IndexReportAcksReceived++
alt compliance channel present and not stopped
Handler->>Compliance: forwardToCompliance(MessageToComplianceWithAddress)
Compliance-->>Handler: accept (buffered / non-blocking)
else channel nil OR handler stopped
Handler--xCompliance: drop / skip send
end
Stopper->>Handler: Stopped() signal
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
sensor/kubernetes/sensor/sensor.go (1)
208-208: Consider making the type assertion compile-time safe.The type assertion
virtualMachineHandler.(common.ComplianceComponent)will panic at runtime if the interface isn't satisfied. While the code is correct today, this creates fragile coupling—if someone later removesStopped()orComplianceC()fromhandlerImpl, the compiler won't catch it.Consider having
vmIndex.Handlerextendcommon.ComplianceComponentor havingNewHandlerreturn a type that includes it:// Option 1: In index/handler.go type Handler interface { common.ComplianceComponent Send(ctx context.Context, vm *v1.IndexReport) error // ... other methods } // Then line 208 becomes simply: complianceMultiplexer.AddComponentWithComplianceC(virtualMachineHandler)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sensor/kubernetes/sensor/sensor.go` at line 208, The direct type assertion virtualMachineHandler.(common.ComplianceComponent) is unsafe and can panic; make the handler type express the contract instead: update the vmIndex.Handler interface (or the return type of NewHandler/handlerImpl) to embed common.ComplianceComponent (so it requires ComplianceC() and Stopped()), then replace the assertion by passing virtualMachineHandler directly to complianceMultiplexer.AddComponentWithComplianceC; ensure references include vmIndex.Handler, NewHandler, handlerImpl, ComplianceC(), and Stopped().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@sensor/kubernetes/sensor/sensor.go`:
- Line 208: The direct type assertion
virtualMachineHandler.(common.ComplianceComponent) is unsafe and can panic; make
the handler type express the contract instead: update the vmIndex.Handler
interface (or the return type of NewHandler/handlerImpl) to embed
common.ComplianceComponent (so it requires ComplianceC() and Stopped()), then
replace the assertion by passing virtualMachineHandler directly to
complianceMultiplexer.AddComponentWithComplianceC; ensure references include
vmIndex.Handler, NewHandler, handlerImpl, ComplianceC(), and Stopped().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 61d79a44-9b67-4bd1-ae75-f45578cc0b9f
📒 Files selected for processing (3)
sensor/common/virtualmachine/index/handler_impl.gosensor/common/virtualmachine/index/handler_impl_test.gosensor/kubernetes/sensor/sensor.go
|
@sourcery-ai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@sensor/common/virtualmachine/index/handler_impl.go`:
- Around line 219-260: forwardToCompliance synchronously blocks on sending to
h.toCompliance which can hang ProcessMessage() when compliance backpressure
occurs; make the send non-blocking or bounded by context/stopper to avoid
stalling Central. Modify handlerImpl.forwardToCompliance to attempt a
non-blocking send (use a select with the existing stop case, the send case, and
a default case that drops and logs a warning) or perform the send in a
short-lived goroutine with a timeout tied to stopper.Flow().StopRequested();
update references to h.toCompliance and forwardToCompliance accordingly so VM
ACK/NACKs are dropped with a log instead of blocking central processing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 4e9963b9-918c-4633-845c-4b348a016e54
📒 Files selected for processing (4)
sensor/common/virtualmachine/index/handler.gosensor/common/virtualmachine/index/handler_impl.gosensor/common/virtualmachine/index/handler_impl_test.gosensor/kubernetes/sensor/sensor.go
|
@coderabbitai review again |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sensor/common/virtualmachine/index/handler_impl.go (1)
132-145:⚠️ Potential issue | 🟡 MinorDon’t record unknown
SensorACKactions as ACK metrics.
forwardToCompliance()correctly drops unsupported actions, butProcessMessage()still incrementsIndexReportAcksReceivedfor them viaaction.String(). That can create bogus metric series and silently skew ACK/NACK monitoring. Only increment the metric after the action is recognized asACKorNACK.Possible fix
-func (h *handlerImpl) ProcessMessage(ctx context.Context, msg *central.MsgToSensor) error { +func (h *handlerImpl) ProcessMessage(ctx context.Context, msg *central.MsgToSensor) error { sensorAck := msg.GetSensorAck() if sensorAck == nil || sensorAck.GetMessageType() != central.SensorACK_VM_INDEX_REPORT { return nil } vmID := sensorAck.GetResourceId() action := sensorAck.GetAction() reason := sensorAck.GetReason() - h.forwardToCompliance(ctx, vmID, action, reason) - - metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc() + if h.forwardToCompliance(ctx, vmID, action, reason) { + metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc() + } return nil } func (h *handlerImpl) forwardToCompliance( ctx context.Context, resourceID string, action central.SensorACK_Action, reason string, -) { +) bool { if h.toCompliance == nil { log.Debug("Compliance channel not initialized; skipping forwarding VM ACK/NACK") - return + return false } var complianceAction sensor.MsgToCompliance_ComplianceACK_Action switch action { case central.SensorACK_ACK: complianceAction = sensor.MsgToCompliance_ComplianceACK_ACK case central.SensorACK_NACK: complianceAction = sensor.MsgToCompliance_ComplianceACK_NACK default: log.Warnf("Unknown SensorACK action for VM index report: %v", action) - return + return false } + + // existing send logic... + return true }Also applies to: 230-239
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@sensor/common/virtualmachine/index/handler_impl.go` around lines 132 - 145, ProcessMessage currently increments metrics.IndexReportAcksReceived for any SensorACK action even if forwardToCompliance drops unsupported actions; change it so the metric is only incremented when the action is one of the supported ACK types (e.g., central.SensorACK_ACK or central.SensorACK_NACK) after validating action, and do not call .WithLabelValues(action.String()).Inc() for unknown/unsupported actions—use the same action check/logic used by forwardToCompliance to decide when to increment the metric.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@sensor/common/virtualmachine/index/handler_impl.go`:
- Around line 132-145: ProcessMessage currently increments
metrics.IndexReportAcksReceived for any SensorACK action even if
forwardToCompliance drops unsupported actions; change it so the metric is only
incremented when the action is one of the supported ACK types (e.g.,
central.SensorACK_ACK or central.SensorACK_NACK) after validating action, and do
not call .WithLabelValues(action.String()).Inc() for unknown/unsupported
actions—use the same action check/logic used by forwardToCompliance to decide
when to increment the metric.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 725e503e-efe6-4d33-9520-51f1f1bacb16
📒 Files selected for processing (5)
sensor/common/virtualmachine/index/handler.gosensor/common/virtualmachine/index/handler_impl.gosensor/common/virtualmachine/index/handler_impl_test.gosensor/common/virtualmachine/index/mocks/handler.gosensor/kubernetes/sensor/sensor.go
@coderabbitai this is a design choice, not a bug. Counting all received ACKs (including unknown) is arguably the right thing - you want to know if Central is sending unexpected action types. |
|
Tip For best results, initiate chat on the files or code changes.
|
There was a problem hiding this comment.
Hey - I've found 1 issue
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="143" />
<code_context>
- // TODO(ROX-xxxxx): Implement retry logic or notifying VM relay.
- // Currently, the VM relay has its own retry mechanism, but it's not aware of Central's rate limiting.
- }
+ metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
return nil
</code_context>
<issue_to_address>
**suggestion:** Metrics are incremented even for unknown ACK actions, which may pollute metric labels
In `forwardToCompliance`, the `default` branch treats any non-ACK/NACK as unknown and returns early, but the metric is still incremented with `action.String()`. This will introduce label values for unsupported actions that Compliance never processes. Consider only incrementing for known actions or tracking unknown actions in a separate metric/label to avoid polluting the primary series.
Suggested implementation:
```golang
```
`.
Here are the changes:
<file_operations>
<file_operation operation="edit" file_path="sensor/common/virtualmachine/index/handler_impl.go">
<<<<<<< SEARCH
metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
=======
>>>>>>> REPLACE
</file_operation>
</file_operations>
<additional_changes>
Inside `forwardToCompliance` (in `handler_impl.go`), adjust the switch that handles the ACK action (the one using `action` and with a `default` branch that returns early on unknown actions):
1. In the `case` branch that handles the ACK action, add:
```go
metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
```
close to where the ACK is successfully processed (before returning / sending to Compliance).
2. In the `case` branch that handles the NACK action, also add:
```go
metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
```
again, only after the NACK has been validated as a supported action and is being processed.
3. Do **not** add any metric increment in the `default` branch; keep it returning early for unknown actions. This ensures that `IndexReportAcksReceived` is only incremented for supported actions and avoids label pollution from unknown values.
4. If you want visibility into unsupported actions, consider adding a separate counter, e.g.:
```go
metrics.IndexReportAcksUnknownActions.Inc()
```
inside the `default` branch, using a metric with no `action` label (or with a bounded "reason" label) so it doesn't create unbounded series.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Description
Closes the ACK loop for VM index reports by forwarding Central's
SensorACKresponses back tocompliance through Sensor.
Before: Sensor received ACK/NACK from Central for VM index reports but only logged them. The
compliance relay had no visibility into whether Central accepted or rejected a report, so it could
not adjust retry behavior accordingly.
After: Sensor translates
central.SensorACKintosensor.MsgToCompliance_ComplianceACKwithmessage type
VM_INDEX_REPORTand sends it to compliance via a dedicatedtoCompliancechannel.Compliance's
handleVMIndexACK(from the parent PR) forwards it to the relay's UMH, which updatesper-VM retry state.
Changes:
handler_impl.go: AddedforwardToCompliance()to translate and send ACK/NACK. AddedComplianceC()andStopped()to satisfy theComplianceComponentinterface. DeclaresSensorACKSupportcapability.sensor.go: Registers the VM handler as aComplianceComponenton the compliance multiplexer.handler_impl_test.go: UpdatedTestCapabilitiesto expectSensorACKSupport.Third and final PR in the ROX-32316 stack.
Depends on:
piotr/ROX-32316-vm-relay-ack-flow(VM relay UMH integration).AI-assisted: code was extracted from the feature branch by AI, with test fix
for updated Capabilities return value. Reviewed and verified by the author.
User-facing documentation
Testing and quality
Automated testing
How I validated my change