Skip to content

ROX-33555: Forward Central VM ACK/NACK to compliance#19322

Open
vikin91 wants to merge 5 commits intomasterfrom
piotr/ROX-32316-sensor-vm-ack-forwarding
Open

ROX-33555: Forward Central VM ACK/NACK to compliance#19322
vikin91 wants to merge 5 commits intomasterfrom
piotr/ROX-32316-sensor-vm-ack-forwarding

Conversation

@vikin91
Copy link
Copy Markdown
Contributor

@vikin91 vikin91 commented Mar 6, 2026

Description

Closes the ACK loop for VM index reports by forwarding Central's SensorACK responses back to
compliance through Sensor.

Before: Sensor received ACK/NACK from Central for VM index reports but only logged them. The
compliance relay had no visibility into whether Central accepted or rejected a report, so it could
not adjust retry behavior accordingly.

After: Sensor translates central.SensorACK into sensor.MsgToCompliance_ComplianceACK with
message type VM_INDEX_REPORT and sends it to compliance via a dedicated toCompliance channel.
Compliance's handleVMIndexACK (from the parent PR) forwards it to the relay's UMH, which updates
per-VM retry state.

Changes:

  • handler_impl.go: Added forwardToCompliance() to translate and send ACK/NACK. Added
    ComplianceC() and Stopped() to satisfy the ComplianceComponent interface. Declares
    SensorACKSupport capability.
  • sensor.go: Registers the VM handler as a ComplianceComponent on the compliance multiplexer.
  • handler_impl_test.go: Updated TestCapabilities to expect SensorACKSupport.

Third and final PR in the ROX-32316 stack.

Depends on: piotr/ROX-32316-vm-relay-ack-flow (VM relay UMH integration).

AI-assisted: code was extracted from the feature branch by AI, with test fix
for updated Capabilities return value. Reviewed and verified by the author.

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • CI
  • Manually on a cluster. TODO

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Mar 6, 2026

This change is part of the following stack:

Change managed by git-spice.

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Mar 6, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • Consider whether toCompliance should be buffered, since forwardToCompliance is called inline from ProcessMessage and an unbuffered channel here could back up Central’s ACK processing if the compliance multiplexer is slow or temporarily blocked.
  • In forwardToCompliance, you set Hostname and Broadcast based solely on resourceID; double-check that using the VM resource ID as Hostname and broadcasting only when resourceID == "" matches the expectations of the compliance side consumers (if not, it may be worth adding a small helper or comment clarifying the mapping).
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider whether `toCompliance` should be buffered, since `forwardToCompliance` is called inline from `ProcessMessage` and an unbuffered channel here could back up Central’s ACK processing if the compliance multiplexer is slow or temporarily blocked.
- In `forwardToCompliance`, you set `Hostname` and `Broadcast` based solely on `resourceID`; double-check that using the VM resource ID as `Hostname` and broadcasting only when `resourceID == ""` matches the expectations of the compliance side consumers (if not, it may be worth adding a small helper or comment clarifying the mapping).

## Individual Comments

### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="167" />
<code_context>
 		return errStartMoreThanOnce
 	}
 	h.indexReports = make(chan *v1.IndexReport, env.VirtualMachinesIndexReportsBufferSize.IntegerSetting())
+	h.toCompliance = make(chan common.MessageToComplianceWithAddress)
 	h.toCentral = h.run(h.indexReports)
 	return nil
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider buffering the compliance channel to avoid blocking ACK processing when Compliance is slow or temporarily unavailable.

Because `toCompliance` is unbuffered, `forwardToCompliance` will block until the consumer reads, and `ProcessMessage` is on the hot ACK/NACK path. If Compliance is slow or backpressured, this can stall ACK handling. A small buffer (as used for other internal channels) would mitigate head-of-line blocking while still preserving ordering.

Suggested implementation:

```golang
	if h.toCentral != nil || h.indexReports != nil || h.toCompliance != nil {
		return errStartMoreThanOnce
	}
	h.indexReports = make(chan *v1.IndexReport, env.VirtualMachinesIndexReportsBufferSize.IntegerSetting())
	h.toCompliance = make(chan common.MessageToComplianceWithAddress, env.VirtualMachinesToComplianceBufferSize.IntegerSetting())

```

1. Define `VirtualMachinesToComplianceBufferSize` in the `env` package (or equivalent config) following the same pattern used for `VirtualMachinesIndexReportsBufferSize` (type, default, description, etc.).
2. If there is an existing common buffer size setting for internal VM handler channels (e.g., a shared constant/env var), you may want to reuse that instead of introducing a new one, for consistency.
3. Ensure any unit tests that construct `handlerImpl` or validate channel behavior are updated to account for the buffered `toCompliance` channel (e.g., tests that might have assumed it was unbuffered).
</issue_to_address>

### Comment 2
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="240-242" />
<code_context>
+		return
+	}
+
+	select {
+	case <-h.stopper.Flow().StopRequested():
+	case h.toCompliance <- common.MessageToComplianceWithAddress{
+		Msg: &sensor.MsgToCompliance{
+			Msg: &sensor.MsgToCompliance_ComplianceAck{
</code_context>
<issue_to_address>
**suggestion:** Consider handling or at least logging when messages are dropped due to sensor shutdown.

Here, when `h.stopper.Flow().StopRequested()` fires, the select exits and the ACK/NACK is silently dropped. If these messages matter for Compliance accounting, this makes shutdown issues hard to trace. Please add at least a debug log or a counter for messages dropped due to shutdown to make this behavior observable.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@rhacs-bot
Copy link
Copy Markdown
Contributor

rhacs-bot commented Mar 6, 2026

Images are ready for the commit at c47fcd6.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.11.x-388-gc47fcd6b9a.

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 94.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.57%. Comparing base (ab1e8a2) to head (3ba3b77).
⚠️ Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
sensor/common/virtualmachine/index/handler_impl.go 94.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19322      +/-   ##
==========================================
- Coverage   49.61%   49.57%   -0.04%     
==========================================
  Files        2765     2764       -1     
  Lines      208541   208390     -151     
==========================================
- Hits       103459   103313     -146     
- Misses      97405    97426      +21     
+ Partials     7677     7651      -26     
Flag Coverage Δ
go-unit-tests 49.57% <94.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vikin91 vikin91 changed the title ROX-32316: Forward Central VM ACK/NACK to compliance ROX-32848: Forward Central VM ACK/NACK to compliance Mar 6, 2026
@vikin91 vikin91 force-pushed the piotr/ROX-32316-vm-relay-ack-flow branch from 0ad5d24 to c795476 Compare March 9, 2026 10:05
@vikin91 vikin91 force-pushed the piotr/ROX-32316-sensor-vm-ack-forwarding branch from 99d3e6c to fa6c362 Compare March 9, 2026 10:05
@vikin91 vikin91 changed the title ROX-32848: Forward Central VM ACK/NACK to compliance ROX-33555: Forward Central VM ACK/NACK to compliance Mar 11, 2026
@vikin91 vikin91 force-pushed the piotr/ROX-32316-vm-relay-ack-flow branch from c795476 to cbee5a2 Compare March 11, 2026 09:48
@vikin91 vikin91 force-pushed the piotr/ROX-32316-sensor-vm-ack-forwarding branch from fa6c362 to 54bb9f9 Compare March 11, 2026 09:48
@vikin91 vikin91 force-pushed the piotr/ROX-32316-vm-relay-ack-flow branch from cbee5a2 to 283b771 Compare March 19, 2026 14:53
@vikin91 vikin91 force-pushed the piotr/ROX-32316-sensor-vm-ack-forwarding branch from 54bb9f9 to c47fcd6 Compare March 19, 2026 14:53
@vikin91 vikin91 force-pushed the piotr/ROX-32316-vm-relay-ack-flow branch from 283b771 to a9270b7 Compare April 9, 2026 08:55
@vikin91 vikin91 force-pushed the piotr/ROX-32316-sensor-vm-ack-forwarding branch from c47fcd6 to 08f7beb Compare April 9, 2026 08:55
Sensor's VM index handler now translates Central's SensorACK into
ComplianceACK and forwards it to compliance through a dedicated channel,
closing the ACK loop so the relay's UMH can track confirmation state.
Declares SensorACKSupport capability and registers the handler as a
ComplianceComponent on the multiplexer.

AI-assisted: code was extracted from another feature branch by AI,
with test fix for updated Capabilities. Reviewed and verified by the author.
@vikin91 vikin91 force-pushed the piotr/ROX-32316-sensor-vm-ack-forwarding branch from 08f7beb to 80bc69e Compare April 9, 2026 09:01
@vikin91 vikin91 changed the base branch from piotr/ROX-32316-vm-relay-ack-flow to master April 9, 2026 09:01
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

🚀 Build Images Ready

Images are ready for commit b0e1c76. To use with deploy scripts:

export MAIN_IMAGE_TAG=4.11.x-611-gb0e1c76f28

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Apr 9, 2026

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

1 similar comment
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Apr 9, 2026

@sourcery-ai review

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The new toCompliance channel is unbuffered and used in the ACK hot path; consider adding a small buffer or a clearly non-blocking send strategy so slow compliance handling cannot block processing of Central messages.
  • In forwardToCompliance, the VM resourceID is used both as Hostname and to decide Broadcast; it would be good to double-check that these semantics match how the compliance multiplexer routes messages (e.g., whether VM IDs should be treated as hostnames or if a more explicit addressing field is needed).
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The new `toCompliance` channel is unbuffered and used in the ACK hot path; consider adding a small buffer or a clearly non-blocking send strategy so slow compliance handling cannot block processing of Central messages.
- In `forwardToCompliance`, the VM `resourceID` is used both as `Hostname` and to decide `Broadcast`; it would be good to double-check that these semantics match how the compliance multiplexer routes messages (e.g., whether VM IDs should be treated as hostnames or if a more explicit addressing field is needed).

## Individual Comments

### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl_test.go" line_range="230-231" />
<code_context>
 func (s *virtualMachineHandlerSuite) TestCapabilities() {
 	caps := s.handler.Capabilities()
-	s.Require().Empty(caps)
+	s.Require().Len(caps, 1)
+	s.Contains(caps, centralsensor.SensorACKSupport)
 }

</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for the new compliance-forwarding behavior and lifecycle methods (forwardToCompliance, ComplianceC, Stopped).

The capabilities test now checks that `SensorACKSupport` is declared, but it doesn’t exercise the new behavior it enables. Please add tests that cover:

1. **forwardToCompliance happy paths**
   - After `Start()` (so `toCompliance` is initialized), processing a `central.SensorACK` with `ACK` produces a message on `ComplianceC()` with:
     - `Msg.ComplianceAck.Action == sensor.MsgToCompliance_ComplianceACK_ACK`
     - `MessageType == VM_INDEX_REPORT`
     - `ResourceId` and `Reason` correctly propagated.
   - Same for `NACK`, including forwarding `Reason`.

2. **Forwarding edge cases**
   - `toCompliance == nil` (no `Start()`): `ProcessMessage`/`forwardToCompliance` should not panic or block.
   - Unknown `SensorACK_Action`: should be logged and dropped, with `ComplianceC()` remaining empty.

3. **Stopped()/stopper behavior**
   - When `stopper.Flow().Stop()` is triggered, `forwardToCompliance` should not block on a full `toCompliance` channel. You can set up a non-drained channel, call `Stop()`, then invoke `ProcessMessage`.

4. **ComplianceC wiring**
   - Verify that `Start()` initializes `ComplianceC()` (non-nil, readable) and that it’s the same channel used internally.

These will demonstrate the ACK/NACK path to compliance and ensure no-Start, unknown actions, and stop conditions are handled safely.

Suggested implementation:

```golang
func (s *virtualMachineHandlerSuite) TestCapabilities() {
	caps := s.handler.Capabilities()
	s.Require().Len(caps, 1)
	s.Contains(caps, centralsensor.SensorACKSupport)
}

// forwardToCompliance happy-path: ACK
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_ACK() {
	// Ensure Start wires up the compliance channel
	err := s.handler.Start(s.ctx, nil)
	s.Require().NoError(err)
	defer s.handler.Stop(s.ctx)

	complianceC := s.handler.ComplianceC()
	s.Require().NotNil(complianceC)

	const (
		resourceID = "vm-123"
		reason     = "all good"
	)

	ack := &central.SensorACK{
		Action:     central.SensorACK_ACK,
		ResourceId: resourceID,
		Reason:     reason,
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: ack,
		},
	}

	err = s.handler.ProcessMessage(msg)
	s.Require().NoError(err)

	select {
	case toComplianceMsg := <-complianceC:
		s.Require().NotNil(toComplianceMsg)
		s.Require().NotNil(toComplianceMsg.Msg)
		complianceAck := toComplianceMsg.Msg.GetComplianceAck()
		s.Require().NotNil(complianceAck)
		s.Equal(sensor.MsgToCompliance_ComplianceACK_ACK, complianceAck.Action)
		s.Equal(sensor.MessageType_VM_INDEX_REPORT, complianceAck.MessageType)
		s.Equal(resourceID, complianceAck.ResourceId)
		s.Equal(reason, complianceAck.Reason)
	case <-time.After(2 * time.Second):
		s.Fail("timed out waiting for compliance ACK message")
	}
}

// forwardToCompliance happy-path: NACK
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_NACK() {
	err := s.handler.Start(s.ctx, nil)
	s.Require().NoError(err)
	defer s.handler.Stop(s.ctx)

	complianceC := s.handler.ComplianceC()
	s.Require().NotNil(complianceC)

	const (
		resourceID = "vm-456"
		reason     = "validation failed"
	)

	nack := &central.SensorACK{
		Action:     central.SensorACK_NACK,
		ResourceId: resourceID,
		Reason:     reason,
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: nack,
		},
	}

	err = s.handler.ProcessMessage(msg)
	s.Require().NoError(err)

	select {
	case toComplianceMsg := <-complianceC:
		s.Require().NotNil(toComplianceMsg)
		s.Require().NotNil(toComplianceMsg.Msg)
		complianceAck := toComplianceMsg.Msg.GetComplianceAck()
		s.Require().NotNil(complianceAck)
		s.Equal(sensor.MsgToCompliance_ComplianceACK_NACK, complianceAck.Action)
		s.Equal(sensor.MessageType_VM_INDEX_REPORT, complianceAck.MessageType)
		s.Equal(resourceID, complianceAck.ResourceId)
		s.Equal(reason, complianceAck.Reason)
	case <-time.After(2 * time.Second):
		s.Fail("timed out waiting for compliance NACK message")
	}
}

// Edge case: no Start() called, toCompliance is nil – should not panic or block
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_NoStartDoesNotPanicOrBlock() {
	// Do NOT call Start()

	ack := &central.SensorACK{
		Action:     central.SensorACK_ACK,
		ResourceId: "vm-no-start",
		Reason:     "ignored",
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: ack,
		},
	}

	done := make(chan struct{})
	go func() {
		defer close(done)
		_ = s.handler.ProcessMessage(msg)
	}()

	select {
	case <-done:
		// ok, returned without panic/block
	case <-time.After(2 * time.Second):
		s.Fail("ProcessMessage should not block when Start() has not been called")
	}

	// ComplianceC() may be nil or empty; in either case we must not block.
	if ch := s.handler.ComplianceC(); ch != nil {
		select {
		case <-ch:
			s.Fail("no message should be forwarded to compliance when Start() was not called")
		default:
		}
	}
}

// Edge case: unknown SensorACK_Action should be logged and dropped
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_UnknownActionDropped() {
	err := s.handler.Start(s.ctx, nil)
	s.Require().NoError(err)
	defer s.handler.Stop(s.ctx)

	complianceC := s.handler.ComplianceC()
	s.Require().NotNil(complianceC)

	unknownAck := &central.SensorACK{
		Action:     central.SensorACK_Action(999),
		ResourceId: "vm-unknown",
		Reason:     "unknown",
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: unknownAck,
		},
	}

	err = s.handler.ProcessMessage(msg)
	s.Require().NoError(err)

	// Channel should remain empty (message dropped)
	select {
	case <-complianceC:
		s.Fail("unexpected message on ComplianceC for unknown SensorACK action")
	default:
	}
}

// Stopped()/stopper behavior: forwardToCompliance must not block on full channel after Stop
func (s *virtualMachineHandlerSuite) TestForwardToCompliance_DoesNotBlockWhenStopped() {
	err := s.handler.Start(s.ctx, nil)
	s.Require().NoError(err)
	defer s.handler.Stop(s.ctx)

	complianceC := s.handler.ComplianceC()
	s.Require().NotNil(complianceC)

	// Fill the channel to capacity. This assumes a small buffer; if capacity is unknown,
	// we conservatively send until a non-blocking send fails.
	for i := 0; i < cap(complianceC); i++ {
		select {
		case complianceC <- &sensor.MsgToCompliance{}:
		default:
			// Channel is already full.
			break
		}
	}

	// Trigger stopper/Stopped behavior
	s.handler.Stop(s.ctx)

	nack := &central.SensorACK{
		Action:     central.SensorACK_NACK,
		ResourceId: "vm-stopped",
		Reason:     "after stop",
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: nack,
		},
	}

	done := make(chan struct{})
	go func() {
		defer close(done)
		_ = s.handler.ProcessMessage(msg)
	}()

	select {
	case <-done:
		// ok, no blocking on full compliance channel after stop
	case <-time.After(2 * time.Second):
		s.Fail("ProcessMessage should not block when handler has been stopped, even if ComplianceC is full")
	}
}

// ComplianceC wiring: Start initializes non-nil channel used internally
func (s *virtualMachineHandlerSuite) TestStartInitializesComplianceC() {
	// Before Start, ComplianceC may be nil or undefined; we only assert post-Start behavior.
	err := s.handler.Start(s.ctx, nil)
	s.Require().NoError(err)
	defer s.handler.Stop(s.ctx)

	complianceC := s.handler.ComplianceC()
	s.Require().NotNil(complianceC)

	ack := &central.SensorACK{
		Action:     central.SensorACK_ACK,
		ResourceId: "vm-wiring",
		Reason:     "wiring-test",
	}

	msg := &central.MsgFromSensor{
		Msg: &central.MsgFromSensor_SensorAck{
			SensorAck: ack,
		},
	}

	err = s.handler.ProcessMessage(msg)
	s.Require().NoError(err)

	select {
	case <-complianceC:
		// message observed on same channel returned by ComplianceC()
	case <-time.After(2 * time.Second):
		s.Fail("expected a message on ComplianceC after processing SensorACK")
	}
}

func (s *virtualMachineHandlerSuite) TestAccepts() {

```

To make these tests compile and integrate cleanly, you will also need to:

1. Ensure the following imports exist in `handler_impl_test.go` (adjust paths to match your existing imports):
   - `time`
   - `github.com/stackrox/rox/generated/internalapi/central`
   - `github.com/stackrox/rox/generated/internalapi/sensor`
   - If `centralsensor.SensorACKSupport` is defined in a different package, keep the existing import for `centralsensor`.
2. Confirm the handler API:
   - `Start` / `Stop` signatures `(ctx context.Context, _ any)` or similar may differ; update the test calls accordingly.
   - `ComplianceC()` should return `chan *sensor.MsgToCompliance`; if it returns a receive-only channel, remove the direct sends in `TestForwardToCompliance_DoesNotBlockWhenStopped` and instead pre-fill the internal channel via the handler (or expose capacity via a helper).
   - If `Stop` is not a method, but the suite exposes a stopper (`s.stopper.Flow().Stop()`), invoke that instead of `s.handler.Stop(s.ctx)`.
3. Adjust enum names if they differ slightly from these guesses:
   - `central.SensorACK_ACK`, `central.SensorACK_NACK`, `central.SensorACK_Action`
   - `sensor.MsgToCompliance_ComplianceACK_ACK`, `sensor.MsgToCompliance_ComplianceACK_NACK`
   - `sensor.MessageType_VM_INDEX_REPORT`
4. If the suite uses a different context field than `s.ctx`, or uses `SetupTest`/`TearDownTest` to manage lifecycle, align the test setup/teardown with that pattern (e.g. omit explicit `Start`/`Stop` if already done in `SetupTest`).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 9, 2026

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Forward VM index report ACK/NACKs to compliance; expose a compliance channel and a stopped signal; report ACK capability.
  • Bug Fixes

    • Avoid blocking or panics when handler not started or stopped; drop unknown ACK actions safely.
    • Ensure broadcast ACKs are handled correctly and non-blocking drops occur when compliance queue is full or context is cancelled.
  • Tests

    • Added tests for capability reporting, compliance forwarding, broadcast behavior, and non-blocking/drop semantics.

Walkthrough

VM index handler now exposes a compliance channel and stopped signal, advertises SensorACKSupport, increments ACK metrics, and forwards Central SensorACK ACK/NACK messages to Compliance; the handler is registered with the compliance multiplexer when VMs are enabled.

Changes

Cohort / File(s) Summary
VM Index Handler Implementation
sensor/common/virtualmachine/index/handler_impl.go
Added toCompliance channel and ComplianceC(); added Stopped(); Capabilities() returns SensorACKSupport; ProcessMessage(ctx, ...) counts ACKs and forwards ACK/NACK to compliance via new forwardToCompliance() with hostname/broadcast/resourceId/reason and non-blocking/drop semantics.
VM Index Handler Tests
sensor/common/virtualmachine/index/handler_impl_test.go
Updated capability assertion; added tests validating ComplianceC() after Start(), ACK→Compliance mapping (ACK/NACK, message type, ResourceId→Hostname, Broadcast), drops for unknown actions, non-blocking behavior when not started/stopped, queue-full drops, and context-cancel while queue full.
Handler Interface / Mocks
sensor/common/virtualmachine/index/handler.go, sensor/common/virtualmachine/index/mocks/handler.go
Handler interface now embeds common.ComplianceComponent (replacing common.SensorComponent); mocks updated with ComplianceC() and Stopped() methods and expectation recorders.
Sensor Integration Wiring
sensor/kubernetes/sensor/sensor.go
When VirtualMachines feature enabled, VM index handler is also registered with the compliance multiplexer via AddComponentWithComplianceC(...), wiring its compliance channel into the multiplexer.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Central as Central
    participant Handler as VM Index Handler
    participant Compliance as ComplianceMultiplexer
    participant Stopper as Stopper

    Central->>Handler: Send MsgToSensor_SensorAck (ACK/NACK)
    Handler->>Handler: metrics.IndexReportAcksReceived++
    alt compliance channel present and not stopped
        Handler->>Compliance: forwardToCompliance(MessageToComplianceWithAddress)
        Compliance-->>Handler: accept (buffered / non-blocking)
    else channel nil OR handler stopped
        Handler--xCompliance: drop / skip send
    end
    Stopper->>Handler: Stopped() signal
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main change: forwarding Central VM ACK/NACK messages to the compliance system.
Description check ✅ Passed The description provides a comprehensive explanation of the problem being solved, the solution implemented, specific code changes, and relevant context including dependencies and testing status.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch piotr/ROX-32316-sensor-vm-ack-forwarding

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
sensor/kubernetes/sensor/sensor.go (1)

208-208: Consider making the type assertion compile-time safe.

The type assertion virtualMachineHandler.(common.ComplianceComponent) will panic at runtime if the interface isn't satisfied. While the code is correct today, this creates fragile coupling—if someone later removes Stopped() or ComplianceC() from handlerImpl, the compiler won't catch it.

Consider having vmIndex.Handler extend common.ComplianceComponent or having NewHandler return a type that includes it:

// Option 1: In index/handler.go
type Handler interface {
    common.ComplianceComponent
    Send(ctx context.Context, vm *v1.IndexReport) error
    // ... other methods
}

// Then line 208 becomes simply:
complianceMultiplexer.AddComponentWithComplianceC(virtualMachineHandler)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sensor/kubernetes/sensor/sensor.go` at line 208, The direct type assertion
virtualMachineHandler.(common.ComplianceComponent) is unsafe and can panic; make
the handler type express the contract instead: update the vmIndex.Handler
interface (or the return type of NewHandler/handlerImpl) to embed
common.ComplianceComponent (so it requires ComplianceC() and Stopped()), then
replace the assertion by passing virtualMachineHandler directly to
complianceMultiplexer.AddComponentWithComplianceC; ensure references include
vmIndex.Handler, NewHandler, handlerImpl, ComplianceC(), and Stopped().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@sensor/kubernetes/sensor/sensor.go`:
- Line 208: The direct type assertion
virtualMachineHandler.(common.ComplianceComponent) is unsafe and can panic; make
the handler type express the contract instead: update the vmIndex.Handler
interface (or the return type of NewHandler/handlerImpl) to embed
common.ComplianceComponent (so it requires ComplianceC() and Stopped()), then
replace the assertion by passing virtualMachineHandler directly to
complianceMultiplexer.AddComponentWithComplianceC; ensure references include
vmIndex.Handler, NewHandler, handlerImpl, ComplianceC(), and Stopped().

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 61d79a44-9b67-4bd1-ae75-f45578cc0b9f

📥 Commits

Reviewing files that changed from the base of the PR and between ab1e8a2 and 80bc69e.

📒 Files selected for processing (3)
  • sensor/common/virtualmachine/index/handler_impl.go
  • sensor/common/virtualmachine/index/handler_impl_test.go
  • sensor/kubernetes/sensor/sensor.go

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Apr 10, 2026

@sourcery-ai review
@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 10, 2026

@vikin91 I'll kick off a review of the changes right away!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sensor/common/virtualmachine/index/handler_impl.go`:
- Around line 219-260: forwardToCompliance synchronously blocks on sending to
h.toCompliance which can hang ProcessMessage() when compliance backpressure
occurs; make the send non-blocking or bounded by context/stopper to avoid
stalling Central. Modify handlerImpl.forwardToCompliance to attempt a
non-blocking send (use a select with the existing stop case, the send case, and
a default case that drops and logs a warning) or perform the send in a
short-lived goroutine with a timeout tied to stopper.Flow().StopRequested();
update references to h.toCompliance and forwardToCompliance accordingly so VM
ACK/NACKs are dropped with a log instead of blocking central processing.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 4e9963b9-918c-4633-845c-4b348a016e54

📥 Commits

Reviewing files that changed from the base of the PR and between 80bc69e and 2371746.

📒 Files selected for processing (4)
  • sensor/common/virtualmachine/index/handler.go
  • sensor/common/virtualmachine/index/handler_impl.go
  • sensor/common/virtualmachine/index/handler_impl_test.go
  • sensor/kubernetes/sensor/sensor.go

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Apr 10, 2026

@coderabbitai review again

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 10, 2026

@vikin91 Sure! I'll kick off a full review of all the changes right away.

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
sensor/common/virtualmachine/index/handler_impl.go (1)

132-145: ⚠️ Potential issue | 🟡 Minor

Don’t record unknown SensorACK actions as ACK metrics.

forwardToCompliance() correctly drops unsupported actions, but ProcessMessage() still increments IndexReportAcksReceived for them via action.String(). That can create bogus metric series and silently skew ACK/NACK monitoring. Only increment the metric after the action is recognized as ACK or NACK.

Possible fix
-func (h *handlerImpl) ProcessMessage(ctx context.Context, msg *central.MsgToSensor) error {
+func (h *handlerImpl) ProcessMessage(ctx context.Context, msg *central.MsgToSensor) error {
 	sensorAck := msg.GetSensorAck()
 	if sensorAck == nil || sensorAck.GetMessageType() != central.SensorACK_VM_INDEX_REPORT {
 		return nil
 	}

 	vmID := sensorAck.GetResourceId()
 	action := sensorAck.GetAction()
 	reason := sensorAck.GetReason()
-	h.forwardToCompliance(ctx, vmID, action, reason)
-
-	metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
+	if h.forwardToCompliance(ctx, vmID, action, reason) {
+		metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
+	}

 	return nil
 }
 
 func (h *handlerImpl) forwardToCompliance(
 	ctx context.Context,
 	resourceID string,
 	action central.SensorACK_Action,
 	reason string,
-) {
+) bool {
 	if h.toCompliance == nil {
 		log.Debug("Compliance channel not initialized; skipping forwarding VM ACK/NACK")
-		return
+		return false
 	}

 	var complianceAction sensor.MsgToCompliance_ComplianceACK_Action
 	switch action {
 	case central.SensorACK_ACK:
 		complianceAction = sensor.MsgToCompliance_ComplianceACK_ACK
 	case central.SensorACK_NACK:
 		complianceAction = sensor.MsgToCompliance_ComplianceACK_NACK
 	default:
 		log.Warnf("Unknown SensorACK action for VM index report: %v", action)
-		return
+		return false
 	}
+
+	// existing send logic...
+	return true
 }

Also applies to: 230-239

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sensor/common/virtualmachine/index/handler_impl.go` around lines 132 - 145,
ProcessMessage currently increments metrics.IndexReportAcksReceived for any
SensorACK action even if forwardToCompliance drops unsupported actions; change
it so the metric is only incremented when the action is one of the supported ACK
types (e.g., central.SensorACK_ACK or central.SensorACK_NACK) after validating
action, and do not call .WithLabelValues(action.String()).Inc() for
unknown/unsupported actions—use the same action check/logic used by
forwardToCompliance to decide when to increment the metric.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@sensor/common/virtualmachine/index/handler_impl.go`:
- Around line 132-145: ProcessMessage currently increments
metrics.IndexReportAcksReceived for any SensorACK action even if
forwardToCompliance drops unsupported actions; change it so the metric is only
incremented when the action is one of the supported ACK types (e.g.,
central.SensorACK_ACK or central.SensorACK_NACK) after validating action, and do
not call .WithLabelValues(action.String()).Inc() for unknown/unsupported
actions—use the same action check/logic used by forwardToCompliance to decide
when to increment the metric.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 725e503e-efe6-4d33-9520-51f1f1bacb16

📥 Commits

Reviewing files that changed from the base of the PR and between 5e40709 and 3ba3b77.

📒 Files selected for processing (5)
  • sensor/common/virtualmachine/index/handler.go
  • sensor/common/virtualmachine/index/handler_impl.go
  • sensor/common/virtualmachine/index/handler_impl_test.go
  • sensor/common/virtualmachine/index/mocks/handler.go
  • sensor/kubernetes/sensor/sensor.go

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Apr 10, 2026

Don’t record unknown SensorACK actions as ACK metrics.

@coderabbitai this is a design choice, not a bug. Counting all received ACKs (including unknown) is arguably the right thing - you want to know if Central is sending unexpected action types.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 10, 2026

Tip

For best results, initiate chat on the files or code changes.

@vikin91 That's a fair point — counting all received SensorACK messages regardless of action type provides a useful signal for detecting unexpected behavior from Central (e.g., a new action type being introduced or a misconfiguration). I'll withdraw that suggestion. Thanks for the clarification!


📝 Learnings were identified and not saved (knowledge base disabled). Enable

@vikin91 vikin91 marked this pull request as ready for review April 10, 2026 15:03
@vikin91 vikin91 requested a review from a team as a code owner April 10, 2026 15:03
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="sensor/common/virtualmachine/index/handler_impl.go" line_range="143" />
<code_context>
-		// TODO(ROX-xxxxx): Implement retry logic or notifying VM relay.
-		// Currently, the VM relay has its own retry mechanism, but it's not aware of Central's rate limiting.
-	}
+	metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()

 	return nil
</code_context>
<issue_to_address>
**suggestion:** Metrics are incremented even for unknown ACK actions, which may pollute metric labels

In `forwardToCompliance`, the `default` branch treats any non-ACK/NACK as unknown and returns early, but the metric is still incremented with `action.String()`. This will introduce label values for unsupported actions that Compliance never processes. Consider only incrementing for known actions or tracking unknown actions in a separate metric/label to avoid polluting the primary series.

Suggested implementation:

```golang

```

`.

Here are the changes:

<file_operations>
<file_operation operation="edit" file_path="sensor/common/virtualmachine/index/handler_impl.go">
<<<<<<< SEARCH
	metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
=======
>>>>>>> REPLACE
</file_operation>
</file_operations>

<additional_changes>
Inside `forwardToCompliance` (in `handler_impl.go`), adjust the switch that handles the ACK action (the one using `action` and with a `default` branch that returns early on unknown actions):

1. In the `case` branch that handles the ACK action, add:
   ```go
   metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
   ```
   close to where the ACK is successfully processed (before returning / sending to Compliance).

2. In the `case` branch that handles the NACK action, also add:
   ```go
   metrics.IndexReportAcksReceived.WithLabelValues(action.String()).Inc()
   ```
   again, only after the NACK has been validated as a supported action and is being processed.

3. Do **not** add any metric increment in the `default` branch; keep it returning early for unknown actions. This ensures that `IndexReportAcksReceived` is only incremented for supported actions and avoids label pollution from unknown values.

4. If you want visibility into unsupported actions, consider adding a separate counter, e.g.:
   ```go
   metrics.IndexReportAcksUnknownActions.Inc()
   ```
   inside the `default` branch, using a metric with no `action` label (or with a bounded "reason" label) so it doesn't create unbounded series.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants