Skip to content

ROX-31012: Optimize memory for UpdateComputer connection deduper#17041

Merged
vikin91 merged 3 commits intomasterfrom
piotr/ROX-31012-less-memory-conn-deduper
Oct 22, 2025
Merged

ROX-31012: Optimize memory for UpdateComputer connection deduper#17041
vikin91 merged 3 commits intomasterfrom
piotr/ROX-31012-less-memory-conn-deduper

Conversation

@vikin91
Copy link
Copy Markdown
Contributor

@vikin91 vikin91 commented Sep 26, 2025

Implemented with Cursor using the following implementation plan: NetworkConn_BinaryHash_Migration_Plan.md

Description

Migrated NetworkConn deduplication from string-based keys to BinaryHash (uint64) keys for memory efficiency, achieving ~75% memory reduction (from ~32 bytes to 8 bytes per connection).

What changed:

  • Added NetworkConn.BinaryKey() method using xxhash
  • Updated connectionsDeduper from *set.StringSet to *set.Set[indicator.BinaryHash]
  • Updated closedConnTimestamps from map[string]closedConnEntry to map[indicator.BinaryHash]closedConnEntry
  • Updated related function signatures and memory calculations
  • Removed unused newStringSetPtr() helper and BinaryHashSet type alias

Why:
Connection deduplication was using hex string keys (16 chars + Go overhead), consuming significant memory. This change aligns NetworkConn with the existing BinaryHash pattern already used successfully for ContainerEndpoint and ProcessListening dedupers.

AI contribution:

  • AI generated initial implementation following the migration plan
  • User reviewed design decisions (xxhash vs FNV, uint64 vs [8]byte)
  • User identified and requested removal of unnecessary type alias
  • User verified all tests and code quality

User-facing documentation

Testing and quality

  • the change is production ready: the change is GA, or otherwise the functionality is gated by a feature flag
  • CI results are inspected

Automated testing

  • added unit tests
  • added e2e tests
  • added regression tests
  • added compatibility tests
  • modified existing tests

How I validated my change

  • Ran all unit tests locally for sensor/common/networkflow/... packages - all passing
  • Verified no linter errors in modified files
  • Confirmed successful compilation of all networkflow packages
  • Validated that the implementation follows the same pattern as existing BinaryHash usage for endpoints and processes
  • Tests will run on CI to verify integration with the broader codebase

This is an internal optimization with no external API changes or functional behavior differences.

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Sep 26, 2025

This change is part of the following stack:

Change managed by git-spice.

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Sep 26, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@rhacs-bot
Copy link
Copy Markdown
Contributor

rhacs-bot commented Sep 26, 2025

Images are ready for the commit at 96be698.

To use with deploy scripts, first export MAIN_IMAGE_TAG=4.10.x-67-g96be6988f6.

@codecov
Copy link
Copy Markdown

codecov bot commented Sep 26, 2025

Codecov Report

❌ Patch coverage is 78.94737% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 48.81%. Comparing base (0eafc13) to head (96be698).
⚠️ Report is 27 commits behind head on master.

Files with missing lines Patch % Lines
...mon/networkflow/updatecomputer/transition_based.go 66.66% 4 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master   #17041   +/-   ##
=======================================
  Coverage   48.80%   48.81%           
=======================================
  Files        2724     2724           
  Lines      203154   203120   -34     
=======================================
- Hits        99157    99147   -10     
+ Misses      96211    96197   -14     
+ Partials     7786     7776   -10     
Flag Coverage Δ
go-unit-tests 48.81% <78.94%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-deduper branch from f636ddd to 425a08d Compare October 1, 2025 08:02
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch from 784ee0f to 6fae477 Compare October 1, 2025 08:02
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch from 6fae477 to 79b19fa Compare October 1, 2025 10:00
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-deduper branch from 5e67e12 to 677d6ce Compare October 2, 2025 08:47
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch from 79b19fa to a3d8491 Compare October 2, 2025 08:47
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-deduper branch from 677d6ce to 175c010 Compare October 7, 2025 11:22
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-deduper branch from bf181fd to 41d6369 Compare October 14, 2025 13:09
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch from a3d8491 to 5f8e07c Compare October 14, 2025 13:34
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

  • Consider aliasing set.Set[indicator.BinaryHash] to a descriptive type (e.g. BinaryHashSet) to improve readability and reduce type repetition.
  • The NetworkConn_BinaryHash_Migration_Plan.md is quite large—consider moving it to an internal design document or wiki to keep this repo focused on code.
  • If you’re logging or debugging connection keys, add a helper to format BinaryHash values as hex strings for readability.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Consider aliasing set.Set[indicator.BinaryHash] to a descriptive type (e.g. BinaryHashSet) to improve readability and reduce type repetition.
- The NetworkConn_BinaryHash_Migration_Plan.md is quite large—consider moving it to an internal design document or wiki to keep this repo focused on code.
- If you’re logging or debugging connection keys, add a helper to format BinaryHash values as hex strings for readability.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Oct 14, 2025

  • Consider aliasing set.Set[indicator.BinaryHash] to a descriptive type (e.g. BinaryHashSet) to improve readability and reduce type repetition.

I deliberately removed it. I am not a big fan of introducing type aliases if we don't define any functions on them.

  • Type aliases in Go are best used when you're adding methods to the type or when the underlying type is complex/unclear
  • set.Set[indicator.BinaryHash] is already descriptive and clear
  • We're not defining any methods on it
  • The codebase already uses concrete generic types elsewhere (e.g., map[indicator.BinaryHash]indicator.BinaryHash for endpoints)
  • The NetworkConn_BinaryHash_Migration_Plan.md is quite large—consider moving it to an internal design document or wiki to keep this repo focused on code.

Removed. Now it is attached to PR description.

  • If you’re logging or debugging connection keys, add a helper to format BinaryHash values as hex strings for readability.

Currently:

  • No existing code in the networkflow package logs or formats BinaryHash values
  • No debug logging in transition_based.go that would benefit from hash formatting

I am not adding it now because of:

  • YAGNI principle
  • No existing debug paths: There's no logging infrastructure that would use it
  • Easy to add later: If one needs it for debugging, it's trivial: fmt.Sprintf("%016x", uint64(value))

Base automatically changed from piotr/ROX-31012-less-memory-deduper to master October 15, 2025 20:09
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch 4 times, most recently from 2f90967 to a8e0ccb Compare October 16, 2025 08:03
@vikin91 vikin91 marked this pull request as ready for review October 16, 2025 08:18
@vikin91 vikin91 requested a review from a team as a code owner October 16, 2025 08:18
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@vikin91 vikin91 requested a review from janisz October 17, 2025 11:38
…hash, uint64)

Migrated connection deduplication from string keys (~32B) to uint64
BinaryHash (8B), matching endpoint/process patterns. Updated dedupers,
tests, and memory calculations. Partially AI-assisted.
@vikin91 vikin91 force-pushed the piotr/ROX-31012-less-memory-conn-deduper branch from a8e0ccb to f6c26d4 Compare October 20, 2025 07:15
@vikin91
Copy link
Copy Markdown
Contributor Author

vikin91 commented Oct 21, 2025

/retest

@vikin91 vikin91 added the auto-retest PRs with this label will be automatically retested if prow checks fails label Oct 21, 2025
@vikin91 vikin91 merged commit 760b326 into master Oct 22, 2025
111 of 117 checks passed
@vikin91 vikin91 deleted the piotr/ROX-31012-less-memory-conn-deduper branch October 22, 2025 06:23
janisz added a commit that referenced this pull request Mar 4, 2026
Replace NumericEndpoint (56 bytes) with BinaryHash (uint64) as map keys,
following the pattern from PR #17041. This enables runtime.mapassign_fast64
for map operations instead of the slower generic mapassign.

Implementation:
- Add BinaryHash type and BinaryKey() method to NumericEndpoint
- Use xxhash for fast, non-cryptographic hashing
- Update all endpoint maps to use BinaryHash keys
- Hash endpoints once before map operations
- Update debug/logging to format hashes as hex

Benchmark results (Phase 1 → Phase 2):

Speed (sec/op):
- Geometric mean:        -51.66% (2x faster!)
- Large_Incremental:     7.764ms → 3.863ms  (-50.24%)
- Medium_Incremental:    1.454ms → 834µs    (-42.63%)
- ApplySingleNoLock:     -63% to -68%

Memory (B/op):
- Large workloads:       1.24MB → 932KB     (-24.63%)
- ApplyRepeated:         271KB → 196KB      (-27.74%)

Trade-off:
- Allocations:           +22% (hashing overhead)

Combined improvement (Baseline → Phase 2):

Speed:
- Geometric mean:        -51.66% (2x faster!)
- Medium_Incremental:    1.833ms → 834µs    (-54.47%)

Memory:
- Large workloads:       1.98MB → 932KB     (-53.02%)
- ApplyRepeated:         458KB → 196KB      (-57.31%)

Why this works:
- uint64 keys use mapassign_fast64 (single CMP instruction for comparison)
- NumericEndpoint keys use generic mapassign (56 bytes to hash and compare)
- Reduced map storage overhead (8 bytes vs 56 bytes per key)
- Better CPU cache utilization

Trade-offs:
- Public IP tracking optimization disabled (can't extract IP from hash)
- Hash collision probability: 0.027% for 100M endpoints (negligible for K8s scale)
- Debug output shows hashes instead of endpoint details

Related to ROX-33201

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-review area/sensor auto-retest PRs with this label will be automatically retested if prow checks fails

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants