ROX-30570: add repository-to-CPE mapping API to indexer#18703
ROX-30570: add repository-to-CPE mapping API to indexer#18703BradLugo wants to merge 1 commit intodc/sbom-ingest-central-apifrom
Conversation
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The new httputil.Updater and repository-to-CPE mapping code use slog while the rest of the scanner codebase appears to standardize on zlog; consider switching to zlog for consistency and to preserve existing logging context behavior.
- The updater interval is hardcoded as a package-level variable (24h) in scanner/internal/httputil; consider making this configurable (e.g., via constructor parameter or config) so you can tune refresh frequency without code changes and better support different environments.
- In RepositoryToCPEUpdater.NewUpdater you set a default URL when the passed URL is empty, but indexer.NewIndexer never calls NewUpdater with an empty URL (it returns early instead), so this fallback is currently dead code; either move the defaulting logic up into NewIndexer or remove it here to avoid confusion.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new httputil.Updater and repository-to-CPE mapping code use slog while the rest of the scanner codebase appears to standardize on zlog; consider switching to zlog for consistency and to preserve existing logging context behavior.
- The updater interval is hardcoded as a package-level variable (24h) in scanner/internal/httputil; consider making this configurable (e.g., via constructor parameter or config) so you can tune refresh frequency without code changes and better support different environments.
- In RepositoryToCPEUpdater.NewUpdater you set a default URL when the passed URL is empty, but indexer.NewIndexer never calls NewUpdater with an empty URL (it returns early instead), so this fallback is currently dead code; either move the defaulting logic up into NewIndexer or remove it here to avoid confusion.
## Individual Comments
### Comment 1
<location> `scanner/internal/httputil/updater.go:22` </location>
<code_context>
+
+// Updater returns a value that's periodically updated.
+// Largely based on https://github.com/quay/claircore/blob/v1.5.48/rhel/internal/common/updater.go.
+type Updater struct {
+ url string
+ typ reflect.Type
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring Updater to use generics (or a factory function) instead of reflection and any to make the type handling clearer and safer.
The main complexity comes from the type-erasure via `any` + `reflect.Type` + `reflect.New`. You can keep the behavior (periodic HTTP JSON fetching with rate limiting and atomic swaps) while removing reflection and `any` by making `Updater` generic.
This reduces cognitive load and keeps it reusable if more types appear later.
### Suggested refactor using generics
```go
type Updater[T any] struct {
url string
value atomic.Pointer[T]
reqRate *rate.Limiter
mu sync.RWMutex // protects lastModified
lastModified string
}
func NewUpdater[T any](url string, init *T) *Updater[T] {
u := &Updater[T]{
url: url,
reqRate: rate.NewLimiter(interval, 1),
}
u.value.Store(init)
return u
}
// Get returns a pointer to the current copy of the value.
func (u *Updater[T]) Get(ctx context.Context, c *http.Client) (*T, error) {
var err error
if u.url != "" && u.reqRate.Allow() {
slog.DebugContext(ctx, "got unlucky, updating mapping file")
if err = u.Fetch(ctx, c); err != nil {
slog.ErrorContext(ctx, "error updating mapping file", "reason", err)
}
}
return u.value.Load(), err
}
func (u *Updater[T]) Fetch(ctx context.Context, c *http.Client) error {
log := slog.With("url", u.url)
log.DebugContext(ctx, "attempting fetch of mapping file")
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u.url, nil)
if err != nil {
return err
}
u.mu.RLock()
if u.lastModified != "" {
req.Header.Set("if-modified-since", u.lastModified)
}
u.mu.RUnlock()
resp, err := c.Do(req)
if err != nil {
return err
}
defer resp.Body.Close()
switch resp.StatusCode {
case http.StatusOK:
case http.StatusNotModified:
log.DebugContext(ctx, "response not modified; no update necessary", "since", u.lastModified)
return nil
default:
return fmt.Errorf("received status code %d querying mapping url", resp.StatusCode)
}
var v T
if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
return fmt.Errorf("failed to decode mapping file: %w", err)
}
u.mu.Lock()
u.lastModified = resp.Header.Get("last-modified")
u.mu.Unlock()
u.value.Store(&v)
log.DebugContext(ctx, "atomic update of local mapping file complete")
return nil
}
```
Usage for your mapping type becomes type-safe and clearer:
```go
// Assuming type MappingFile struct { ... }
updater := NewUpdater[MappingFile](url, &MappingFile{})
m, err := updater.Get(ctx, httpClient)
// m is *MappingFile, no type assertions/reflect needed
```
This preserves:
- periodic fetch behavior
- `If-Modified-Since` caching
- `rate.Limiter`
- atomic swaps of the current value
while removing:
- `reflect.Type`
- `reflect.New`
- `any` + type-erasure
If generics are not an option in this codebase, a second-best simplification is to replace `reflect.Type` / `reflect.New` with a factory function:
```go
type Updater struct {
url string
newValue func() any
value atomic.Value
// ...
}
func NewUpdater(url string, newValue func() any, init any) *Updater {
u := &Updater{
url: url,
newValue: newValue,
reqRate: rate.NewLimiter(interval, 1),
}
u.value.Store(init)
return u
}
// in Fetch:
v := u.newValue()
if err := json.NewDecoder(resp.Body).Decode(v); err != nil { ... }
u.value.Store(v)
```
This keeps the abstraction but avoids `reflect.Type` and `reflect.New`, making it easier to follow.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
|
||
| // Updater returns a value that's periodically updated. | ||
| // Largely based on https://github.com/quay/claircore/blob/v1.5.48/rhel/internal/common/updater.go. | ||
| type Updater struct { |
There was a problem hiding this comment.
issue (complexity): Consider refactoring Updater to use generics (or a factory function) instead of reflection and any to make the type handling clearer and safer.
The main complexity comes from the type-erasure via any + reflect.Type + reflect.New. You can keep the behavior (periodic HTTP JSON fetching with rate limiting and atomic swaps) while removing reflection and any by making Updater generic.
This reduces cognitive load and keeps it reusable if more types appear later.
Suggested refactor using generics
type Updater[T any] struct {
url string
value atomic.Pointer[T]
reqRate *rate.Limiter
mu sync.RWMutex // protects lastModified
lastModified string
}
func NewUpdater[T any](url string, init *T) *Updater[T] {
u := &Updater[T]{
url: url,
reqRate: rate.NewLimiter(interval, 1),
}
u.value.Store(init)
return u
}
// Get returns a pointer to the current copy of the value.
func (u *Updater[T]) Get(ctx context.Context, c *http.Client) (*T, error) {
var err error
if u.url != "" && u.reqRate.Allow() {
slog.DebugContext(ctx, "got unlucky, updating mapping file")
if err = u.Fetch(ctx, c); err != nil {
slog.ErrorContext(ctx, "error updating mapping file", "reason", err)
}
}
return u.value.Load(), err
}
func (u *Updater[T]) Fetch(ctx context.Context, c *http.Client) error {
log := slog.With("url", u.url)
log.DebugContext(ctx, "attempting fetch of mapping file")
req, err := http.NewRequestWithContext(ctx, http.MethodGet, u.url, nil)
if err != nil {
return err
}
u.mu.RLock()
if u.lastModified != "" {
req.Header.Set("if-modified-since", u.lastModified)
}
u.mu.RUnlock()
resp, err := c.Do(req)
if err != nil {
return err
}
defer resp.Body.Close()
switch resp.StatusCode {
case http.StatusOK:
case http.StatusNotModified:
log.DebugContext(ctx, "response not modified; no update necessary", "since", u.lastModified)
return nil
default:
return fmt.Errorf("received status code %d querying mapping url", resp.StatusCode)
}
var v T
if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
return fmt.Errorf("failed to decode mapping file: %w", err)
}
u.mu.Lock()
u.lastModified = resp.Header.Get("last-modified")
u.mu.Unlock()
u.value.Store(&v)
log.DebugContext(ctx, "atomic update of local mapping file complete")
return nil
}Usage for your mapping type becomes type-safe and clearer:
// Assuming type MappingFile struct { ... }
updater := NewUpdater[MappingFile](url, &MappingFile{})
m, err := updater.Get(ctx, httpClient)
// m is *MappingFile, no type assertions/reflect neededThis preserves:
- periodic fetch behavior
If-Modified-Sincecachingrate.Limiter- atomic swaps of the current value
while removing:
reflect.Typereflect.Newany+ type-erasure
If generics are not an option in this codebase, a second-best simplification is to replace reflect.Type / reflect.New with a factory function:
type Updater struct {
url string
newValue func() any
value atomic.Value
// ...
}
func NewUpdater(url string, newValue func() any, init any) *Updater {
u := &Updater{
url: url,
newValue: newValue,
reqRate: rate.NewLimiter(interval, 1),
}
u.value.Store(init)
return u
}
// in Fetch:
v := u.newValue()
if err := json.NewDecoder(resp.Body).Decode(v); err != nil { ... }
u.value.Store(v)This keeps the abstraction but avoids reflect.Type and reflect.New, making it easier to follow.
ef9ea26 to
b202bc4
Compare
Adds a new RPC, GetRepositoryToCPEMapping, to the Indexer service that returns the Red Hat repository-to-CPE mapping used for RHEL package vulnerability matching. This mapping will be needed by the matcher's ScanSBOM API to enrich RHEL packages with CPE information during SBOM vulnerability scanning.
b202bc4 to
ae6392a
Compare
|
Images are ready for the commit at ae6392a. To use with deploy scripts, first |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## dc/sbom-ingest-central-api #18703 +/- ##
==============================================================
- Coverage 49.20% 49.16% -0.04%
==============================================================
Files 2659 2661 +2
Lines 199858 200017 +159
==============================================================
- Hits 98344 98343 -1
- Misses 94082 94241 +159
- Partials 7432 7433 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Closing in favor of #18705 (changing branch name to correct issue number) |
Description
Adds a new RPC, GetRepositoryToCPEMapping, to the Indexer service that returns the Red Hat repository-to-CPE mapping used for RHEL package vulnerability matching. This mapping will be needed by the matcher's ScanSBOM API to enrich RHEL packages with CPE information during SBOM vulnerability scanning.
Alternatives
Stacked on #18484.
User-facing documentation
Testing and quality
Automated testing
No automated testing. Will follow up in ROX-27690.
How I validated my change
ROX_SBOM_SCANNINGfeature flag: