🐛 incremental improvement in catalogmetadata cache/client performance#965
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
822f614 to
cbac64e
Compare
| // if the catalog has not been successfully unpacked, report an error. This ensures that our | ||
| // reconciles are deterministic and wait for all desired catalogs to be ready. | ||
| if !meta.IsStatusConditionPresentAndEqual(catalog.Status.Conditions, catalogd.TypeUnpacked, metav1.ConditionTrue) { | ||
| errs = append(errs, fmt.Errorf("catalog %q is not unpacked", catalog.Name)) |
There was a problem hiding this comment.
This is a change in semantics. If we ignore catalogs that are not unpacked, users could run into issues if the same package exists in two different catalogs and they are expecting a specific result.
The new semantics are similar to the behavior in OLMv0, and there have been some complaints about it. For example, "I want to install CNV, I don't care if the community catalog is having issues". In order to solve this problem, I think we need to have a knob on the ClusterExtension API to allow users to select which ClusterCatalogs to resolve from (likely using a label selector)
There was a problem hiding this comment.
Do we have this captured as an issue somewhere or do you anticipate this falling under the umbrella of #740?
There was a problem hiding this comment.
I am going to make a separate issue for this. Just wanted to get consensus on this approach before I did.
| ) | ||
|
|
||
| func TestClient(t *testing.T) { | ||
| t.Run("Bundles", func(t *testing.T) { |
There was a problem hiding this comment.
I removed this outer/unnecessary t.Run wrapper (mainly because it caused my IDE to not be able to find the individual test cases to run/debug separately).
In order to review changes in this test, I advise ignoring whitespace.
| ) | ||
|
|
||
| func TestCache(t *testing.T) { | ||
| t.Run("FetchCatalogContents", func(t *testing.T) { |
There was a problem hiding this comment.
Similar to my other comment, I removed this unnecessary t.Run wrapper. When reviewing, it's much easier to see actual changes if you ignore whitespace.
There was a problem hiding this comment.
Note that I moved this into internal. I noticed go-apidiff complain that I changed the Bundles() function signature realized we shouldn't have a test utility in our public API, especially since it returns a slice of objects that are also in an internal package.
| tmpDir, err := os.MkdirTemp(fsc.cachePath, fmt.Sprintf(".%s-", catalog.Name)) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("error creating cache file for Catalog %q: %s", catalog.Name, err) | ||
| return nil, fmt.Errorf("error creating temporary directory to unpack catalog metadata: %s", err) |
There was a problem hiding this comment.
| return nil, fmt.Errorf("error creating temporary directory to unpack catalog metadata: %s", err) | |
| return nil, fmt.Errorf("error creating temporary directory to unpack catalog metadata: %v", err) |
cbac64e to
0661a99
Compare
| UNIT_TEST_DIRS := $(shell go list ./... | grep -v /test/) | ||
| test-unit: $(SETUP_ENVTEST) #HELP Run the unit tests | ||
| eval $$($(SETUP_ENVTEST) use -p env $(ENVTEST_VERSION) $(SETUP_ENVTEST_BIN_DIR_OVERRIDE)) && go test -count=1 -short $(UNIT_TEST_DIRS) -coverprofile cover.out | ||
| eval $$($(SETUP_ENVTEST) use -p env $(ENVTEST_VERSION) $(SETUP_ENVTEST_BIN_DIR_OVERRIDE)) && CGO_ENABLED=1 go test -count=1 -race -short $(UNIT_TEST_DIRS) -coverprofile cover.out |
There was a problem hiding this comment.
CGO is required for the race detector. TIL
There was a problem hiding this comment.
yeah - got caught on that one rejiggering the v0 Makefile
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #965 +/- ##
==========================================
+ Coverage 79.76% 79.86% +0.09%
==========================================
Files 16 17 +1
Lines 1107 1157 +50
==========================================
+ Hits 883 924 +41
- Misses 155 161 +6
- Partials 69 72 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
| if _, err = file.Seek(0, 0); err != nil { | ||
| return nil, fmt.Errorf("error resetting offset for cache file reader for Catalog %q: %s", catalog.Name, err) | ||
| if err := os.Rename(tmpDir, cacheDir); err != nil { | ||
| return nil, fmt.Errorf("error moving temporary directory to cache directory: %v", err) |
There was a problem hiding this comment.
curious to know what happens if fail here? Just run of the mill expbackoff? what happens to resolution during that period? Same thing?
| deprecationsMu sync.Mutex | ||
| ) | ||
|
|
||
| if err := declcfg.WalkMetasFS(ctx, packageFS, func(_ string, meta *declcfg.Meta, err error) error { |
There was a problem hiding this comment.
Is this function now guaranteed to run concurrently? Is there any guarantees on the same *declcfg.Meta object not being processed more than once?
There was a problem hiding this comment.
It runs concurrently by default (based on number of CPUs Go thinks are available). There is an exactly once guarantee for an individual *declcfg.Meta object.
There was a problem hiding this comment.
Ultimately, we're parsing n files at a time (where n is number of goroutines). You may notice that the cache writes metas into their own files. This is on purpose to make as many files as possible so that as many goroutines as possible (up to n, at least) have work to do.
| for _, bundle := range bundles { | ||
| slices.SortFunc(bundle.InChannels, func(a, b *catalogmetadata.Channel) int { return cmp.Compare(a.Name, b.Name) }) | ||
| } |
There was a problem hiding this comment.
Could you elaborate on the purpose of this?
There was a problem hiding this comment.
Because of the concurrency introduced in WalkMetasFS the order that channels are added processed is non-deterministic. That non-determinism manifests as non-deterministic ordering of bundle.InChannels. So this function gets us back to a deterministic order.
It doesn't really matter except that tests become more annoying to implement if the order isn't deterministic.
There was a problem hiding this comment.
Would you mind adding a comment just highlighting the why?
There was a problem hiding this comment.
3df60bb to
b86e629
Compare
Signed-off-by: Joe Lanford <joe.lanford@gmail.com>
b86e629 to
82303d7
Compare
Description
This partially resolves #914 by doing two things:
all.jsonresponse into separate directories organized by packageThis is still not completely resolved though because step (1) still happens synchronously during the reconciliation of a ClusterExtension. In order to finish the improvement (and to complete #948), we need to implement a controller that watches and reconciles ClusterCatalog changes so that it can create (and delete) caches for operator-controller.
Reviewer Checklist