-
Notifications
You must be signed in to change notification settings - Fork 429
Introduce TRAP caching #1172
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce TRAP caching #1172
Conversation
df5d008 to
8f867dc
Compare
henrymercer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tell the extractor to write up to 1GB to the cache (tbc: is this a sensible amount for size of the Actions cache and the disk space of the default runners?).
1 GB per language sounds reasonable to me as a default. I can think of cases where we might want to use more or less space however — perhaps we should make the TRAP cache size configurable? Question: Should the total size or the size per language be configurable?
We should document this new feature, and the option to disable it, before we turn this on for any external users. But for now we're only going to turn on the feature flag for some internal repositories, so I think it is fine to not have it in the public facing documentation just yet.
Sounds reasonable.
Before we enabled this on more than one or two internal repos (which we can monitor by hand), I would also like to implement some telemetry so we can have some visibility over whether this is causing any issues. Again, I propose leaving this for a later PR.
Also sounds reasonable 👍
Indeed, we may eventually want configuration options to fine-tune this. I don't think adding this should block getting this prototype out of the door, though. In the telemetry I will be adding, I will include a field that records the size of the cache so we can use that to get a feel for what the default should be and how configurable we need this to be before we properly declare this GA. |
|
are there docs explaining what this is or why one would want to turn it on or off? I didn't see it mentioned in the docs linked from the README and this is chewing up our entire GHA cache since it seems to put many MB per commit |
|
@asottile-sentry I'm sorry to hear this feature has caused trouble for you. The changelog note here has some explanation of this feature and how to turn it off. The intended purpose is to speed up the "extraction" phase of CodeQL analysis (which processes your source code into a local database format so that it can be analysed), however this is completely optional and you are safe to turn it off in the workflow since it's using up your Actions cache quota. We have only rolled this out in a limited fashion for one language at the moment, so it is not in the public docs beyond the changelog. To help us understand and investigate the problem better, would you able to point us to your repo, and let us know if you're using the Actions cache for any other purposes? |
|
https://github.com/getsentry/sentry/actions/caches -- yes we use the cache for other things (I suspect this to be the common case -- so perhaps enabling this by default in codeql is the wrong default?) |
|
In case the additional context helps, trap caching is also causing my organization's monorepo to exhaust our cache very rapidly. I had to turn off the feature, even though I want to use it. We use the GHA cache for other purposes as well and can't afford to let this churn our cache. I've opened another issue to hopefully move the cache storage location: #2030 |
The `cpp` CodeQL job is adding a cache entry for each run on the master branch. One for Linux, another for Windows. Size: 68MB + 180MB = 248MB. In one week we got 50+ such entries, almost filling the available cache space. The cache entries don't seem to be used again. Following the recommendation in an open issue thread, this patch tries to disable this cache. Since it only affects master, the effect can only be verified after merging. Bug: curl#18528 (comment) Ref: github/codeql-action#1172 Ref: github/codeql-action#2030 Ref: github/codeql-action#2885 (comment)
The `cpp` CodeQL job is adding a cache entry for each run on the master branch. One for Linux, another for Windows. Size: 68MB + 180MB = 248MB. In one week we got 50+ such entries, almost filling the available cache space. Following the recommendation in an open issue thread, this patch tries to disable this cache. Since it only affects master, the effect can only be verified after merging. The latest cache is picked up in PRs. The performance impact is also to be seen after merge. Bug: #18528 (comment) Ref: github/codeql-action#1172 Ref: github/codeql-action#2030 Ref: github/codeql-action#2885 (comment) Follow-up to cc50f05 #18528 Closes #18613
This PR introduces everything that needs to happen on the Action's side for TRAP caching. In particular:
codeql resolve languages --format=betterjsonto see the extractor options.initstep. Then we use that cache in read-only mode during extraction. (sample run with some now-removed debugging output)All of this is gated behind a feature flag, although the
trap-cachingparameter to theinitstep can be used to force this to be on or off irrespective of the feature flag. I expect a few users will want to disable this if they're already using their Actions cache for something else so we're not competing for space.We should document this new feature, and the option to disable it, before we turn this on for any external users. But for now we're only going to turn on the feature flag for some internal repositories, so I think it is fine to not have it in the public facing documentation just yet.
Before we enabled this on more than one or two internal repos (which we can monitor by hand), I would also like to implement some telemetry so we can have some visibility over whether this is causing any issues. Again, I propose leaving this for a later PR.
Merge / deployment checklist