-
Notifications
You must be signed in to change notification settings - Fork 126
Extract more dependency ASTs #170
Conversation
|
Many thanks for your work on this; glad to see it finally made it into a PR! Before I start reviewing this, can you quickly outline how one enables build tracing? |
|
I've enabled the changes required to make it work globally and not added a flag, since I thought it might be superior in most cases:
|
As discussed previously, I would strongly prefer build tracing to be opt-in, since it is a new feature. Also, doesn't it currently only work for module-based code? |
Build tracing itself is opt-in, because the autobuilder is run by default, right? The behavior for extraction via autobuilder hasn't changed too much, the only change is some more dependencies may be extracted, but I think they should all be part of the project. I wouldn't mind adding a flag to make this behavior opt-in, though.
It falls back on package directory, which is basically what the old code did except in the case that there's a package nested under a package passed to the extractor explicitly. Actually, before review I'm going to run a dist-compare with a query that selects files; does the below query seem sensible? import go
from PackageName n
select n.getFile()I'll mark this as a draft in the interim. |
|
I'm afraid you have lost me. Perhaps I should just look at the code before asking any more questions. |
max-schaefer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few initial questions; I'll need a few more passes to digest this.
|
As discussed offline, I would favour breaking this PR up into two parts, one being the general improvements to package handling, and one being the build-tracing specific bits. |
|
I've subverted this PR for extracting more dependencies because most of the comments were related to it. |
|
Comments addressed. |
max-schaefer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments and questions.
I think it could be made a little clearer in the change note and perhaps the code that we now extract the (AST for the) entire module, even if the extractor was explicitly invoked on only one package inside that module.
2220897 to
b081d13
Compare
max-schaefer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I don't have a good intuitive feeling for how this will affect results, so I would suggest we
- do a dist-compare with
--no-cleanupand sanity check the results; - run an ad-hoc query on the databases produced by the dist-compare to get the list of files and compare those before and after the change;
- run a Go-Differences job to validate extractor performance.
Will do. Should I just run a few of the standard queries and a few checks that AST has actually been extracted?
This should now theoretically be in progress, though I've run it wrong about eight times now...
I'm not entirely sure how to do this; do I fill out the |
|
Also, the test failures seem to me to be Actions failures, so I've rerun them (the tests pass on my local machine). |
If the changes aren't too voluminous, a standard dist-compare should do the trick, I think.
Both. |
|
How is the evaluation of this PR coming along? |
|
Ah, I'd forgotten to go through and post this. Dist-compare now shows no results changes and a rather large slowdown in cadence evaluation times, which lines up with this report that says that 22k files were added to the cadence snapshot. The last Go-Differences job I ran seems to have failed, so I've started a new one. |
That job's failed because it apparently tries to checkout the sub-sha in the |
|
Yes, you cannot use a sub-sha with the Go-Differences job, you have to make a branch in the main repo that points the submodule to the right sha. |
We now extract packages that have the same module root as the specified packages, as determined by the `go list` command.
As suggested in code review Co-authored-by: Max Schaefer <54907921+max-schaefer@users.noreply.github.com>
smowton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, one or two of which appeared to already be done in a later commit.
Two high-level suggestions:
- Is it normal in this project/org to keep "review comments" commits or to squash them?
- Do we get feedback on the command-line indicating what it decided to extract and what it didn't?
extractor/extractor.go
Outdated
| dirList := strings.Split(relDir, string(filepath.Separator)) | ||
| if len(dirList) == 0 || dirList[0] != ".." { | ||
| // if dirList is empty, root is the same as the source dir | ||
| // if dirList starts with `".."`, it is not inside the root dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we care about paths that escape the rootdir like x/../..?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you satisfied with how this was addressed later on?
The latter; see the newly added internal Wiki article on the subject.
The extractor prints out the path of each file it extracts. (This is suppressed by |
|
@sauyon, any news on the evaluation? |
|
Sorry, I'd forgotten to review the job. Go-Differences seems to suggest no time differences. |
|
I was looking at the analysis time numbers, whoops. I would have expected cockroach to take more time as it now contains significantly more (generated) files. I'll take a look at concourse too, but I suspect that's a similar story. |
|
I'm not able to reproduce the difference in |
|
OK, thanks for checking. In that case I'll merge. |
A dist-compare shows 60 new results on kubernetes due to the new way that files are picked up.
The results that I looked at look like false positives, but are reasonable, so I'm going to open issues for them and not worry about them for this PR.
Also, evaluation has slowed significantly for both
cadenceandkubernetes.Matt established that the OS X issues I've been having are an issue with the tracer, and he's working on that now.