-
Notifications
You must be signed in to change notification settings - Fork 429
Add support for LGTM_INDEX_FILTERS #103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I made a test repo for this change at https://github.com/dsp-testing/robertbrignull-filter-test and you can see the changes that switches to this branch makes at https://github.com/dsp-testing/robertbrignull-filter-test/pull/1/checks?check_run_id=845738229 |
|
@aibaars, would you be able to give this a review? Can also ask someone else if you'd rather not. |
src/analysis-paths.ts
Outdated
| // It does not understand double-globs because that would require it to | ||
| // traverse the entire file tree to determine which files are matched. | ||
| if (config.paths.length !== 0) { | ||
| core.exportVariable('LGTM_INDEX_INCLUDE', config.paths.join('\n')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The LGTM_INDEX_INCLUDE and LGTM_INDEX_EXCLUDE properties don't support any form of wildcards. I think you should only put proper paths into these fields. The * symbol is not allowed in paths on Windows, so you might even get an exception.
I just remembered that there is some good documentation about these settings in the javascript autobuilder: https://github.com/github/codeql/blob/bfb734e1d74df1ae810feb7198cb271b767c4b9a/javascript/extractor/src/com/semmle/js/extractor/AutoBuild.java#L80
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Makes sense to me. I'll filter those out of here.
I'd also forgotten that the JS extractor is public so thanks for that link.
src/config-utils.ts
Outdated
|
|
||
| // Checks that a paths of paths-ignore entry is valid, possibly modifying it | ||
| // to make it valid, or if not possible then throws an error. | ||
| export function validateAndSanitisePath(originalPath: string, propertyName: string, configFile: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also check for the special characters mentioned in https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#filter-pattern-cheat-sheet that are not supported by the underlying project layout implementation. Just report an error telling that these are not allowed, to avoid users getting confused why their perfectly sensible pattern string does not match any files.
src/analysis-paths.ts
Outdated
|
|
||
| // Builds an environment variable suitable for LGTM_INDEX_INCLUDE or LGTM_INDEX_EXCLUDE | ||
| function buildIncludeExcludeEnvVar(paths: string[]): string { | ||
| return paths.filter(p => p.indexOf('**') === -1).join('\n'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should filter out paths containing any kind of special matching character mentioned in https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#filter-pattern-cheat-sheet .
For windows I'd also paths containing any the forbidden characters mentioned in https://stackoverflow.com/a/31976060 . Otherwise things will crash at some point with a less friendly error message.
|
This looks fine to me. Two more suggestions to add more checks to improve or avoid error messages. |
src/config-utils.ts
Outdated
| // Check for any characters that are illegal in path names | ||
| if (process.platform === 'win32' && path.match(illegalWindowsCharactersRegex)) { | ||
| core.warning('"' + path + '" contains an invalid character and will be ignored. ' + | ||
| 'The characteres <, >, :, ", !, |, ?, * are forbidden to use in paths on windows.'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
characteres -> characters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are no longer filtering *, so best to adjust the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this to analysis-paths.ts and kept it as not containing * because that is checked for separately on all platforms.
src/config-utils.ts
Outdated
| } | ||
|
|
||
| // Check for any characters that are illegal in path names | ||
| if (process.platform === 'win32' && path.match(illegalWindowsCharactersRegex)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant for this check to be performed when adding a path to the exclude/include env vars. The paths in those env variables will be interpreted as path strings by the autobuilders and should therefore only contain valid path characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, it should be fine here too as you no longer fail on *.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved this check to when we construct the env vars. After some thought I agree it should be here and the right thing to do is to ignore it. Since the path can't legally exist anyway there should be no effect from not adding it to the include/exclude env vars.
I'd also prefer to not output any kind of error/warning about this because as explained above it shouldn't affect anything as the path legally can't exist anyway.
|
|
||
| // Characters that are supported by filters in workflows, but not by us. | ||
| // See https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#filter-pattern-cheat-sheet | ||
| const filterPatternCharactersRegex = /.*[\?\+\[\]!].*/; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might also want to exclude \ . I think excluding windows-style path separators would be helpful to ensure patterns mean the right thing cross platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to not do any special handling for \ for now. We can add that in later. This PR is getting complicated enough as it is and I don't understand exactly what you're proposing. I recommend once this PR is in it would be easiest for you to make the change you're suggesting for \/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest not allowing \ symbols at all. I don't know what the project layouts do with windows path separators and a windows user can just write /. I think it is better to be restrictive in what we accept now, and perhaps make things more liberal later if there is any use (which I doubt).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the point that windows users can use \ instead, and it's nice to be restrictive in the start. I do note that linux paths can legitimately contain \ characters (although it is rather unlikely) so we would be actively preventing people from using filters with files with those characters.
Ok, I'll just ban them. I don't anticipate many problems if any from file legitimately containing this character, so let's hope it's fine. As you say we can make it less restrictive in the future and that's easier than going the other way.
src/analysis-paths.ts
Outdated
|
|
||
| // Builds an environment variable suitable for LGTM_INDEX_INCLUDE or LGTM_INDEX_EXCLUDE | ||
| function buildIncludeExcludeEnvVar(paths: string[]): string { | ||
| return paths.filter(p => p.indexOf('**') === -1).join('\n'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return paths.filter(p => p.indexOf('**') === -1).join('\n'); | |
| return paths.filter(p => p.indexOf('*') === -1).join('\n'); |
Paths with either * or ** make no sense in the include/exclude env variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. You're right and this now checks for any instances of *.
9a357fd to
70980b9
Compare
|
Ok, I think this is finally ready. I've addressed all the comments from yesterday, I've just had another look through, and my tests over at https://github.com/dsp-testing/robertbrignull-filter-test are working as expected. @aibaars, do you want to give this another sanity check? Otherwise I will merge today in a little bit. |
| while (path.charAt(0) === '/') { | ||
| path = path.substring(1); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose empty paths "" are fine and would be interpreted as ".", but might be good to check this and add a test case for both JS and Python analysis runs with such a path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested manually and including / seems to be fine.
|
Looks good to me. |
|
I'm going to add some more filters to the integration. Can't believe I forgot to do that until now. Then I'll merge this. |
d5d447f to
64ca715
Compare
|
Seems while javascript is fine, the python autobuilder code borks if you include certain characters like |
64ca715 to
a1ba2b1
Compare
|
Also seems the python case throws an error if the INCLUDE env var includes paths to files, because it only accepts directories. |
a1ba2b1 to
189b6ef
Compare
|
Turns out adding Regardless, this PR is going to be merged as I've done enough testing now manually. In the future this situation needs to be sorted out to bring the two languages more in line because currently the behaviour is unacceptably different. |
|
I don't believe this PR will make any configs that were previous valid to be invalid, so from that point of view this is a safe change. There is a reasonable chance that we will find bugs or otherwise want to make changes to this area, but we should be able to response and address those problems when we find them. |
This is only possible solution to double-globs in path filters. We make use of the
LGTM_INDEX_FILTERSenvironment variable, as well as the two others that we already use, becauseLGTM_INDEX_FILTERSdoes support**.Merge / deployment checklist