-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Dynamic keyword metadata reader #3169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dynamic keyword metadata reader #3169
Conversation
…UsingModule to parser from symbol resolver
|
Hi @rjmholt, I'm your friendly neighborhood Microsoft Pull Request Bot (You can call me MSBOT). Thanks for your contribution!
TTYL, MSBOT; |
|
I have a rough idea of what needs fixing for the Unix support, but does AppVeyor use something other than |
|
project.json is it. You are likely missing a reference to System.Collections.Immutable. You can try to repro locally by clearing your nuget cache unless it was resolving that assembly to the gac. |
|
The System.Reflection.Metadata package already depends on System.Collections.Immutable. Should I add System.Collections.Immutable anyway? I also noticed it's not resolving on types from System.Reflection.Metadata in the error message -- I can only replicate the problem when building with |
|
Ah, yeah, that's the failure in AppVeyor - the target is |
|
@bergmeister @rjmholt was an intern on my team at the time this PR was submitted and went back to finish school shortly after. He's actually joining Microsoft as a full time employee on my team the first week of March. @rjmholt should work with @daxian-dbw to determine if we should continue with this PR as-is. |
|
@SteveL-MSFT Thanks for the update. Assuming that this gets fixed, would this also mean that if you published then new PowerShell referencing assemblies that tools such as PSSA using the AST parser could receive this fix to be able to parse types that are imported by |
|
@bergmeister PSSA still needs to support older version of PowerShell though? |
|
@SteveL-MSFT Yes, but if I understand correctly, the referencing assemblies (containing the parser) are compiled into PSSA. Therefore my hope that updated reference assemblies would solve it. WDYT? |
|
@bergmeister I see, if it's only the reference assemblies and not the runtime assemblies (don't know enough about the internals of PSSA), it may work. Since this work is not committed, my recommendation is to not build a dependency on it. |
|
@bergmeister - the reference assemblies contain no code - they are like a header file in C/C++ or an interface definition. It does make sense to produce nuget packages for some of PowerShell like the parser, but it's a ton of work to factor things in a reasonable way to be useful. |
|
I've been meaning to close this PR. I'll revisit it later with @daxian-dbw , but the code is stale now and if we proceed with it, it's probably worth rewriting a fair amount of it. There seems to be some interest in the feature though (#6194, #4016), so I would like to spend some more time on it. @bergmeister this PR is intended more to extend the |
|
Ah, the other thing to note is that having tests for this feature is very painful so long as we require unloading the assemblies that describe the |
|
@rjmholt I think a workaround to loaded assemblies is to start new pwsh processes |
|
It seems AppDomains come back in .Net Core 2.1? |
|
I have the changes in this PR rebased and moved to my fork at https://github.com/daxian-dbw/PowerShell/tree/dsl. It was rebased a while ago, but at least it was rebased to the point after we moved to msbuild. It won't be too much trouble to rebase it again. I have a separate branch |
|
@lzybkr It turns out, PSScriptAnalyzer is not using the reference assemblies of PowerShell but has a self-contained copy of |
|
I've discussed this PR with @daxian-dbw, and it seems simpler to approach DSL support without metadata-parsing for now (e.g. using PowerShell classes as schemas). Implementation is not a priority currently though. @bergmeister: I know there's a use-case for PSSA, but the implementation here was designed only to read DSL specifications and error on anything else. Possibly the parse-time assembly load meant that PSSA could discover types in |
|
@rjmholt Ok, I'll have to accept for now that this feature will not be implemented in PowerShell's parser. But what does the PowerShell team suggest to do for PSSA? PSSA depends on PowerShell's parser (it does not use the one being installed but has it's own version of it as part of a self-contained PSCore version of System.Management.Automation. Should PSSA then maintain its own customized fork of PowerShell in order to build and consume a custom parser? Catching the |
|
@bergmeister We might be speaking at cross-purposes a little here -- essentially the fact that this prototype feature loaded modules at parse-time was something we don't want, because we don't want arbitrary code execution when we're just parsing PowerShell. It's one of the several reasons this PR was rejected. In fact, for PSSA to avoid executing malicious code by just parsing PowerShell, it will need to avoid module loading too. I'm not familiar at all with PSSA's codebase though, so I didn't quite understand where you talked about a In terms of the parser, I think the System.Management.Automation.dll parser is definitely the one to use (even though it won't and shouldn't ever load a module in at parse time). But, the way I imagine it, PSSA could use the parser to get an AST as normal, and then extract any module-loading statements from that (maybe in |
|
@rjmholt Sorry for coming back to you later, I forgot about it. I understand and respect the decision that the PowerShell parser won't get those module loading features. In short PSSA is relying on the parser to analyse ASTs and calls |
|
Hmmm, do you know what error the parser throws when a type is used that isn't loaded. Philosophically, the parser shouldn't be looking up types at all. You should be able to use a string that is totally meaningless as long as it is syntactically representative of a type ( |
|
@rjmholt An example is using namespace Microsoft.Azure.Commands.ResourceManager.Cmdlets.SdkModels
using namespace Microsoft.Azure.Commands.Common.Authentication.Abstractions
Import-Module "AzureRm"
class MyClass { [IStorageContext]$StorageContext }We get the parse error |
|
Immediately after writing that last comment I went and tracked down the issue the code above comes from, because I remembered it being the problematic code in question. Which then brought me to this issue, and then this issue which really documents what's going on. And then I spent a fair amount of time thinking about this. For whatever reason I didn't understand that it was PowerShell that was producing the error here, just because I didn't think it would be trying to resolve types at parse time. I've heard before that it's to do with being able to integrate with .NET types (although I'm not sure why But to summarise, here are some dot points (most of which I imagine are already well known to @bergmeister , but I document here anyway):
The idea that a PowerShell script's AST/parse could be valid or invalid based on the environment its parsed in doesn't really sit well with me, but I'm sure there were good reasons for going that way. It looks like maybe we should implement the metadata reading, but the performance cost could be considerable, especially for complex webs of inter-referenced assemblies. Anyway, PSSA is correctly reporting that PowerShell won't run the script under normal circumstances. And in this case, I think you still get a fully-formed AST (you just also get an error). You can try this: $file = "$PWD\ex.ps1"
$problem = @'
using namespace Microsoft.Azure.Commands.ResourceManager.Cmdlets.SdkModels
using namespace Microsoft.Azure.Commands.Common.Authentication.Abstractions
Import-Module "AzureRm"
class MyClass { [IStorageContext]$StorageContext }
'@
Set-Content -Path $file -Value $problem
$tokens = @()
$errors = @()
$ast = [System.Management.Automation.Language.Parser]::ParseFile($file, [ref]$tokens, [ref]$errors)
$ast.EndBlock.Statements[1].Members[0].PropertyTypeA complete, well-formed AST is still given back, it's just that the error array is not empty. |
|
@rjmholt Thank you very much for putting so much thought into it. I did not know that the Parser can still continue and return an AST despite parse errors. Because PSSA throws on any parse errors, I made the wrong assumption that parsing cannot continue at all after a parse error. This will help PSSA very much because it can then simply ignore/remove |
|
Well the parser is designed (to my knowledge) not to throw at all, since it's trying to be helpful about errors. So it just collects errors and keeps parsing. If the parser actually throws, it's because something very bad has happened that it can't recover from. But otherwise it will try and make the most sense that it can and keep parsing. Even in cases where there's some malformed syntax, you will still get an AST, but it will contain error ASTs where bad script was, up until the point where the parser can understand what's going on again. |
|
I can add that the parser even tries to perform some recovery to continue the correct parsing after an error. If the parser does it wrong then this is a reason to open a issue. |
This PR adds a new entry point for defining
DynamicKeywords to PowerShell. Where currently these can be added at runtime in PowerShell, this now uses CIL metadata in an assembly as a specification of the syntax requirements, and performs some validation of the specification.I've also added the concept of nested keywords -- there are keywords that will only be parsed and invoked as keywords when they are in the scope of another keyword. This seemed like a common feature in Domain Specific Languages we've been looking at.
I still need to do some work on this PR, but I want to get the ball rolling because it's a substantial addition.
Work to still to do includes:
Types.ps1xml-like test caseI've been doing work on adding
DynamicKeywordruntime execution using C# code (so a domain-specific language for PowerShell can be defined entirely in a C#/dotnet module/assembly -- see here for a very rough prototype), and this PR is essentially the first step toward that. I intend to submit an issue describing the work and a roadmap soon.Finally, in this PR, there are some points I think that need special scrutiny:
UseModeconcept (restricting how many times a keyword may appear in a block) -- it may be unneccessary or unwanted, and is where the metadata reader starts to verge on too opinionated. I included it because it was part of the RFC I submitted.