Skip to content

Conversation

@JamesWTruher
Copy link
Collaborator

PR Summary

This PR attempts to manage the scenario where multiple quotes and strings with spaces are sent to the Native Command Processor. The current behavior of our native executable has the following results:

PS> testexe -echoargs 'one "two three" four' 
Arg 0 is <one two>
Arg 1 is <three four>

which is somewhat surprising. This PR changes the behavior to:

PS> testexe -echoargs 'one "two three" four'                                                                                   
Arg 0 is <one "two three" four>

somewhat more pathological is the following:

testexe -echoargs ' "this is" "a test" '  
Arg 0 is < this>
Arg 1 is <is a>
Arg 2 is <test >

with my change:

testexe -echoargs ' "this is" "a test" '      
Arg 0 is < "this is" "a test" >

This PR represents a breaking change

PR Context

see issue 1995

The implementation is somewhat straightforward, it inspects the string argument looking for " and prepends the native escape \ character to the ". If there is already a \ in front of the " no change is made.

PR Checklist

@ghost ghost assigned iSazonov Aug 19, 2020
@Artoria2e5
Copy link

Artoria2e5 commented Aug 19, 2020

It would be nice if the nullOrEmpty check on the arg is removed too. Almost every platform has a concept of an empty string appearing in argv.

The whole only-quote-when-unquoted still feels wonky to reason about, but ehhhh the test cases are there. (FWIW, I would really prefer forgoing the wonky code for "need quote" altogether.)

@TSlivede
Copy link

TSlivede commented Aug 19, 2020

First of all, thanks for working on this long overdue issue. And thanks for showing the (incomplete?) code in a work-in-progress pull request.

If there is already a \ in front of the " no change is made.

I think this is a very dangerous half-hearted measure. I'd honestly prefer to not change the current behavior, than to further modify the "guess what the user wanted and try to do something maybe that could be right" method.

Implementing it this way would still not allow to pass arbitrary sting contents to executables. But instead of knowing that I have to correctly escape "-chars, I would than have to write an even more complex regex replace construct to do the right thing.
<Edit> I thought about it, the regex doesn't get more complicated, just different. </Edit>

This would mean changing from one inconsistent behavior to another more complicated inconsistent behavior...

Another disadvantage: The current behavior is at least relatively obvious. It is at least somewhat likely that a user would test if a string containing a quote is passed correctly. The proposed behavior would would mean, that people would only notice the problem if they explicitly test if a backslash followed by a quote is passed correctly...


Note: The correct way to escape N backslashes followed by " is 2N+1 backslashes followed by ". (\"\\\", \\"\\\\\",...)

@Artoria2e5
Copy link

Looks like this is triggering the same failures as in #13099. The Pester stuff really is a goldmine for strange hacks relating to this.

@TSlivede
Copy link

The whole only-quote-when-unquoted still feels wonky to reason about

Yes, I would again strongly recommend, to not do these strange tests - that always means, that one can not pass arbitrary content as an argument to an executable. Please always treat any quotes within an argument as literal quotes that need to be escaped.

@SteveL-MSFT SteveL-MSFT added Breaking-Change breaking change that may affect users CL-BreakingChange Indicates that a PR should be marked as a breaking change in the Change Log and removed Breaking-Change breaking change that may affect users labels Aug 19, 2020
@SteveL-MSFT
Copy link
Member

@TSlivede the intent of not modifying an already escaped quote \" is to ensure that users already using that workaround to get quote passed aren't suddenly broken

@Artoria2e5
Copy link

Well, since this is already a breaking change, I think ensuring this sort of thing is unnecessary.

@TSlivede
Copy link

TSlivede commented Aug 20, 2020

@TSlivede the intent of not modifying an already escaped quote \" is to ensure that users already using that workaround to get quote passed aren't suddenly broken

Yes, I already guessed that - but doing it this way does not solve the problem in general. It just makes the problem harder to notice. This is not a solution - this is hiding the problem.

I am fully aware that many people would be happy, that quotes would then "just work". So there would be less complaints to powershell. But it does not change the fact, that argument passing to external executables would be still fundamentally flawed.

@mklement0
Copy link
Contributor

mklement0 commented Aug 20, 2020

I wholeheartedly agree, @TSlivede and @Artoria2e5.

The only worthwhile fix is one that fixes the problem once and for all, even though that means breaking existing workarounds.

To recap, that means:

  • On Unix, switch to ProcessStartInfo.ArgumentList: nothing else is required.

  • On Windows, switch to ProcessStartInfo.ArgumentList by default, but also make accommodations for batch files and high-profile msiexec-style CLIs, as detailed in Arguments for external executables aren't correctly escaped #1995 (comment)

    • These accommodations (as well as the ProcessStartInfo.ArgumentList fix in effect) are implemented in the ie function (if you want to test it standalone, execute $script:needQuotingWorkaround = $true first; otherwise, use Install-Module Native), but note that the implementation is complicated both by needing to work around the current quoting problems and wanting to remain PSv3+-compatible - neither aspect applies to a direct fix in the engine.

    • As for --%: it's probably fine to leave as-is, despite its limitations, given that the only remaining, hopefully exceedingly rare use for it would be for "rogue" CLIs on Windows that aren't handled by the accommodations; also, a future ins / Invoke-NativeShell could provide a less limited experience that makes embedding PowerShell values easier (e.g.,
      ins "someRogueExe.exe `"a&b`" $((Get-Date).Year)".

      • On a probably academic note: If we did want to fix --%, it should only be supported on Windows and only support the form --% <single-string>; in other words: --% should only be special as the first argument and only accept a single - potentially expandable - string, that fully expresses the pass-through command line (and is not subject to any further interpretation - no expansion of %...% variables).

@TSlivede
Copy link

Well, since this is already a breaking change, I think ensuring this sort of thing is unnecessary.

I agree. Especially as I assume, that there are more cases that would brake anyway and only a few cases that would additionally break if this would be done correctly.

I assume, that there are more calls of the form msiexec …… 'property="value with space"' (calls where quotes in the argument are passed through as delimiting quotes) than there are calls with manually backslash escaped quotes. And the former will definitely break, there is no way around that.

@TSlivede
Copy link

TSlivede commented Aug 20, 2020

  • On Windows, switch to ProcessStartInfo.ArgumentList by default, but also make accommodations for batch files and high-profile msiexec-style CLIs, as detailed in #1995 (comment)

I have to admit, that that I'm really not a fan of checking which executable is called and then changing the quoting. That is absolutely horrible to debug if someone has a problem.

If there must be accommodations for msiexec style CLIs, then arguments should simply always be quoted as prop="some value" if they match that form. It is no problem to always to that, because the arguments still work perfectly fine for any executable, that follows the common rules. (Example implementation here, @SteveL-MSFT Regarding the powershellgallery module, that you created from that code: Could you either update the code or transfer the ownership to me?)


And I'm not even sure, that we should implement such accommodations: It would be much much better if starting with the next update, Microsoft would make their executables accept arguments wich are quoted from beginning to end (e.g. "a=b c"). That can be done completely backwards compatible.

If powershell implements special handling for these parameters, then these programs will probably never be fixed and any other cross platform tools will forever be unable to properly call these programs...

@mklement0
Copy link
Contributor

mklement0 commented Aug 20, 2020

I have to admit, that that I'm really not a fan of checking which executable is called and then changing the quoting.

Neither am I, but we would do it as a courtesy to shield users from the underlying command-line anarchy, which seems well worth doing given the high profile of msiexec, msdeploy, cmdkey, and possibly others.

And, of course, in order to accommodate batch files - which as CLI entry points are probably often used without the user realizing that a batch file is involved - there's no other option than to check whether the (resolved) executable path ends in .cmd or .bat.

Also note that the prop="foo bar" msiexec-style accommodation is benign should it be applied to a CLI that doesn't actually need it - conventional CLIs treat "prop=foo bar" and prop="foo bar" as equivalent.

That is absolutely horrible to debug if someone has a problem.

(a) The accommodations are easy to conceptualize and document - for which the proposed and sorely needed about_Native_Calls topic is the right place.

(b) We should ship something akin to echoArgs.exe or dbea / Debug-NativeExecutable (which also supports batch files) as part of PowerShell, again to be prominently documented in about_Native_Calls.

If there must be accommodations for msiexec style CLIs, then arguments should simply always be quoted as prop="some value" if they match that form

That would mean engaging with and indulging the command-line anarchy, whereas we want to abstract it away - for the remaining cases, if any, where that doesn't work (to be diagnosed with dbea), there's --%.

PowerShell is a shell, which means that however I pass an argument that I want the target executable to see as verbatim foo=bar baz should be fine, as long as I satisfy PowerShell's syntax requirements.

E.g., $propValue = 'bar baz'; msiexec ... foo=$propValue should just work, without you having to worry about non-conventional quoting requirements enforced by a rogue CLI - would you want to think about what the value of $propValue could be and whether it may require quoting? (On a purely technical level, I'm not even sure that the parameter binder preserves the original quoting in a way that makes it still available to the command-line construction code - but, the more important point is: we should not do it).

It would be much much better if starting with the next update, Microsoft would make their executables accept arguments wich are quoted from beginning to end (e.g. "a=b c"). That can be done completely backwards compatible.

Absolutely - these CLIs should have been fixed a long time ago, incidentally not just with respect to prop="foo bar", but also with respect to currently not accepting \"-escaped " chars.

By contrast, the accommodations for batch files can never go away (without massively breaking backward compatibility).

But even fixing the CLIs will not help older systems that may, however, install a fixed PowerShell version.

If powershell implements special handling for these parameters, then these programs will probably never be fixed and any other cross platform tools will forever be unable to properly call these programs...

Again - I don't see how any of these accommodations could break anything - do tell us if you know of any cases.

(Hypothetically, breakage is possible in the following scenario, but I do not think it is a real-world concern: A prop=<value> / /opt:<value> / -opt:<value> is being passed and embedded " are used and the target executable doesn't also recognize these as ""-escaped; most CLIs on Windows do support "", usually in addition to \", and the only exceptions I'm aware of are some shell/scripting hybrids that accept pieces of scripts as arguments, such as PowerShell's own CLI and Ruby and Perl; it's easy to exclude PowerShell CLI calls from the accommodation, and I think it's also worth doing for Ruby and Perl.)

@TSlivede
Copy link

TSlivede commented Aug 20, 2020

Details

Maybe I didn't express clear enough, what I wanted to say.
(english is not my native language, sorry)

And, of course, in order to accommodate batch files - which as CLI entry points are probably often used without the user realizing that a batch file is involved - there's no other option than to check whether the (resolved) executable path ends in .cmd or .bat.

Ok, for batch files I can accept that - its easily documentable and its not some magic list of executables.

Also note that the prop="foo bar" msiexec-style accommodation is benign should it be applied to a CLI that doesn't actually need it - conventional CLIs treat "prop=foo bar" and prop="foo bar" as equivalent.

Yes, that was my main point - just always quote a string of the form prop=value with space as prop="value with space". This way it does not depend on the executable name and it will still always work for executables that follow the common rules.

That is absolutely horrible to debug if someone has a problem.

(a) The accommodations are easy to conceptualize and document - for which the proposed and sorely needed about_Native_Calls topic is the right place.

I agree, that such a help topic is absolutely necessary. But adding an additional section to it, that explains that some executables are treated differently doesn't exactly make the topic easier to understand...

(b) We should ship something akin to echoArgs.exe or dbea / Debug-NativeExecutable (which also supports batch files) as part of PowerShell, again to be prominently documented in about_Native_Calls.

Agreed. But such a echoArgs.exe would be much less useful if not all .exe files are treated the same.

If there must be accommodations for msiexec style CLIs, then arguments should simply always be quoted as prop="some value" if they match that form

That would mean engaging with and indulging the command-line anarchy, whereas we want to abstract it away - for the remaining cases, if any, where that doesn't work (to be diagnosed with dbea), there's --%.

PowerShell is a shell, which means that however I pass an argument that I want the target executable to see as verbatim foo=bar baz should be fine, as long as I satisfy PowerShell's syntax requirements.

E.g., $propValue = 'bar baz'; msiexec ... foo=$propValue should just work, without you having to worry about non-conventional quoting requirements enforced by a rogue CLI - would you want to think about what the value of $propValue could be and whether it may require quoting?

No, of course msiexec ... foo=$propValue should always be equivalent to msiexec ... foo="$propValue". (It's bad enough, that echoargs.exe $val and echoargs.exe "$val" already aren't equivalent, but I guess fixing that is a lost cause. (Behaves different if $val is an array...))

(On a purely technical level, I'm not even sure that the parameter binder preserves the original quoting in a way that makes it still available to the command-line construction code - but, the more important point is: we should not do it).

No, we should absolutely not do that! That's not what I meant.
I have thought about that in the past, and I think I mentioned that suggestion in some comment somewhere. But after thinking about it some more (and reading some of your comments), I definitely don't think that's a good idea.

What I meant was: Always construct the lpCommandLine with arguments quoted in a way that is understood by msiexec, etc. but also (and this is of course the most important aspect) always parsed correctly by executables that follow the common rules.

If I write in powershell

echoArgs.exe my"arg=test va"lue

it should result in the lpCommandLine

echoArgs.exe myarg="test value"

If powershell implements special handling for these parameters, then these programs will probably never be fixed and any other cross platform tools will forever be unable to properly call these programs...

Again - I don't see how any of these accommodations could break anything - do tell us if you know of any cases.

No, that can't break anything. My point is: If powershell doesn't implement any accommodations, microsoft would have more pressure to finally fix msiexec.

@TSlivede

This comment has been minimized.

@mklement0
Copy link
Contributor

Thanks, @TSlivede - I don't think there's any disagreement in substance here.

I am not advocating for a hard-coded list of "rogue" CLIs (I had considered that earlier, but decided against it), I am indeed recommending that any verbatim argument seen by PowerShell on Windows that looks like foo=bar baz or /foo:bar baz / -foo:bar baz be passed to an external executable as foo="bar baz" or /foo:"bar baz" / -foo:"bar baz" on the process command line.

Of course, this is an accommodation, but a highly beneficial one.

As for the other, closely related accommodation I proposed, namely to switch to ""-escaping in the presence of such arguments:

That CommandLineToArgvW doesn't support "" is a good reason not to do that, so I agree with you that we should not attempt this, which means that msiexec / msdeploy calls with embedded " chars. (e.g.,
msiexec ... PROP='bar "n" baz') will fail, unless they use --%, as you state - to be documented in about_Native_Calls.

But such a echoArgs.exe would be much less useful if not all .exe files are treated the same.

Given the above, the only scenarios in which echoArgs.exe then wouldn't tell you the truth is when you call batch files.

Therefore, we either need a separate echoArgs.cmd or dbea, whose -UseBatchFile switch uses a batch-file behind the scenes.

@Artoria2e5
Copy link

Artoria2e5 commented Aug 20, 2020

What I meant was: Always construct the lpCommandLine with arguments quoted in a way that is understood by msiexec, etc. but also (and this is of course the most important aspect) always parsed correctly by executables that follow the common rules.

If I write in powershell

echoArgs.exe my"arg=test va"lue

it should result in the lpCommandLine

echoArgs.exe myarg="test value"

This can be trivially done by just escaping the two pieces separately. My PR has a TODO on that while I decide on exactly which separators I am going to need to accommodate: anything more than = and :?

@mklement0

This comment has been minimized.

@TSlivede
Copy link

In my implementation I also use : and = as seperators, but I allowed : as a seperator only for keys starting with / or -. Reason: I don't want paths quoted as C:"\Program Files" - although that works for almost all programs, it looks stupid. Alternatively one could explicitly exclude single letter keys with : as seperator...

@mklement0
Copy link
Contributor

Good point about paths, @TSlivede (as an aside: very nice, concise implementation; \d isn't necessary, it's covered by \w), I actually prevent that in ie in a similar fashion, which still covers msiexec / msdeploy and cmdkey.

The question is: does it cover all high-profile cases?

We should strive for a pattern that is easy to conceptualize and document (yours qualifies), so I suggest we avoid exceptions for single-letter keys, if possible.

@mklement0
Copy link
Contributor

mklement0 commented Aug 21, 2020

Below is a proof-of-concept written in PowerShell that implements all accommodations:

Note:

  • Code applies to Windows only; on Unix, ProcessStartInfo.ArgumentList is sufficient.

  • On Windows, the code can be bypassed in favor of ProcessStartInfo.ArgumentList if neither (a) a batch file (or cmd.exe directly) is being called and (b) none of the arguments match regex '^([/-]\w+[=:]|\w+=)(.*? .*)$', which is the previously discussed pattern for msiexec-style arguments.

Running the code, which includes Pester tests, yields the following, which shows how the verbatim input arguments must be transformed to form part of the command-line string that is passed to ProcessStartInfo.Arguments (the ExpectedIfDifferent column would only be filled in if the quoting didn't work as expected):

--- NOT calling a batch file:

Verbatim             Quoted                   ExpectedIfDifferent
--------             ------                   -------------------
                     ""
foo=bar none         foo="bar none"
/foo:bar none        /foo:"bar none"
-foo:bar none        -foo:"bar none"
foo=bar "quote" none foo="bar \"quote\" none"
foo:not me           "foo:not me"
foo=notme            foo=notme
notmeeither          notmeeither
c:\program files\    "c:\program files\\"
a "b c" d            "a \"b c\" d"
a \"b c\" d          "a \\\"b c\\\" d"
3" of snow           "3\" of snow"
ab\                  ab\
a b\\                "a b\\\\"
a&b                  a&b
a"b                  a\"b

--- Calling a batch file:

Verbatim             Quoted                   ExpectedIfDifferent
--------             ------                   -------------------
                     ""
foo=bar none         foo="bar none"
/foo:bar none        /foo:"bar none"
-foo:bar none        -foo:"bar none"
foo=bar "quote" none foo="bar ""quote"" none"
foo:not me           "foo:not me"
foo=notme            foo=notme
notmeeither          notmeeither
c:\program files\    "c:\program files\\"
a "b c" d            "a ""b c"" d"
a \"b c\" d          "a \\""b c\\"" d"
3" of snow           "3"" of snow"
ab\                  ab\
a b\\                "a b\\\\"
a&b                  "a&b"
a"b                  "a""b"

Code:

# WINDOWS ONLY

# Sample arguments, as seen verbatim by PowerShell before constructing
# the command line.
# Note: The first, empty line represents an empty argument ('' / "").
$sampleArgs = @'

foo=bar none
/foo:bar none
-foo:bar none
foo=bar "quote" none
foo:not me
foo=notme
notmeeither
c:\program files\
a "b c" d
a \"b c\" d
3" of snow
ab\
a b\\
a&b
a"b
'@ -split '\r?\n'

# Non-batch files: expected quoted forms:
# * \"-escaping
# * double-quoting only for space-less arguments.
# * msiexec-style partial argument double-quoting
$expectedQuoted = @'
""
foo="bar none"
/foo:"bar none"
-foo:"bar none"
foo="bar \"quote\" none"
"foo:not me"
foo=notme
notmeeither
"c:\program files\\"
"a \"b c\" d"
"a \\\"b c\\\" d"
"3\" of snow"
ab\
"a b\\\\"
a&b
a\"b
'@ -split '\r?\n'

# Batch files: expected quoted forms:
# * ""-escaping instead of \"-escaping
# * double-qouting also for space-less args with cmd.exe metachars.
# * msiexec-style partial argument double-quoting
$expectedQuotedForBatchFile = @'
""
foo="bar none"
/foo:"bar none"
-foo:"bar none"
foo="bar ""quote"" none"
"foo:not me"
foo=notme
notmeeither
"c:\program files\\"
"a ""b c"" d"
"a \\""b c\\"" d"
"3"" of snow"
ab\
"a b\\\\"
"a&b"
"a""b"
'@ -split '\r?\n'

$false, $true | ForEach-Object {

  $callingABatchFile = $_

  "--- $($callingABatchFile ? 'C' : 'NOT c')alling a batch file: " | Write-Host -ForegroundColor Green

  # Quote the arguments for use as part of the raw command line.
  $quotedArgs = $sampleArgs | ForEach-Object {

    if ('' -eq $_) { return '""' } # Empty argument, always represented as ""

    # Analyze the argument.
    $mustDQuote = $_.Contains(' ') -or ($callingABatchFile -and $_ -match '["&|<>^,;]')
    $mustPartiallyDQuote = $mustDQuote -and $_ -match '^([/-]\w+[=:]|\w+=)(.+)$'
    $escapedDQuote = $callingABatchFile ? '""' : '\"'

    if ($mustPartiallyDQuote) { 
      $prefix, $quotedArg = $Matches[1], $Matches[2]
    }
    else {
      $prefix = ''
      $quotedArg = $_
    }

    # Escape any embedded " first, including doubling \ instances before them.
    $quotedArg = $quotedArg -replace '\\+(?=")', '$&$&' -replace '"', $escapedDQuote

    # If double-quoting must be used, trailing '\'s must be doubled.
    if ($mustDQuote) { $quotedArg = $quotedArg -replace '\\+$', '$&$&' }

    # Apply double-quoting, if necessary, and output:
    $prefix + ($mustDQuote ? ('"{0}"' -f $quotedArg) : $quotedArg)

  }

  $expected = ($callingABatchFile ? $expectedQuotedForBatchFile : $expectedQuoted)

  # Output to the host to see the transformations.
  0..($sampleArgs.Count-1) | ForEach-Object {
    [pscustomobject] @{ Verbatim = $sampleArgs[$_]; Quoted = $quotedArgs[$_]; 'ExpectedIfDifferent' = if ($quotedArgs[$_] -ne $expected[$_]) { $expected[$_]} }
  } | Out-Host

  # Run the Pester test.
  try {
    $quotedArgs | Should -BeExactly $expected
  } catch { 
    $_ | Write-Error # Write error and continue.
  }

}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty (but non-null, of course) arguments are totally legitimate for command lines, so IMO there's no need of filtering them out.

@ghost ghost added the Review - Needed The PR is being reviewed label Aug 30, 2020
@ghost
Copy link

ghost commented Aug 30, 2020

This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author

@JamesWTruher JamesWTruher force-pushed the NativeCommandQuoteEscape branch from 35c18dd to 723f53f Compare October 15, 2020 21:02
@JamesWTruher
Copy link
Collaborator Author

i'm going to close this and take a different tack. Some things have changed underneath this which need to be addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CL-BreakingChange Indicates that a PR should be marked as a breaking change in the Change Log Review - Needed The PR is being reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants