
Parsing of compound command-line tokens into arguments is surprising #6467

@mklement0

Description

  • The primary purpose of this issue is to highlight how the existing behavior may be surprising to newcomers (and, even for seasoned users, hard to remember and therefore hard to predict in specific situations).

  • Given that the behavior has been around since the beginning, changing it is probably not an option, but documenting how the behavior may be surprising is worth doing; while some aspects of the behavior can be inferred from about_Parsing, the overall implications aren't clear.

  • While this may ultimately become a documentation-repo issue, I would first like to solicit some feedback as to what code changes, if any, should be considered, after all (there are edge cases), and as to whether my summary is accurate.

  • While there may be good technical reasons for these behaviors, the problem is that they may be unexpected, especially given the deviation from how cmd.exe and bash handle these scenarios.

  • All surprising behaviors can be avoided with proper quoting - which is also worth pointing out in the docs - but the need to quote negates one of the convenient aspects of argument syntax. If at the end of the day you cannot remember whether a particular token requires quoting, your choices are to resort to trial and error, or to quote everything, which is cumbersome.

Note: By compound command-line token I mean a logically single token composed of the following syntax constructs by direct concatenation:

  • barewords (unquoted strings without special meaning; e.g., foo)
  • quoted strings (e.g., 'bar', "baz")
  • variable references ($foo) and subexpressions ($(...))

Examples:

  • bar${foo} and ${foo}bar and $($foo)bar
  • foo"bar" and "bar"foo

The surprising aspects are:

Asymmetry: Whether a quoted string or subexpression comes first matters:

Placing the quoted string or subexpression first results in 2 arguments:

PS> Write-Output 'hi'there
hi
there

PS> Write-Output $($env:HOME)/Documents
/Users/jdoe
/Documents

Otherwise, a single argument is passed, as expected:

PS> Write-Output hi'there'
hithere

PS> Write-Output Documents:$($env:HOME)  # Note: just $env:HOME without $() is not affected.
Documents:/Users/jdoe
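
To check how a particular compound token is split, one option is a small helper that echoes each argument it receives. This is only a sketch: Show-Args is a hypothetical name, not an existing command, and it assumes that a simple function using the automatic $args variable receives the same argument boundaries as Write-Output does.

```powershell
# Hypothetical helper (not part of PowerShell): prints the number of
# arguments received, then each argument enclosed in brackets.
function Show-Args {
    "$($args.Count) argument(s) received:"
    foreach ($a in $args) { "  [$a]" }
}

# Compare the two orderings from the examples above.
Show-Args 'hi'there
Show-Args hi'there'
```

Enclosing each argument in brackets makes the boundaries visible even when an argument is empty or contains whitespace.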

. following a leading quoted string is interpreted as [string] property access

PS> Write-Output 'hi'.there
   # NO output, because an attempt is made to access property .there on the string instance

. as the initial char. followed by a quoted string results in 2 arguments

PS> Write-Output .'hi'
.
hi

- as the initial char. of a token that turns out not to be a parameter name results in 2 arguments if the token also contains : or .

See also: #6292 and #6360.

PS> Write-Output -foo:bar
-foo:
bar

PS> Write-Output -ip=10.0.0.1
-ip=10
.0.0.1

- as the initial char. causes embedded variable references not to be expanded

See also: #4624.

PS> Write-Output a$HOME
a/home/jdoe  # e.g., $HOME was expanded, as expected

PS> Write-Output -a$HOME
-a$HOME  # !! $HOME was NOT expanded

Note that the problem only surfaces if there's at least 1 literal char. following the - before the variable reference; for instance, in -$HOME the variable is expanded.
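
As mentioned at the top, quoting avoids all of these behaviors. The following sketch shows one way to quote each of the tokens from the examples above as a whole, so that each is passed as a single argument. Which quoting style to use is a judgment call: single quotes are used for literal tokens, double quotes where a variable should be expanded.

```powershell
# Each compound token is quoted as a whole.
Write-Output 'hithere'               # instead of: 'hi'there
Write-Output "$env:HOME/Documents"   # instead of: $($env:HOME)/Documents
Write-Output 'hi.there'              # instead of: 'hi'.there
Write-Output '.hi'                   # instead of: .'hi'
Write-Output '-foo:bar'              # instead of: -foo:bar
Write-Output '-ip=10.0.0.1'          # instead of: -ip=10.0.0.1
Write-Output "-a$HOME"               # instead of: -a$HOME (double quotes so $HOME is expanded)
```

A quoted token that starts with - is treated as an argument rather than as a parameter name, which is why the last three commands pass their token through as a value.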

Environment data

Written as of:

PowerShell Core v6.0.2
