@SteveL-MSFT
Member

@SteveL-MSFT SteveL-MSFT commented Jun 7, 2017

Fix #1103

On Linux, when the OS encounters a shebang (the first two bytes of the file are #!), it looks up the interpreter to pass the script path to. In this case, if the interpreter is powershell, the OS calls powershell with the path to the script. If the script doesn't have a .ps1 extension, powershell treats it as a native command and executes it as such. The OS sees the shebang and again calls powershell with the script. This causes a hang (and will eventually consume all OS resources as it keeps spawning new powershell processes).

The fix is to inspect the file to see if it starts with the shebang magic number. If it does, we check whether the running
powershell is the interpreter it names. If it is, we treat the file the same as a .ps1 script. Otherwise, we execute it as
a native command and expect the OS to find the correct interpreter (which could be a different version of powershell, for example).
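
As a rough illustration of the magic-number check described above (this is not the C# code in the PR), here is a minimal POSIX shell sketch that reads only the first two bytes of a file. The function name `has_shebang` is hypothetical and chosen only for this example.

```shell
# Succeed (exit status 0) if the file's first two bytes are "#!".
# Only two bytes are read, so large files are not scanned.
has_shebang() {
    # dd copies the first two bytes; od renders them as hex (23 21 for "#!").
    [ "$(dd if="$1" bs=2 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')" = 2321 ]
}
```

Comparing the hex form rather than the raw bytes avoids problems with binary files whose first bytes are not printable text.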

@mklement0
Contributor

I suggest taking a step back, in the context of our discussion to align PowerShell's CLI with that of POSIX-like shells (#3743):

POSIX-like shells interpret an operand (in POSIX speak: an argument other than an option / option-argument) as a script file to execute rather than an arbitrary shell command.

Translated into PowerShell terms that means:

  • A positional argument should bind to -File, not -Command.

If this change were made, no extra accommodations would have to be made for the shebang line - except that -File <file-without-ps1-extension> doesn't currently work:

Processing -File '...' failed because the file does not have a '.ps1' extension.

@SteveL-MSFT SteveL-MSFT added the Review - Committee The PR/Issue needs a review from the PowerShell Committee label Jun 8, 2017
@SteveL-MSFT
Member Author

SteveL-MSFT commented Jun 8, 2017

@mklement0 that would certainly be a breaking change, although it would align semantics with POSIX (but break many of our tests that expect the positional parameter to be a command). We'll discuss the implications at the next committee meeting (next week).

…ithout a ps1 extension, powershell treats it as a native command so this ends up in a recursive loop. fix is to inspect the command to see if it is a shebang script and one we should handle. if so, just treat it like a ps1 script.
Collaborator

@iSazonov iSazonov left a comment


bool isShebangScript = false;
try
{
using(FileStream fileStream = new FileStream(path, FileMode.Open, FileAccess.Read))
Collaborator

There's a lot of code! Can we just read the first line (with length < Path.Max) and check for the shebang with Regex.IsMatch()? That also seems likely to give us less trouble with a BOM.

Member Author

I wanted to limit how much gets read (first two bytes) since every file will get analyzed

Collaborator

In either case, the entire file will be read by the OS for native processing or by PowerShell for its own processing. The only question is the size of this local buffer. I guess 64K is not a problem.

using(StreamReader file = new StreamReader(fileStream))
{
string interpreter = file.ReadLine();
System.Reflection.Assembly assembly = System.Reflection.Assembly.GetEntryAssembly();
Collaborator

Maybe we should have an internal flag isInterpreter?

Member Author

We should certainly cache this if using shebang with PowerShell becomes popular. I have to explicitly check that the interpreter is exactly the same as the running PowerShell; otherwise I let the OS handle it, since we want to support scripts that target specific versions of PowerShell instead of just the system default one. Actually, this reminds me that the current code won't work with

#!/usr/bin/env powershell
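
To make the `env` case concrete, here is a hedged POSIX shell sketch (not the PR's C# logic) that extracts the interpreter name from a shebang line and looks through `/usr/bin/env NAME` to `NAME`. The function name `shebang_interpreter` is hypothetical, and the sketch assumes the simple `env NAME` form only; `env` options such as `-S` are not handled.

```shell
# Print the basename of the interpreter named on a file's shebang line.
# "#!/usr/bin/env NAME" is reported as NAME. Fails if there is no shebang.
# The body runs in a subshell so that "set -f" does not leak to the caller.
shebang_interpreter() (
    # Read the first line; tolerate a missing trailing newline.
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) ;;
        *) return 1 ;;
    esac
    set -f                          # no globbing while splitting into words
    set -- ${first_line#'#!'}       # $1 = interpreter path, $2... = arguments
    [ "$#" -gt 0 ] || return 1
    interp=${1##*/}
    if [ "$interp" = env ] && [ "$#" -ge 2 ]; then
        interp=${2##*/}
    fi
    printf '%s\n' "$interp"
)
```

Note that this compares names only. Deciding whether the named interpreter is the same binary as the running process, as the comment above requires, would additionally need path resolution.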

Collaborator

I opened PR #3969 to cache the reflection - we can use Utils.GetApplicationBaseDefaultPowerShell()

Collaborator

Interesting tests

  1. interpreter can not be supported on IOS
  2. First line effective length is 512 on IOS and 127 on Linux.

@SteveL-MSFT
Member Author

SteveL-MSFT commented Jun 8, 2017

The more I think about this, the more I agree with @mklement0. There are really only two approaches:

  1. We keep command as the positional parameter and only support scripts with a .ps1 extension; we would still need some code to avoid the recursion and return an error.
  2. We make file the positional parameter, which is a breaking change, but we treat any given file as a script. This simplifies shebang handling: the line goes back to being just a comment, there is no need to parse it, and we let the OS resolve the interpreter.

@iSazonov
Collaborator

iSazonov commented Jun 8, 2017

Can we use a shebang like:
#!/usr/bin/powershell -f

@lzybkr
Contributor

lzybkr commented Jun 8, 2017

I'm not sure this is the right fix.

First, the difference between -File and -Command is significant and important here in terms of setting the exit code.

Second, I don't think we want shebang PowerShell scripts to run in process, which this change would allow.

I believe the correct fix will involve detecting that we're invoked via shebang, which might mean looking at arg[0] in the console host entry point - it probably isn't powershell.

@SteveL-MSFT
Member Author

@iSazonov #!/usr/bin/powershell -f is possible, but only if the script has the .ps1 extension

@mklement0
Contributor

mklement0 commented Jun 9, 2017

@lzybkr:

Fundamentally: Is aligning with the CLI of POSIX-like shells a goal in principle?

If so:

  • breaking changes are inevitable.

  • the awkward emulation of the shebang-line mechanism considered here isn't needed.

  • supporting scripts that do not have extension (suffix) .ps1 is a must, given that extensions have no intrinsic meaning on Unix platforms and the executability of a file is solely governed by its permission bits relating to its ownership relative to the invoking user.

POSIX-like shells have no analog to the -File option, but they consistently interpret their 1st operand (POSIX speak for: positional argument, as distinct from an option or option-argument) as:

  • the path to a script file to execute
  • in a non-interactive shell (session)
  • that exits when the script exits
  • with the script's exit code getting passed through.

(Conversely, to pass an arbitrary shell command line, you must use the -c option.)

The shebang mechanism - built into the kernel of Unix platforms - relies on that very behavior:

The path to the script file (exactly as specified when the script was directly invoked) is passed as an (operand) argument to the interpreter executable specified by full name in the shebang line, [edit] followed by the arguments given to the script on invocation.
(Platforms differ in how many fixed arguments they support as part of the shebang line; only one is guaranteed to work portably.)

In terms of the command line that is invoked by the kernel, and the argv[0] ($0, in POSIX shell speak) reported in the resulting process:

  • Unless explicitly changed, $0 reflects the path to the script file exactly as invoked.

    • argv[0] / $0 can be set to arbitrary values (see below).
  • ps -o args= $$ tells you the command line as invoked ($$ is the current PID in POSIX-like shells; I think what that ps command reports can't be tampered with, but I'm not sure; in any event, necessary shell quoting around the original arguments will not be reflected).

A simple example: Let's assume script t, placed in the current directory and made executable with chmod +x t:

#!/bin/sh

echo "\$0: [$0]"
echo "Originating command line: [$(ps -o args= $$)]"

Let's invoke that script directly, letting the kernel interpret the shebang line:

$ ./t one two
$0: [./t]
Originating command line: [/bin/sh ./t one two]

Now let's see how we can change argv[0] / $0 (and also demonstrate how the command line that ps reports doesn't reflect the original shell quoting):

$ sh -c '. ./t' 'foo bar'
$0: [foo bar]
Originating command line: [sh -c . ./t foo bar]

Note when a POSIX-like shell is explicitly invoked with a script filename operand - e.g., sh ./t - it:

  • does not care whether the script file is executable - all that's needed is the ability to read the file.
  • ignores the shebang line (which may or may not specify the shell the script is being passed to), because it is a comment from the shell's perspective.
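
The two points above can be checked with a small sketch. It assumes `mktemp -d` is available (common, but not specified by POSIX), and the function name `demo_explicit_invocation` and the nonexistent interpreter path are made up for the demonstration.

```shell
# Demo: a script named explicitly as an operand needs only read permission,
# and its shebang line is just a comment to the shell that reads it.
demo_explicit_invocation() {
    dir=$(mktemp -d) || return 1
    # The shebang deliberately names an interpreter that does not exist.
    printf '%s\n' '#!/nonexistent/interpreter' 'echo "ran under sh"' > "$dir/t"
    chmod 644 "$dir/t"              # readable, but not executable
    # Direct invocation is refused because no execute bit is set.
    "$dir/t" 2>/dev/null || echo "direct invocation failed"
    # Explicit invocation works: sh reads the file and skips the "#!" comment.
    sh "$dir/t"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling `demo_explicit_invocation` prints `direct invocation failed` followed by `ran under sh`.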

I don't think we want shebang PowerShell scripts to run in process

It would run in-process in the powershell process, and that's exactly what's expected (and what happens when you run sh ./t, for instance).
In-process considerations apply only when you have a parent session from which you launch other commands, but here there's only 1 session, whose sole purpose is to run the script and - at least by default - immediately exit afterwards.

@lzybkr
Contributor

lzybkr commented Jun 9, 2017

@mklement0 - if you run ./t and sh ./t, you are starting a new process in both cases.

With the proposed change, running ./t from bash would start a new PowerShell process, but it would not when running the same command line from PowerShell.

This should be an intentional design choice, not an accidental side effect of fixing the recursive hang. My inclination is that starting a new process is desirable (so perhaps more POSIX-like), but I'm open to a discussion, especially given PowerShell's slow startup cost.

As for -File - I mentioned that only in regards to semantics - we want shebang to behave as though -File was specified, in part so the exit code is set correctly.

And as for $0, I was hinting at what I thought it would take to implement what I propose above - that we have a new code path that behaves as though you ran powershell -File foo when argv[0] == 'foo'.

@mklement0
Contributor

mklement0 commented Jun 9, 2017

but it would not when running the same command line from PowerShell.

Again, if we're talking about aligning with POSIX-like shells:
It certainly should run in a child process, because that invocation should be treated as the invocation of an external utility (delegated to a system call such as execv on Unix).

I get that this is not how it currently works with invoking *.ps1 files, which do run in the same process, but perhaps the extension / lack thereof could be the distinguishing factor:

  • Continue to run *.ps1 files in-process from within a PowerShell session, for backward compatibility.

    • Users from a Unix background would certainly need to be made aware of that fact, because their expectation will be that any script not explicitly invoked with . runs in a child process.
  • Treat files without that extension like any other external utility - invoke it via the platform's system calls (you don't even need to know / care what they are and whether their shebang line happens to also target PowerShell).

    • If, alternatively, in a given scenario, the explicit intent is to execute a file in-process, dot-sourcing is an option (which already works with files that do not have a .ps1 extension).

What interpreter ultimately processes a given executable file that happens to have a shebang line should really be considered an implementation detail, and there's no reason to reflect that interpreter in the filename extension (even though, sadly, there's a widespread bad habit in the Unix world of naming shell scripts *.sh).

@iSazonov
Collaborator

iSazonov commented Jun 9, 2017

What is the behavior for other interpreters (bash, perl, python, etc.)? Run a native script in-process?

@SteveL-MSFT
Member Author

perl follows the POSIX convention of having the positional parameter point to a script and using an explicit -c for a command
bash, of course, does the same
python accepts the first positional as the script and subsequent positional parameters passed as args to the script

We should strongly consider adopting the POSIX conventions for PowerShell (which means closing this PR):

  1. we've already seen that existing Linux tools that work with shells have expectations on parameter semantics and syntax and if we don't adopt them, it means there are some tools people use that aren't compatible with PowerShell
  2. if we stay different, this will impact our adoption on Linux
  3. since this is a breaking change, if we ever do this, we need to do this with the 6.0.0 release

@iSazonov
Collaborator

iSazonov commented Jun 9, 2017

My question was how they behave internally: if we type "bash-script-file-name" in Bash, does Bash run it "in-process"?

@mklement0
Contributor

mklement0 commented Jun 9, 2017

@SteveL-MSFT: An (interfaith) amen to that.

accepts the first positional as the script and subsequent positional parameters passed as args to the script

Interpreting all remaining arguments after the script filename as the arguments to pass through to the script is also part of "shebang-line compatibility", so all POSIX-like shells and Perl support it too; Ruby and Node.js are other examples.

The ability to use -c (per POSIX) for passing a command string (the contents of a script) is less universally supported: among the interpreters mentioned, Perl, Ruby, and Node.js require -e, unfortunately.

Combining -c with positional arguments is supported and required by POSIX, but involves a pitfall:

The first argument after the command string binds to $0 (the equivalent of $myInvocation.MyCommand.Name), not $1, which means that it isn't reflected in the $@ ($*) array of positional parameters.

$ sh -c 'printf "%s " "$@"' ARG0 one two
one two    # !! Note how ARG0 is not listed.

PowerShell does not support this, currently:

# !! BROKEN
PS> powershell -noprofile -command '$args' one two
Unexpected token 'one' in expression or statement.

# !! BROKEN
PS> powershell -noprofile -command '"`$args: $args"' one two
$args : The term '$args' is not recognized as the name of a cmdlet, function, script file, or operable program

It looks like PowerShell simply string-appends the additional arguments directly to the command string, which, of course, only works in limited cases, whereas -c in POSIX-like shells basically allows passing an entire script ad-hoc, with arguments passed as positional parameters, as to a script.

The only way to get this to work currently appears to be:

$ powershell -noprofile -command '& { "`$args: $args" }' one two
$args: one two

From within PowerShell:

  • This actually BREAKS:
# !! BROKEN when invoked from PowerShell
PS> powershell -noprofile -command '& { "`$args: $args" }' one two
$args: : The term '$args:' is not recognized as the name of a cmdlet, function, script file, or operable program.
  • Using an actual script block (not a string representation) breaks too, because arguments are apparently not supported:
# !! DOES NOT WORK - arguments not supported.
PS> powershell -noprofile -command { "`$args: $args" } one two
(prints CLI help text)

# Only works *without arguments*:
PS> powershell -noprofile -command { "`$args: [$args]" }
$args: []

Other problematic aspects of -Command are discussed in #3223.

@mklement0
Contributor

mklement0 commented Jun 9, 2017

@iSazonov:

if we type in Bash "bash-script-file-name" - do Bash run it "in-process"?

No: The only way to make a POSIX-like shell run a script in-process in an existing session is to use . (or its nonstandard, effective alias, source).

Conversely, anything that is neither a shell builtin (command) nor a shell-language statement is considered an external utility to be invoked via a system call, which makes it inevitably run in a child process, and it is irrelevant (a) whether the script invoked is indeed a script or a binary and (b) if the former, what shell/interpreter will process it.

Note, however, that this only works with executable shell scripts (whereas invocation with . doesn't require this, but will, of course, only work if the script "speaks the same language" as the invoking shell).

An otherwise well-formed script with shebang line that isn't executable [by the current user] will simply result in a Permission denied error.

If it is executable, a system call is made to invoke it, at which point the kernel's shebang-line processing will kick in.

As a courtesy, if a script is executable but lacks a (valid) shebang line, POSIX-like shells will fall back to processing the script themselves, but - again - in a separate instance in a child process.

On a side note: A UTF-8 BOM would break a shebang line.
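
The reason is that the kernel looks for #! in the first two bytes of the file, and a UTF-8 BOM (bytes EF BB BF) pushes those characters to offset 3. A minimal sketch for spotting such files, with the hypothetical function name `has_utf8_bom`:

```shell
# Succeed (exit status 0) if the file starts with the UTF-8 byte order mark.
has_utf8_bom() {
    # Read three bytes and compare their hex form against EF BB BF.
    [ "$(dd if="$1" bs=3 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')" = efbbbf ]
}
```

A script for which `has_utf8_bom` succeeds would need to be re-saved without the BOM before its shebang line can take effect.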

@mklement0
Contributor

mklement0 commented Jun 9, 2017

To more explicitly contrast the invoke-a-script-written-in-the-shell's-own-language scenarios:

  • In POSIX-like shells, invoking a script is an all-or-nothing choice:

    • Invoked with . (source), everything the script does potentially affects the current shell; (e.g., a non-local variable created in the script will linger).

    • Invoked by file path only, it will run in a child process that cannot affect the calling shell's environment at all.

  • In PowerShell, the distinctions are more nuanced:

    • Invoked with . (source), the same applies as in POSIX-like shells: everything the script does potentially affects the current session and its global state.

    • Invoked by file path only, the script still runs in-process, but in a child scope:

      • While that commendably localizes variables and function definitions,

      • it still allows for explicit and - pitfall alert - implicit modification of the session's global state - something that will catch Unix shell users by surprise; e.g.:

        • Simply changing the current location in the script (using Set-Location, the equivalent of cd) changes the location globally.
        • By contrast, changing a global variable, for instance, requires a more deliberate effort.

Personally, I think the implicit and invariably global location-changing behavior is problematic.
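
For comparison, the POSIX side of this contrast can be sketched as follows: a directory change made by a script run as a child process is discarded, while the same script sourced with . changes the current shell's directory. The sketch assumes `mktemp -d` is available; the names `demo_cd_isolation` and `go_root` are hypothetical.

```shell
# Demo: "cd" inside a script affects the caller only when the script is
# sourced. The body runs in a subshell so the demo leaves the caller alone.
demo_cd_isolation() (
    dir=$(mktemp -d) || exit 1
    printf 'cd /\n' > "$dir/go_root"
    cd "$dir" || exit 1
    before=$(pwd)

    sh ./go_root                    # child process: its cd is discarded
    [ "$(pwd)" = "$before" ] && echo "child process: directory unchanged"

    . ./go_root                     # sourced: cd happens in this shell
    [ "$(pwd)" = / ] && echo "sourced: directory is now /"

    rm -rf "$dir"
)
```

Calling `demo_cd_isolation` prints `child process: directory unchanged` and then `sourced: directory is now /`.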

@powercode
Collaborator

Another area is static members on classes.
They will also be shared implicitly.

Maybe we should have syntactic sugar for invoking scripts easily out-of-proc.
like

.\a.ps1
. .\a.ps1
! .\a.ps1 # -> powershell.exe -noprofile -file .\a.ps1

@powercode
Collaborator

Or change the defaults before PowerShell v6 ships, so that you explicitly have to opt in to run in-proc.

@mklement0
Contributor

mklement0 commented Jun 9, 2017

Another area is static members on classes.

Good point, and definitely worth documenting, though probably less of a problem in practice.

Maybe we should have syntactic sugar

Introducing another symbol-based operator may not be worth the effort (as an aside: I assume ! was just an example, but to be clear: it's already taken), especially given that, once the CLI has been harmonized with POSIX, all you'd need to do to run out-of-process is:

powershell ./foo.ps1

so that you explicitly have to opt in to run in-proc.

As long as it doesn't come with surprises, I think that running in-process by default is a great strength:

  • No startup cost (no child-process creation, no PowerShell startup cost)
  • Integration with the calling session with respect to input/output types (no need for serializing / deserializing objects to/from strings on in-/output).

@SteveL-MSFT SteveL-MSFT added Committee-Reviewed PS-Committee has reviewed this and made a decision and removed Review - Committee The PR/Issue needs a review from the PowerShell Committee labels Jun 14, 2017
@SteveL-MSFT
Member Author

@PowerShell/powershell-committee reviewed this and agreed that the right change is to have -File be the positional parameter instead of -Command, and to allow scripts used with -File (explicitly or implicitly) to not require a .ps1 extension. A new PR will be submitted.


Labels

Committee-Reviewed PS-Committee has reviewed this and made a decision

Development

Successfully merging this pull request may close these issues.

PowerShell Scripts require .ps1 extension

6 participants