Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 0 additions & 7 deletions src/System.Management.Automation/engine/lang/parserutils.cs
Original file line number Diff line number Diff line change
Expand Up @@ -723,13 +723,6 @@ private static object SplitWithPattern(ExecutionContext context, IScriptExtent e
}
}

if ((options & (SplitOptions.Multiline | SplitOptions.Singleline)) ==
(SplitOptions.Multiline | SplitOptions.Singleline))
{
throw InterpreterError.NewInterpreterException(null, typeof(ParseException),
errorPosition, "InvalidSplitOptionCombination", ParserStrings.InvalidSplitOptionCombination);
}

if ((options & SplitOptions.SimpleMatch) != 0)
{
separatorPattern = Regex.Escape(separatorPattern);
Expand Down
242 changes: 242 additions & 0 deletions test/powershell/Language/Operators/SplitOperator.Tests.ps1
Original file line number Diff line number Diff line change
@@ -0,0 +1,242 @@
Describe "Split Operator" -Tags CI {
Context "Binary split operator" {
It "Binary split operator can split array of value" {
$res = "a b", "c d" -split " "
$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}

It "Binary split operator can split a string" {
$res = "a b c d" -split " "
$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}

It "Binary split operator can works with max substring limit" {
$res = "a b c d" -split " ", 2
$res.count | Should Be 2
$res[0] | Should Be "a"
$res[1] | Should Be "b c d"

$res = "a b c d" -split " ", 0
$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"

$res = "a b c d" -split " ", -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a valid scenario, or should PowerShell report an error when the count is less than 1?

Copy link
Collaborator Author

@iSazonov iSazonov Sep 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get this from our documentation - see <Max-substrings>.
So it will be a breaking change if we add a exception here. I don't see the need for that.

@mklement0 Could you please comment too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov: Yes, the documentation states "A value of 0 and negative values return all the substrings".

Supporting 0 definitely makes sense, because in order to be able to specify options you must also specify a <max-substrings> value for syntactical reasons, so you need a way to signal the default of "return all".

Negative values serve no real purpose - and could lead people to mistakenly assume that they have special meaning - but given that they're documented and that we could break existing code, I don't think a change is warranted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mklement0 Thanks for good comment!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an argument in favor of a breaking change - is a negative count a likely bug? For example, if I compute the count and swapped my operands, this feels like a bug PowerShell should catch.

I do think it's an unlikely breaking change, mostly because the count it most commonly a constant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like:

$cnt = $b - $a # Should have been $a - $b
'a b c d' -split ' ', $cnt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks. Preventing negative values categorically is certainly at odds with what I just proposed, but I'd still consider it an improvement.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interesting improvement:

PS C:\temp> 'a b c d' -split ' ', 2
a
b c d
PS C:\temp> 'a b c d' -split ' ', 1
a b c d
PS C:\temp> 'a b c d' -split ' ', 0
a
b
c
d
PS C:\temp> 'a b c d' -split ' ', -1
a b c d
PS C:\temp> 'a b c d' -split ' ', -2
a b c
d

Copy link
Contributor

@mklement0 mklement0 Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov: Yes, though your -2 behavior is what I suggest for -1 - analogous to index [-1] retrieving the last element of a collection.
Not sure if providing a way to not split at all is necessary.

> 'a b c d' -split ' ', -1  # last 1
a b c
d

> 'a b c d' -split ' ', -2  # last 2
a b
c
d

Also, it's probably worth always returning a first item that represents the remaining, unsplit part - even if that part is empty:

> 'a b' -split ' ', -2  # last 2, with nothing remaining
   # empty string  to signal that there's no remaining part
a
b

That enables the following idiom, irrespective of whether there's a remaining part (prefix) or not.

> $prefix, $tokens = 'a b' -split ' ', -2

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mklement0 I like the "breaking change". If @lzybkr approve we should open Issue Enhancement.

$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}

It "Binary split operator can works with freeform delimiter" {
$res = "a::b::c::d" -split "::"
$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}

It "Binary split operator can preserve delimiter" {
$res = "a1:b1:c1:d" -split "(1:)"
$res.count | Should Be 7
$res[0] | Should Be "a"
$res[1] | Should Be "1:"
$res[2] | Should Be "b"
$res[3] | Should Be "1:"
$res[4] | Should Be "c"
$res[5] | Should Be "1:"
$res[6] | Should Be "d"

$res = "a1:b1:c1:d" -split "1(:)"
$res.count | Should Be 7
$res[0] | Should Be "a"
$res[1] | Should Be ":"
$res[2] | Should Be "b"
$res[3] | Should Be ":"
$res[4] | Should Be "c"
$res[5] | Should Be ":"
$res[6] | Should Be "d"
}

It "Binary split operator can be case-insensitive and case-sensitive" {
$res = "abcBd" -split "B"
$res.count | Should Be 3
$res[0] | Should Be "a"
$res[1] | Should Be "c"
$res[2] | Should Be "d"

$res = "abcBd" -isplit "B"
$res.count | Should Be 3
$res[0] | Should Be "a"
$res[1] | Should Be "c"
$res[2] | Should Be "d"

$res = "abcBd" -csplit "B"
$res.count | Should Be 2
$res[0] | Should Be "abc"
$res[1] | Should Be "d"

$res = "abcBd" -csplit "B", 0 , 'IgnoreCase'
$res.count | Should Be 3
$res[0] | Should Be "a"
$res[1] | Should Be "c"
$res[2] | Should Be "d"
}

It "Binary split operator can works with script block" {
$res = "a::b::c::d" -split {$_ -eq "b" -or $_ -eq "C"}
$res.count | Should Be 3
$res[0] | Should Be "a::"
$res[1] | Should Be "::"
$res[2] | Should Be "::d"
}

}

Context "Binary split operator options" {
BeforeAll {
# Add '%' in testText2 in order to second line doesn't start with 'b'.
$testCases = @(
@{ Name = '`n'; testText = "a12a`nb34b`nc56c`nd78d"; testText2 = "a12a`n%b34b`nc56c`nd78d"; newLine = "`n" }
@{ Name = '`r`n'; testText = "a12a`r`nb34b`r`nc56c`r`nd78d"; testText2 = "a12a`r`n%b34b`r`nc56c`r`nd78d"; newLine = "`r`n" }
)
}

It "Binary split operator has no Singleline and no Multiline by default (new line = '<Name>')" -TestCases $testCases {
param($testText, $testText2, $newLine)
# Multiline isn't default
$res = $testText -split '^b'
$res.count | Should Be 1

# Singleline isn't default
$res = $testText -split 'b.+c'
$res.count | Should Be 1
}

It "Binary split operator works with Singleline (new line = '<Name>')" -TestCases $testCases {
param($testText, $testText2, $newLine)
$res = $testText -split 'b.+c', 0, 'Singleline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)"
$res[1] | Should Be "$($newLine)d78d"

$res = $testText2 -split 'b.+c', 0, 'Singleline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)%"
$res[1] | Should Be "$($newLine)d78d"
}

It "Binary split operator works with Multiline (new line = '<Name>')" -TestCases $testCases {
param($testText, $testText2, $newLine)
$res = $testText -split '^b', 0, 'Multiline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)"
$res[1] | Should Be "34b$($newLine)c56c$($newLine)d78d"
}

It "Binary split operator works with Singleline,Multiline (new line = '<Name>')" -TestCases $testCases {
param($testText, $testText2, $newLine)
$res = $testText -split 'b.+c', 0, 'Singleline,Multiline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)"
$res[1] | Should Be "$($newLine)d78d"

$res = $testText2 -split 'b.+c', 0, 'Singleline,Multiline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)%"
$res[1] | Should Be "$($newLine)d78d"

$res = $testText -split '^b.+c', 0, 'Singleline,Multiline'
$res.count | Should Be 2
$res[0] | Should Be "a12a$($newLine)"
$res[1] | Should Be "$($newLine)d78d"

$res = $testText2 -split '^b.+c', 0, 'Singleline,Multiline'
$res.count | Should Be 1
}

It "Binary split operator works with IgnorePatternWhitespace" {
$res = "a: b:c" -split ': '
$res.count | Should Be 2
$res[0] | Should Be "a"
$res[1] | Should Be "b:c"

$res = "a: b:c" -split ': ',0,'IgnorePatternWhitespace'
$res.count | Should Be 3
$res[0] | Should Be "a"
$res[1] | Should Be " b"
$res[2] | Should Be "c"
}

It "Binary split operator works with ExplicitCapture" {
$res = "a:b" -split "(:)"
$res.count | Should Be 3
$res[0] | Should Be "a"
$res[1] | Should Be ":"
$res[2] | Should Be "b"

$res = "a:b" -split "(:)", 0, 'ExplicitCapture'
$res.count | Should Be 2
$res[0] | Should Be "a"
$res[1] | Should Be "b"
}

It "Binary split operator works with SimpleMatch" {
$res = "abc" -split "B", 0, 'SimpleMatch,IgnoreCase'
$res.count | Should Be 2
$res[0] | Should Be "a"
$res[1] | Should Be "c"
}

It "Binary split operator works with RegexMatch" {
$res = "abc" -split "B", 0, 'RegexMatch,Singleline'
$res.count | Should Be 2
$res[0] | Should Be "a"
$res[1] | Should Be "c"
}

It "Binary split operator doesn't works with RegexMatch,SimpleMatch" {
{ "abc" -split "B", 0, 'RegexMatch,SimpleMatch' } | ShouldBeErrorId "InvalidSplitOptionCombination"
}
}

Context "Unary split operator" {
It "Unary split operator has higher precedence than a comma" {
$res = -split "a b", "c d"
$res.count | Should Be 2
$res[0][0] | Should Be "a"
$res[0][1] | Should Be "b"
$res[1] | Should Be "c d"
}

It "Unary split operator can split array of values" {
$res = -split ("a b", "c d")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow didn't know you can do that!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vors:
Turns out I didn't know either, mostly because the documentation:

  • in part suggests that only string scalars are supported (<String> in the syntax diagram)

  • where it does mention arrays (multiple input strings), mistakenly claims that only the binary form supports them ("To split more than one string, use the binary split operator ").

I've created a doc issue to address that.

(I originally thought that when you pass an array it simply gets stringified based on $OFS, but that's not the case.)

$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}

It "Unary split operator can split a string" {
$res = -split "a b c d"
$res.count | Should Be 4
$res[0] | Should Be "a"
$res[1] | Should Be "b"
$res[2] | Should Be "c"
$res[3] | Should Be "d"
}
}
}