Improve Null-Like behaviour in comparisons #10422

vexx32 · 2019-08-22T17:23:00Z

PR Summary

This changes the operator binders for -lt/-gt (etc.) to recognise null-like values the same as it does for null, so that it treats them as a 0 value when comparing to a numeric value.

[nullstring]::Value -gt 3 # False
[dbnull]::Value -gt 2 # False

PR Context

Fixes #10404

PR Checklist

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

iSazonov · 2019-08-23T05:10:15Z

One question is performance degradation.
I guess we need replace
public static bool IsNullLike(object obj) => obj == DBNull.Value || obj == NullString.Value || IsNull(obj);
with
public static bool IsNullLike(object obj) => IsNull(obj) || obj == DBNull.Value || obj == NullString.Value;

vexx32 · 2019-08-23T15:55:03Z

Good call, @iSazonov, made that change. 💖

src/System.Management.Automation/engine/runtime/Binding/Binders.cs

daxian-dbw · 2019-09-03T21:21:10Z

I played the changes a bit and found an issue:

> [System.DBNull]::Value -ge [NullString]::Value
True
> [NullString]::Value -le [System.DBNull]::Value
True

However, we don't consider [DBNull]::Value -eq [NullString]::Value , see https://github.com/PowerShell/PowerShell/pull/9794/files#diff-c6cad5a3db5d6bcb21228e483b7506d1R31-R42

GitHub
Consider DBNull.Value and NullString.Value the same as $null by vexx32 · Pull Request #9794 · PowerShell/PowerShell
PR Summary Closes #9561
Adds LanguagePrimitives.IsNullLike() method to account for DBNull.Value and NullString.Value so that they can be considered the same as a null value where sensible in Power...

vexx32 · 2019-09-03T21:42:24Z

@daxian-dbw yeah I can see how that would arise, for sure. It is an anomaly, but it makes some sense here.

With -ge operators we are explicitly using either only a string or a numeric comparison mode. As a result, both operands must be converted into a numeric form or a string form for the comparison. Both cast to 0 or an empty string if we treat them the same way PowerShell currently handles conversions to these types from $null.

Is there a better option available for us to use here?

iSazonov · 2019-09-04T03:30:46Z

Taking that [DBNull]::Value -eq [NullString]::Value is false we should fix -ge and -le I believe.

vexx32 · 2019-09-04T03:41:48Z

Is there a sensible result for either comparison? In what sense is one version of a null value numerically greater or lesser than another? Or should we be doing something very unusual and always returning $false for any such comparison?

In other words, should this be the case?

$NullString = [NullString]::Value
$DBNull = [DBNull]::Value

$NullString -ge $DBNull # -> false
$NullString -le $DBNull # -> false
$NullString -gt $DBNull # -> false
$NullString -lt $DBNull # -> false
$DBNull -ge $NullString # -> false
$DBNull -le $NullString # -> false
$DBNull -gt $NullString # -> false
$DBNull -lt $NullString # -> false

While this is possible, I'm not sure it's sensible. -ge, -gt, and their counterparts are deliberately implemented from the beginning as different from -eq and -ne. The latter convert types when needed only; the former will always evaluate as either a numeric or a string comparison.

It boils down to a pretty simple problem: what is the expected result here? In my view, it should be reasonably intuitive, and although perhaps initially surprising, I think the current behaviour makes more sense than the alternatives I can see at the moment.

iSazonov · 2019-09-04T04:31:22Z

We could add samples for $null in the list.

vexx32 · 2019-10-02T22:28:18Z

@iSazonov all comparisons with null-like values and $null reflect that they're equal. I'll have to verify -ge and -le but I believe they evaluate as equal in all cases. 🙂

vexx32 · 2019-10-03T20:17:07Z

@iSazonov updated list of what the suggested change $DBNull -ge $NullString => $false and $DBNull -le $nullString=> $false would look like with $null included:

(Personally, I don't think this makes sense; these comparisons have always worked on a slightly different basis to -eq/-ne as they're stringly numeric/string comparisons anyway.)

$NullString = [NullString]::Value
$DBNull = [DBNull]::Value

$NullString -ge $DBNull # -> false
$NullString -le $DBNull # -> false
$NullString -ge $null # -> true
$NullString -le $null # -> true

$NullString -gt $DBNull # -> false
$NullString -lt $DBNull # -> false
$NullString -gt $null # -> true (?)
$NullString -lt $null # -> false (?)

$DBNull -ge $NullString # -> false
$DBNull -le $NullString # -> false
$DBNull -ge $null # -> true
$DBNull -le $null # -> true

$DBNull -gt $NullString # -> false
$DBNull -lt $NullString # -> false
$DBNull -gt $null # -> true (?)
$DBNull -lt $null # -> false (?)

I think that despite $NullString and $DBNull being not equal to each other, comparisons with -ge or -le (or indeed -gt or -lt) are explicitly either based on numeric value or string sort order. As such, they should convert to 0 or empty string and give a $true result for both -ge/-le and a $false result for -gt/-lt — this is how the current implementation in this branch would work.

Thoughts, @daxian-dbw? 🙂

vexx32 · 2019-10-12T19:52:19Z

@daxian-dbw @SteveL-MSFT friendly ping for review! 💖 😊

SteveL-MSFT · 2019-10-15T00:44:41Z

test/powershell/Language/Scripting/NullRepresentatives.Tests.ps1

Any particular reason 0 is left hand side unlike the tests below? For completeness, it seems -le should be covered.

Not that I can recall. Will add the -le test as well. 🙂

Addressed and pushed new commits. Also rebased since this branch is pretty old. 🙂

Handling null-like items (DBNull.Value/NullString.Value) the same way as true null here allows database-interfacing code to do numeric comparisons such as `$dbEntry.Value -gt 4` and get a sensible response of true/false depending on operator. Without this change, these operations would result in errors.

Use -ge not -eq for test For fairly plain reasons, PS doesn't consider $null as equal to zero This is only a good approximation in definitively numeric circumstances

vexx32 · 2019-10-15T12:17:46Z

The -eq case doesn't return true because it's not explicitly a mathematical operation, equality is more than just math. A greater/less than comparison, though, is a mathematical operation. Or string sorting, in our cases.

Pretty sure I have your last option there completely implemented. There should be a test confirming your point about $null -lt 0 but I will double check.

mklement0 · 2019-10-15T12:25:33Z

Good point about 0 -ge $null / 0 -gt $null / 0 -le $null currently misbehaving too, @vexx32

There, the case is clear-cut: the numeric LHS should coerce the $null to a number, which should be 0.

Would this need to be a separate fix?

A greater/less than comparison, though, is a mathematical operation.

Well, it's whatever the type's IComparable implementation defines it to be - and that doesn't have to be numeric comparison, which is what we're - inconsistently - assuming here.

Or string sorting, in our cases.

Again, this comes from the LHS dictating the behavior (in line with what @KirkMunro says) - and the consistent implementation of that for $null would be to say: you cannot compare $null (except to itself, with -eq).

vexx32 · 2019-10-15T12:34:22Z

I am happy to commit that fix separately in another PR if we prefer. Now that I see where it went wrong, it's a fairly simple asymmetry to resolve.

iSazonov · 2019-10-15T13:15:29Z

Would this need to be a separate fix?

Will it again a breaking change?

mklement0 · 2019-10-15T14:05:38Z

@iSazonov:

It would technically be a breaking changing, but given that the current behavior is useless in the cases we'd be fixing, I don't expect that to be a problem.

Let me try to summarize the discussion, to make sure we're on the same page:

$null -eq 0 retains its current behavior of evaluating to $false, because $null as the LHS with -eq must continue to function as the is-it-$null[-like] test.
-ge / -le / -gt / -lt behavior will be modified so that $null on either side is implicitly treated like 0, to make the current behavior with these operators consistent.
While this distinction between -eq vs. -ge / -le / -gt / -lt behavior amounts to an overall inconsistency with respect to letting the LHS drive type coercions, given the current behavior, it strikes me as a reasonable way forward:
- The truly consistent behavior - failing with Cannot compare $null because it is not IComparable with $null as the LHS when combined with -ge / -le / -gt / -lt - is probably not that useful and may break existing scripts (though the treat-as-0 behavior currently only works in some cases - see tests below).
- Absent that, I suppose another option is to always return $false if the RHS isn't null-like, but that could again break existing scripts.

At the very least, the cases where $null[-like] is the RHS and currently doesn't behave like 0 with a numeric LHS should be fixed.

To put it in terms of Pester tests:

Null-like as the RHS:

# These currently fail and should be fixed in any event.
# $null[-like] should consistently be coerced to 0 in these cases.
0 -eq $null | Should -BeTrue
0 -ne $null | Should -BeFalse
0 -gt $null | Should -BeFalse
0 -le $null | Should -BeTrue

# These already work as expected.
0 -ge $null | Should -BeTrue
0 -lt $null | Should -BeFalse

Null-like as the LHS:

# Passes, and shouldn't change.
# Anything other than a null-like type on the RHS should result in $false
$null -eq 0 | Should -BeFalse

# These are currently the only cases with $null as the LHS were
# numeric interpretation works as expected:
$null -le 0 | Should -BeTrue
$null -lt 0 | Should -BeFalse

# These currently fail from a treat-null-as-0 perspective.
$null -ge 0 | Should -BeTrue
$null -lt 0 | Should -BeFalse

iSazonov · 2019-10-15T14:27:51Z

@mklement0 Thanks for the summarization!
This is twice in example above

0 -gt $null | Should -BeFalse

Difference $null -eq 0 and 0 -eq $null is too unexpected and I believe this should not be in the scripting language, it is very bad UX.

mklement0 · 2019-10-15T15:14:29Z

Thanks, @iSazonov, I've removed the duplicate test from my previous comment.

Difference $null -eq 0 and 0 -eq $null is too unexpected

I agree that it is unfortunate, but not fundamentally - asymmetrical -eq behavior is very much part of PowerShell (see below) - but because it creates a surprising behavioral difference between -eq / -ne on the one hand and -ge / -le / -gt / -lt on the other, which doesn't exist for other types (again, if $null -ge 0 is $true because $null is treated like 0, why wouldn't $null -eq 0 be $true as well?)

However, we cannot change $null -eq 0 / $null -ne 0, because historically $null -eq ... has been the only robust test for comparing any value to $null.

Pragmatically speaking, interpretation of LHS $null as 0 with -ge / -le / -gt / -lt does make sense, especially if you consider use of uninitialized variables that evaluate to $null on the LHS (e.g., if ($uninitializedVar -gt 0)), but the - unavoidable - downside is the different behavior of $uninitializedVar -eq ... / $uninitializedVar -ne ...

As for the fundamental asymmetry with -eq, due to type coercions: consider the following case:

PS> 2 -eq '+2'
True

PS> '+2' -eq 2
False

vexx32 · 2019-10-15T16:03:06Z

Yeah, we can't avoid that difference. If we try to mess around with that we'll get in the way of PS's array-filtering semantics with -eq and similar operators, as well as the type coercion semantics.

It is also a bit confusing, but it does follow clearly defined rules -- unlike the current behaviour with -ge/-le 😁

iSazonov · 2019-10-15T17:12:50Z

Yeah, we can't avoid that difference.

We have been living with this for a long time.
I think the best way for the project with a long history is to correct specific problems that arise in practical scenarios. As long as there is no significant business scenario, it makes no sense to make a breaking change. I would prefer such a conservative way until significant improvement ideas appear for which users will be ready to make significant changes in their bussines.

vexx32 · 2019-10-15T17:33:18Z

I understand your concerns there @iSazonov. But I would prefer the language behave as consistently as possible; the fewer strange exceptions to the rules are, the better.

@daxian-dbw can you or the @PowerShellTeam weigh in on this aspect? It's not critical to this PR, but I feel it would make sense to take care of within the scope of this PR if we're going to do so.

Quick summary: the -ge and -le operators treat $null as though it's a -epsilon (larger than any negative number but smaller than zero) value when the other side of the operator is a numeric value. This is counter to all other coercions involving $null as a number where it is considered equal to 0. I propose that 0 -le $null and 0 -ge $null both return $true, and $null be considered precisely 0 in a purely numeric context.

SteveL-MSFT · 2019-11-14T00:44:20Z

@PowerShell/powershell-committee reviewed this and is concerned that changing this behavior will break other scenarios that rely on the variants of null to be different due to loss of information. We did not conclude on this but agreed it isn't necessary for PS7.

vexx32 · 2019-11-14T01:28:52Z

@SteveL-MSFT is that specifically about having -ge and -le treat $null as 0 that we're wanting to retain? Or are you talking about this PR as a whole? (If so; the current behaviour for NullString/DBNull with -ge and -le is simply to throw an error, so I don't think we'd be breaking much here.) 🙂

SteveL-MSFT · 2019-12-04T23:48:02Z

cc @JamesWTruher

SteveL-MSFT · 2020-01-15T17:46:41Z

@PowerShell/powershell-committee has discussed this last week as part of a broader [DBNull] discussion. Concern is whether we made the right decision to allow [DBNull]::value -eq $null which has brought up many other related issues and also introducing regressions. We agreed to revert the change where [DBNull]::value -eq $null and encourage continued discussion particularly with DB users of PowerShell to understand the pain points related to use of DBNull and how to best address it in PowerShell thinking holistically across the language including other operators and method invocations, etc... that have come up.

iSazonov · 2020-01-16T03:11:01Z

encourage continued discussion particularly with DB users of PowerShell

Please add reference to the issue.

SteveL-MSFT · 2020-01-16T19:21:19Z

Created #11604