-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Remove extra cast to PSObject in ToStringEnumerable() method #10389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove extra cast to PSObject in ToStringEnumerable() method #10389
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By wrapping the object into a PSObject, PSObject.ToString will be able to find the ToString(...) script method/code method defined for the type of the object in types.ps1xml files, so as to use that in precedence.
So simply removing this step will be a breaking change.
I looked at this issue myself, and currently ToStringEnumerator and ToStringEnumerable behave differently in terms of how the ToString(...) methods defined in types.ps1xml files are honored for the element objects.
From the completeness perspective, ToStringEnumerator should do the same as ToStringEnumerable -- wrapping the object before passing to PSObject.ToString.
A simpler and narrow-scoped fix would be to special case string objects:
if (obj != null)
{
object mshObj = obj -is string ? obj : PSObject.AsPSObject(obj);
returnValue.Append(PSObject.ToString(context, mshObj, separator, format, formatProvider, false, false));
}It's technically possible that a ToString method is defined for string type in types.ps1xml, but I don't think that makes any sense practically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts were that:
- having TryFastTrackPrimitiveTypes() implies that we don't want to find ToString() in types.ps1xml for primitive types.
- it is unlikely breaking change. Why would anybody want to define custom ToString() for primitive types? I see no use. Even if so then I believe right way is to define a custom type with custom ToString().
- only
-Joinis affected and affected only on half as you mentioned - ToStringEnumerator() does not cast to PSObject. So we should fix either ToStringEnumerable() or ToStringEnumerator().
I tend to accept the unlikely breaking change and fix ToStringEnumerable(), not ToStringEnumerator() - we get more benefits (great perf win as minimum).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would anybody want to define custom ToString() for primitive types?
The thing is, you don't know whether the element of enumerable here is a primitive type. I can perfectly do (Get-ChildItem c:\) -join ";".
For primitive types, I agree it's unlikely that a ToString method is defined in ETS for them, but they are more likely than string (yes, string is not a primitive type).
only -Join is affected and affected only on half as you mentioned
This is not true. PSObject.ToStringEnumerable is used in PSObject.ToString recursively, and thus any calls to PSObject.ToString could be affected.
ToStringEnumerator() does not cast to PSObject. So we should fix either ToStringEnumerable() or ToStringEnumerator().
ToStringEnumerator() is obviously the one that should be fixed. It's missing a functionality that powershell should have provided -- honor the custom ToString method defined in ETS for a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explanations!
If we call PSObject.ToString() for non-PSObject object we always falls in IFormattable code path as last resort and don't honor ETS - is it right?
The IFormattable code path doesn't allow me to fix PSObject.ToString() in universal way so I think we could call TryFastTrackPrimitiveTypes() in ToStringEnumerable() (and ToStringEnumerator()) before casting to PSObject and calling PSObject.ToString().
Last commit demo this.
You will need to update this PR description since casting to |
|
I will add perf results after we get a consensus about that the fix should be. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DateTime has a higher chance to get a custom ToString definition in ETS.
I think we should just strictly restrict the special handling to string and true primitive types (namely type.IsPrimitive is true).
var objType = obj.GetType();
if (objType != typeof(string) && !objType.IsPrimitive)
{
obj = PSObject.AsPSObject(obj);
}
returnValue.Append(PSObject.ToString(context, obj, separator, format, formatProvider, false, false));
returnValue.Append(separatorToUse);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about Decimal? Also the PR intention is performance. In your sample we add more checks.
Then I think about new test and get:
>1,$PSVersionTable,2 -join ";"
1;System.Management.Automation.PSVersionHashTable;2
>1,@(3,4),2 -join ";"
1;System.Object[];2I would expect recursion to work. What do you thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my mind, a perf change shouldn't result in any behavioral changes, and that's why I'm concerned about you being aggressive here.
I would much prefer to be conservative
- Only treat string specially and keep on wrapping other values into
PSObjectif we go for the most conservative manner; - Only treat
IsPrimitiveobjects specially and keep wrapping other values if we go a bit less conservative
As for decimal, it's OK that we don't cover it. Again, it's not a functionality change that we should cover all scenarios, but a perf change that targets improve common scenarios.
As for IntPtr being a primitive type, I think it's fine too, as it's not really possible to use IntPtr directly in PowerShell script anyways.
My personal preference: only treat string specially. /cc @JamesWTruher @rjmholt what's your thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a perf change shouldn't result in any behavioral changes, and that's why I'm concerned about you being aggressive here
I am full agree and very appreciate your help!
The "aggressive" come from original issue where we consider primitive types too. I agree that for join operator strings are the most expected types for the operator. Nevertheless, I believe that there are many scenarios (like custom output formatting) when users will use the join operator in loops with primitive types. Notice that we change only an intermediate code and we added tests to exclude breaking change. So I believe we could be less conservative here. But I do not found how test the enumerator change and I'd revert the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/cc @JamesWTruher @rjmholt what's your thoughts on this?
I think specialising for string only makes the most sense. It's by far the most common, and the place where we don't break things. I don't think there are many people out there doing lots of string joins on other primitive types, but by breaking ETS we'd lose functionality
|
Tests should be added to verify that both |
17306e7 to
8c72633
Compare
|
I did not find how make a test for ToStringEnumerator(). |
|
The thing I'm most concerned about is the breaking nature of this change (and not being labeled initially as such). I think that we need to be very careful about addressing performance issues which result in behavioral change. And in this case, I'm not sure it's needed. I hope that we can query for type extensions (and specifically a I think there's far too much utility in ETS to turn our back on them. It doesn't matter if I can't see why anyone would do that, I'm sure our customers would be able to provide a number of reasons. That's the nature of a platform - it will be used in ways we can't imagine. |
|
@JamesWTruher Thanks! I see your point about ETS and a breaking change. Please see the follow example: Add-Type -TypeDefinition @"
namespace TestNS
{
public class TestClass
{
public override string ToString()
{
return "1";
}
}
}
"@
$a1=[TestNS.TestClass]::new()
$a1
# Output is "1"
$a1 -join ";"
# Output is "1"
Update-TypeData -TypeName "TestNS.TestClass" -MemberType ScriptMethod -MemberName "ToString" -Value { return "test"}
$a1
# Output is "test"
$a1 -join ";"
# Output is "1" - unexpected?
$a1 -like "1"
# Output is "True" - unexpected?
$a1,$a1 -like "1"
# Output is - unexpected?
# "1"
# "1"The example demonstrates two inconsistency (possible fix will be a breaking change):
Now question is - should we have a consistency here and accept a breaking change, |
|
@daxian-dbw It seems we have gathered the opinions of all who wanted to speak out. Common conclusion is a concern about ETS and that only string type could be fixed. Based on my previous comment I thing the string type case is also subject to the concern and I think close the PR and related issue as "By-design" with workaround to use |
PR Summary
PerfView screenshot: left - before the change, right - after the fix.

PR Context
Fix #10377
For the scenario:
in
$w2we have an array of strings (not PSObjects!) and join the strings. Before the fix in ToStringEnumerable() we did extra cast every string from $w2 to PSObject which caused unnecessary memory allocations and slow down the scenario because of skipping fast code path for value types.PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.