Closed
Labels
Breaking-Change: breaking change that may affect users
Committee-Reviewed: PS-Committee has reviewed this and made a decision
Issue-Enhancement: the issue is more of a feature request than a bug
Resolution-Fixed: the issue is fixed
WG-Cmdlets-Utility: cmdlets in the Microsoft.PowerShell.Utility module
Description
As of pwsh 6.1 preview 4, group (Group-Object) shows quadratic performance as the number of unique values approaches n, the number of input objects.
By gathering the input first, sorting it, and then comparing each new value only against the last group, we can come much closer to n * log(n) instead of n * n.
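The idea above can be sketched in a few lines of PowerShell. This is only an illustration, not the prototype: `Group-Sorted` is a hypothetical helper name, and it assumes non-null scalar keys that `Sort-Object` and `-ne` can compare (both are case-insensitive for strings by default). After sorting, equal values are adjacent, so each item only needs to be compared with the most recently created group.

```powershell
# Hypothetical helper illustrating sort-then-group; not the actual cmdlet code.
function Group-Sorted {
    param(
        [object[]] $InputObject
    )

    # Step 1: gather and sort the input, roughly O(n log n).
    $sorted = $InputObject | Sort-Object

    $groups  = [System.Collections.Generic.List[object]]::new()
    $current = $null

    # Step 2: single linear pass; compare each item only with the last group.
    foreach ($item in $sorted) {
        if ($null -eq $current -or $current.Name -ne $item) {
            # The value differs from the last group's key, so start a new group.
            $current = [pscustomobject]@{
                Name  = $item
                Group = [System.Collections.Generic.List[object]]::new()
            }
            $groups.Add($current)
        }
        $current.Group.Add($item)
    }

    # Emit the groups (in sorted key order, not first-seen order).
    $groups
}

# Usage: prints "a: 2", "b: 2", "c: 1"
Group-Sorted -InputObject 'b', 'a', 'b', 'c', 'a' |
    ForEach-Object { '{0}: {1}' -f $_.Name, $_.Group.Count }
```

Note that the groups come out in sorted key order rather than in the order the keys were first seen, which is the ordering difference raised further down.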
I have a prototype with the following perf measurements:
| Count | Unique | Old impl (hh:mm:ss) | New impl (hh:mm:ss) | Speedup (x) | Command |
|---|---|---|---|---|---|
| 10689 | 8220 | 00:00:06.81 | 00:00:00.23 | 29.1 | `$allItemsInPowerShellSrcTree \| group {[io.path]::GetFileName($_)}` |
| 1690765 | 3761 | 00:02:30.34 | 00:00:22.32 | 6.7 | `$u \| group` |
where $u is a dataset of string values, of which roughly 3700 are unique.
The only downside I have seen is that both the order of the output objects and the order of the items within each group are different.
Is that part of the public contract?
Is it worth a PR?