Improve performance of Group-Object

As of pwsh 6.1 preview 4, `Group-Object` shows quadratic performance as the number of unique values approaches `n`, the number of input items.

By gathering the input first, sorting it, and then comparing each new value only to the last group, we can come much closer to `n * log(n)` instead of `n * n`.
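The idea can be sketched as follows. This is a minimal Python illustration, not the prototype's code (which is not shown in this issue); the function names `group_linear_scan` and `group_sorted` are hypothetical, and the linear scan is only an assumed model of the behavior that becomes quadratic when most values are unique.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
K = TypeVar("K")


def group_linear_scan(items: Iterable[T], key: Callable[[T], K]) -> List[Tuple[K, List[T]]]:
    """Assumed baseline: every item is compared against every existing group.

    With u unique keys this costs O(n * u), which approaches O(n * n)
    as u approaches n.
    """
    groups: List[Tuple[K, List[T]]] = []
    for item in items:
        k = key(item)
        for group_key, members in groups:
            if group_key == k:
                members.append(item)
                break
        else:
            # No existing group matched, so start a new one.
            groups.append((k, [item]))
    return groups


def group_sorted(items: Iterable[T], key: Callable[[T], K]) -> List[Tuple[K, List[T]]]:
    """Gather the input, sort it by key, then compare each item to the last group only."""
    gathered = sorted(items, key=key)  # O(n log n); Python's sort is stable
    groups: List[Tuple[K, List[T]]] = []
    for item in gathered:
        k = key(item)
        if groups and groups[-1][0] == k:
            # Equal keys are adjacent after sorting, so one comparison suffices.
            groups[-1][1].append(item)
        else:
            groups.append((k, [item]))
    return groups
```

Both functions produce the same groups for the same input; only the order in which the groups are emitted differs, since the sorted variant emits them in key order.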

I have a prototype with the following perf measurements:

Count | Unique | OldImpl | NewImpl | Speedup | Command
------|--------|---------|---------|---------|--------
10689 | 8220 | 00:00:06.81 | 00:00:00.23 | 29.1 | $allItemsInPowerShellSrcTree \| group {[io.path]::GetFileName($_)}
1690765 | 3761 | 00:02:30.34 | 00:00:22.32 | 6.7 | $u \| group

where `$u` is a dataset of string values, of which about 3700 are unique.

The only downside I have seen is that the order of the output objects is different, ~~and so is the order within the groups~~.

Is that part of the public contract?
~~Is it worth a PR?~~ 
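To make the ordering difference concrete, here is a small Python sketch. It assumes the existing implementation emits groups in order of first appearance, and it uses a stable sort for the new approach; the sample paths and the `name` helper are made up for illustration.

```python
from itertools import groupby

# Hypothetical sample input: file paths grouped by file name.
paths = ["src/b.cs", "test/a.cs", "docs/b.cs", "src/a.cs"]


def name(path: str) -> str:
    """Return the part of the path after the last '/'."""
    return path.rsplit("/", 1)[-1]


# Assumed current behavior: groups appear in order of first occurrence.
first_seen = {}
for p in paths:
    first_seen.setdefault(name(p), []).append(p)

# Sort-based grouping: groups appear in sorted key order instead.
by_sort = {k: list(g) for k, g in groupby(sorted(paths, key=name), key=name)}

print(list(first_seen))  # ['b.cs', 'a.cs']
print(list(by_sort))     # ['a.cs', 'b.cs']

# A stable sort keeps the original relative order inside each group.
print(first_seen["b.cs"] == by_sort["b.cs"])  # True
```

Under these assumptions only the order of the groups changes, while the order of items within each group is preserved, which matches the struck-through remark above.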

Improve performance of Group-Object #7409
