Skip to content

Conversation

@iSazonov
Copy link
Collaborator

@iSazonov iSazonov commented Mar 4, 2019

PR Summary

  • Use Type.EmptyTypes, remove custom implementation
  • Use Array.Empty(), remove custom implementation

PR Context

The APIs was missing in .Net Core 1.0/1.1 and is now in .Net Core 2.0/2.1 and we can remove our temporary code.

PR Checklist

@iSazonov iSazonov force-pushed the remove-emptytypes branch from 4bca7ef to 677366a Compare March 4, 2019 12:57
@iSazonov iSazonov added the CL-CodeCleanup Indicates that a PR should be marked as a Code Cleanup change in the Change Log label Mar 4, 2019
@iSazonov iSazonov changed the title WIP: Use Type.EmptyTypes and Array.Empty<T>() Use Type.EmptyTypes and Array.Empty<T>() Mar 4, 2019
@lzybkr
Copy link
Contributor

lzybkr commented Mar 4, 2019

Can you verify that Array.Empty is faster than the private implementation? I recall measuring and surprisingly the private implementation was faster despite the code being identical - presumably the assembly containing the generic code has (had?) some impact on code generation.

IIRC, the private code was necessary to keep Windows PowerShell running with supported versions of the desktop CLR, but I measured anyway because I was curious.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 5, 2019

Can you verify that Array.Empty is faster than the private implementation?

@lzybkr I believe that make sense only after we move to .Net Core 3.0 - the new runtime has huge performance improvements. Although now I could do a simple test using BenchmarkDotNet.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 6, 2019

It seems performance of both comparable with .Net Core 3.0 Preview2 taking in account measure errors.

BenchmarkDotNet=v0.11.4, OS=Windows 10.0.17763.253 (1809/October2018Update/Redstone5)
Intel Core i5-2410M CPU 2.30GHz (Sandy Bridge), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.0.100-preview-010184
  [Host]      : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT
  TC Disabled : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT
  TC Enabled  : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT

Runtime=Core  Toolchain=.NET Core 3.0  
Method Job EnvironmentVariables Mean Error StdDev Median Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
CoreFXEmptyArray TC Disabled COMPlus_TieredCompilation=0 4.2667 ns 0.1326 ns 0.1241 ns 4.3343 ns 1.00 0.00 - - - -
CustomEmptyArray TC Disabled COMPlus_TieredCompilation=0 4.3588 ns 0.1441 ns 0.1348 ns 4.4062 ns 1.02 0.04 - - - -
CoreFXEmptyArray TC Enabled COMPlus_TieredCompilation=1 0.0047 ns 0.0108 ns 0.0101 ns 0.0000 ns ? ? - - - -
CustomEmptyArray TC Enabled COMPlus_TieredCompilation=1 0.0349 ns 0.0406 ns 0.0380 ns 0.0252 ns ? ? - - - -

@lzybkr
Copy link
Contributor

lzybkr commented Mar 6, 2019

@iSazonov - did your benchmark use types defined in System.Management.Automation? I seem to recall that was where I saw a difference.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 6, 2019

The test code is:

        [Benchmark(Baseline = true)]
        public string[] CoreFXEmptyArray()
        {
            return Array.Empty<string>();
        }

        [Benchmark]
        public string[] CustomEmptyArray()
        {
            return EmptyArray<string>();
        }

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 6, 2019

String and object - these are most common.

@lzybkr
Copy link
Contributor

lzybkr commented Mar 6, 2019

Do try types in other assemblies. The code is minimal, and if there is a performance difference, I see no reason for the change.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 7, 2019

Today I can not compile SMA with .Net Core 3.0 and get types from it.

I recall measuring and surprisingly the private implementation was faster despite the code being identical - presumably the assembly containing the generic code has (had?) some impact on code generation.

What runtime you measure with? I guess .Net FrameWork? .Net Core 3.0 has a lot of new optimizations. I saw that 3.0 more faster 2.1 while working on Simple Case Folding.
Previously the custom code (EmptyArray) could be faster because of AggresiveInlining attribute. Both Framework and Core haven't this attribute but 3.0 uses TC and new code gen.

In my measure 3.0 generate identical code for both if TC=0 and different code if TC=1 - the second is so fast that error > value (I could get better result). Disasm file attached

System.Text.CaseFolding.IntroBenchmarkBaseline-asm.raw.html.zip

@lzybkr
Copy link
Contributor

lzybkr commented Mar 7, 2019

I most likely measured it with 4.6.2.

I don't know how important SMA is, you could try using a type defined in you benchmark assembly to see if that is different than object or string.

@iSazonov
Copy link
Collaborator Author

iSazonov commented Mar 7, 2019

BenchmarkDotNet=v0.11.4, OS=Windows 10.0.17763.253 (1809/October2018Update/Redstone5)
Intel Core i5-2410M CPU 2.30GHz (Sandy Bridge), 1 CPU, 4 logical and 2 physical cores
.NET Core SDK=3.0.100-preview-010184
  [Host]      : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT
  TC Disabled : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT
  TC Enabled  : .NET Core 3.0.0-preview-27324-5 (CoreCLR 4.6.27322.0, CoreFX 4.7.19.7311), 64bit RyuJIT

Runtime=Core  Toolchain=.NET Core 3.0  
Method Job EnvironmentVariables Mean Error StdDev Median Ratio RatioSD
CoreFXEmptyArrayString TC Disabled COMPlus_TieredCompilation=0 4.1844 ns 0.1102 ns 0.1031 ns 4.2175 ns 1.00 0.00
CoreFXEmptyArrayTestType TC Disabled COMPlus_TieredCompilation=0 4.2873 ns 0.1597 ns 0.1494 ns 4.1989 ns 1.03 0.05
CustomEmptyArrayString TC Disabled COMPlus_TieredCompilation=0 4.3933 ns 0.1218 ns 0.1140 ns 4.4280 ns 1.05 0.04
CustomEmptyArrayTestType TC Disabled COMPlus_TieredCompilation=0 4.3542 ns 0.1843 ns 0.1724 ns 4.3928 ns 1.04 0.06
CoreFXEmptyArrayString TC Enabled COMPlus_TieredCompilation=1 0.0539 ns 0.0530 ns 0.0496 ns ? ?
CoreFXEmptyArrayTestType TC Enabled COMPlus_TieredCompilation=1 0.0589 ns 0.0858 ns 0.0803 ns 0.0395 ns ? ?
CustomEmptyArrayString TC Enabled COMPlus_TieredCompilation=1 0.0601 ns 0.0602 ns 0.0563 ns 0.0473 ns ? ?
CustomEmptyArrayTestType TC Enabled COMPlus_TieredCompilation=1 0.0859 ns 0.0604 ns 0.0535 ns 0.0933 ns ? ?

The same results. For TC=0 we have reliable results and approximately equal values ​​within the error. For TC=1 error exceeds the value - all is very fast.

I most likely measured it with 4.6.2.

This suggests that this measures should be repeated for .Net Core (for each version with which we release?).

@lzybkr
Copy link
Contributor

lzybkr commented Mar 7, 2019

Thanks for testing - I don't have any concerns with the change.

Copy link
Member

@daxian-dbw daxian-dbw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @iSazonov for making these changes!

@daxian-dbw daxian-dbw merged commit 759c4ab into PowerShell:master Mar 10, 2019
@iSazonov iSazonov deleted the remove-emptytypes branch March 10, 2019 05:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CL-CodeCleanup Indicates that a PR should be marked as a Code Cleanup change in the Change Log

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants