-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Add hash to ModuleAnalysisCache filename to differentiate different pwsh on the system #8174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This is a breaking change, isn't it? |
|
@anmenaga the analysis cache file is an internal implementation detail and shouldn't be used directly by users |
| { | ||
| featureNames[index++] = featureName.ToLowerInvariant(); | ||
| } | ||
| hashBytes = sha1.ComputeHash(Encoding.UTF8.GetBytes(Utils.DefaultPowerShellAppBase)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why use a cryptographic hash at all? Why not use .GetHashCode()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a stable result but the GetHashCode could be randomized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe GetHashCode() in .NET Core is now randomized as @iSazonov stated and we need something stable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CRC and MD5 are stable and both take less computation than SHA1. I don't believe this is a security issue, if it is, we shouldn't use SHA1. CRC is relatively simple to calculate we can implement it ourselves as I don't see anything to calculate it in DotNet Core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/dotnet/corefx/search?utf8=%E2%9C%93&q=crc32&type=
but I don't understand how much we will win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I presume that this has to be calculated at most every startup. Switching to a less intensive hash algorithm will reduce our startup time which is a big factor in how people perceive performance.
cc @daxian-dbw
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TravisEz13 and I discussed this offline yesterday and I agree that we should go with CRC. The main gain is to potentially avoid loading System.Security.Cryptography.Algorithms.dll at powershell startup time, plus there is also gain in CPU time when running the hash algorithm. I will do this change as part of my startup improvement work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got this chart from StackOverflow about the collision rate of CRC32:

In our scenario, hashing would be done on each powershell core installation on a machine, and in case of any experimental features is enabled, the combination of feature names. I think 100 different hash inputs should be a fair large number for a local machine environment in our scenario, and thus the collision rate would be less than 1 in a million, which should be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use SSE from System.Runtime.Intrinsics.Experimental. We get this in runtime after .Net Core 3.0 release.
PR Summary
It will be common for users to have at least pwsh and pwsh-preview on the system. Having them share the same ModuleAnalysisCache may result in unexpected behavior as the wrong module may be loaded. Fix is to add a 8 character hash derived from
$PSHOMEpath.Fix #8172
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:to the beginning of the title and remove the prefix when the PR is ready.[feature]if the change is significant or affects feature tests