Skip to content

Conversation

@rjmholt
Copy link
Collaborator

@rjmholt rjmholt commented Jun 21, 2018

PR Summary

Refactors the code for ModuleSpecifications:

  • Use nameof where appropriate
  • Factor out a repeated entry.Key.ToString() call
  • Use a StringBuilder in the ToString() method
  • Dissect a trick comparison method into something simpler
  • Use HashCode.Combine() method

Each of the above is its own commit, so we can take any change out.

PR Checklist

@rjmholt rjmholt force-pushed the refactor-module-specification branch from 3e377af to f814d7f Compare June 21, 2018 01:45
@rjmholt rjmholt added Issue-Code Cleanup the issue is for cleaning up the code with no impact on functionality WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module labels Jun 21, 2018
@iSazonov
Copy link
Collaborator

@iSazonov iSazonov self-assigned this Jun 21, 2018
@rjmholt
Copy link
Collaborator Author

rjmholt commented Jun 21, 2018

@iSazonov I actually was thinking of that exact StackOverflow answer. I didn’t want to change the actual numerical result of GetHashCode() without approval on the basis that it could break things...

But I definitely agree that that would be a better hash algorithm and can change it over in this PR if that is desired.

@iSazonov
Copy link
Collaborator

iSazonov commented Jun 21, 2018

GetHashCode result is not persistent.
From https://msdn.microsoft.com/en-us/library/system.object.gethashcode%28v=vs.110%29.aspx

•You should not assume that equal hash codes imply object equality.

•You should never persist or use a hash code outside the application domain in which it was created, because the same object may hash across application domains, processes, and platforms.

If changing GetHashCode would breang something it would be a bug due to misuse of the method.

@rjmholt
Copy link
Collaborator Author

rjmholt commented Jun 21, 2018

Ah wasn't aware of the second principle! In that case I shall change it use Mr Skeet's implementation

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this I remembered that we have

internal static int CombineHashCodes(int h1, int h2, int h3, int h4, int h5)

in Utils.cs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well on the one hand it would be nice to deduplicate our code.

On the other hand:

  • The original intent of the refactor here was to roll the repeated operations into a loop. CombineHashCodes() doesn't have a params int[] hs overload, so we'd have to unroll the loop again.
  • The implementation in CombineHashCodes() still uses XOR. This StackOverflow answer discusses the implementation we borrowed from System.Web.Util.HashCodeCombiner and I notice that another answer is Jon Skeet again disrecommending XOR for hashcodes.

Not for this PR, but maybe it's worth:

  • Changing CombineHashCodes() to use the add/multiply method, and
  • Adding a CombineHashCodes(int h, param int[] hs) overload

If we did that it would probably then be worth moving the ModuleSpecificationComparer.GetHashCode() method over to use Utils.CombineHashCodes().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about XOR but we use it very frequently - I propose to don't block the PR and to postpone the change to a special PR where cleanup all code base. Please open new tracking Issue if you agree.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference. We have also:

internal static int SequenceGetHashCode<T>(this IEnumerable<T> xs)

public int GetHashCode(Tuple<ITypeName, TypeResolutionState> obj)

public override int GetHashCode(ICollection<T> obj)

@rjmholt rjmholt force-pushed the refactor-module-specification branch from 7150bf3 to 7628499 Compare June 21, 2018 18:09
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the only use of p is to initialize result why not just initialize result directly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured it looks better than just throwing 23 in there (I noticed a few stack overflow answers on this tended to say "I dunno why we used that number", so I thought it was worth calling them p and q). But not attached to it -- can happily change them back to inline constants.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the loop is nicer code it does mean that you doing an allocation where there wasn't one before.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I tried to look for a better way to do this (without the allocation), but I'm guessing it's not possible. Either we keep the allocation or we unroll the loop. Thought the loop was nicer initially, but can go either way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should avoid such allocations.
As we discussed above we could review all code base to clean up GetHashCode methods to use unified approach.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So everybody put the same questions I suggest now use
internal static int CombineHashCodes(int h1, int h2, int h3, int h4, int h5)
and defer improvements to new PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine CombineHashCodes may not get improved for a while if ever and for large numbers of modules, a better distribution is helpful. How about if I unroll the loop but use the new hash algorithm?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do end up updating the CombineHashCodes method, we should circle back here and call that.

@rjmholt rjmholt mentioned this pull request Jun 22, 2018
11 tasks
@rjmholt
Copy link
Collaborator Author

rjmholt commented Jun 22, 2018

Ok, I've written some tests in #7140. Please take a look.

@iSazonov
Copy link
Collaborator

Waiting #7140

@rjmholt rjmholt changed the title Refactor module specification logic WIP: Refactor module specification logic Jun 25, 2018
@rjmholt
Copy link
Collaborator Author

rjmholt commented Jun 25, 2018

Going to change the hashcode logic to use .NET Core's native implementation

@rjmholt rjmholt changed the title WIP: Refactor module specification logic Refactor module specification logic Jun 26, 2018
@iSazonov
Copy link
Collaborator

iSazonov commented Jun 28, 2018

@rjmholt Please rebase to get new tests from #7140.

@rjmholt rjmholt force-pushed the refactor-module-specification branch from 314354a to cd065f8 Compare June 28, 2018 16:59
@rjmholt
Copy link
Collaborator Author

rjmholt commented Jun 28, 2018

It looks like this is passing all the new tests I wrote 😃

@iSazonov
Copy link
Collaborator

@BrucePay @daxian-dbw Could you please review the PR?

@iSazonov iSazonov merged commit de41ca5 into PowerShell:master Jul 3, 2018
@rjmholt rjmholt deleted the refactor-module-specification branch June 11, 2021 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Issue-Code Cleanup the issue is for cleaning up the code with no impact on functionality WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants