Conversation
This avoids performing the potentially expensive hash computation in the event that the Hash is empty, as there cannot possibly be a match.
In the event that the `#hash` method has side effects, any optimisation with respect to empty hashes will affect the results of the operations. For MRI, the empty-hash optimisation seems to be applied everywhere in 2.0.0 and beyond. In 1.9.3 the #hash method is not called for #[] and #key?, but strangely enough the performance is not improved for empty hashes. In 1.8.7, the #hash method is always called even if empty. This does not match the semantics of MRI exactly, as this avoids calling the #hash method from #values_at in 1.9.3, but that would be slightly more involved to implement.
|
jruby_1_7 192cfb5, 2.0 mode, java 8u11, warm JIT, indy off: same as above, with patch applied: Excellent speedup. I'll test this against master too and post those results. |
|
master (abf3c49), JIT off, indy off, warm: master (abf3c49) with patch applied, same as above: |
|
Sorry for dithering on this for so long. My two observations on it when I originally looked it was:
I guess an empty check is not a huge added complexity but it would be really cool if someone could qualify how often this case happens. Personally, I am not against merging this, but it would be nice to know if this can be measured outside this microbenchmark. |
|
@enebo, in terms of motivation, I think the Ruby idiom Regarding the side-effects, I take it you'd prefer if I remove the second commit? |
|
@grddev I am cool with the second commit. I mostly was just saying following it is not what I consider to be semantics (so more qualitative statement about behavior like this not being mandatory). As for the justification I was hoping for some actual usage statistics. I know active_support has it a lot in code but that does not mean it is actually used a lot (consider many methods pass in an options hash so opts = {} never actually executes. That said, I have talked too much about this as it is a simple optimization and MRI decided it is worth doing. Sorry to drag too long on this :) |
While running some performance comparisons between MRI and JRuby, I realised that MRI (or at least new versions of) optimises access to empty hashes. For that purpose I implemented a small benchmark:
In addition to checking the performance of a few operations against an empty and a small Hash, it also tries the operation with a
Detectorinstance, which has a side-effect in its#hashmethod in order to detect wether the method is actually called or not in the different Ruby versions I tested.The pull request consists of two commits, one which implements the general performance tweak (which would certainly be enough for master) and another which strives to restrict the optimisation to the behaviour of the emulated MRI version. The solution is not perfect, but I'm also not sure it is even worthwhile to emulate the behaviour in this case. I strongly suspect that the number of times the
#hashfunction is called is not part of any meaningful contract forHash.The JRuby behaviour without this pull request is in line with MRI 1.8.7, but is technically wrong for 1.9.3 and beyond (I don't have any versions between 1.8.7 and 1.9.3 to verify exactly when MRI implemented their change). Personally, I would vote for implementing the performance tweak independent of version, and potentially break an implementation relying on
#hashside effects.Below are results first for JRuby 1.7.2, and then after that MRI versions 1.8.7, 1.9.3, and 2.0.0 followed by their emulated counterparts with the pull request applied.
jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]
ruby 2.0.0p451 (2014-02-24 revision 45167) [x86_64-darwin12.5.0]
jruby 1.7.13-SNAPSHOT (2.0.0p195) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]
ruby 1.9.3p545 (2014-02-24 revision 45159) [x86_64-darwin13.1.0]
jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]
ruby 1.8.7 (2013-06-27 patchlevel 374) [i686-darwin13.1.0]
jruby 1.7.13-SNAPSHOT (ruby-1.8.7p370) 2014-05-01 68d4185 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_40-b40 [darwin-x86_64]