Optimize SipHash using sun.misc.Unsafe #1681
As these loops will be executed only once for every #hash invocation, it would make sense to defer the decision to unroll the loops to the runtime.
In principle, this should allow the JIT compiler to remove all range checks within the loop. I haven't had time to verify this though.
While one could wish that JIT compilation optimised the eight sequential byte reads into a single long read, it in fact does not. This implementation should fall back to the slow implementation in a context where Unsafe fails to load, but I haven't figured out how to test that properly.
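For reference, the two read paths discussed here can be sketched roughly as follows. This is a minimal sketch, not the code from the patch: the class `LongReader` and its method names are hypothetical, and the probe read in `loadUnsafe` is just one possible way to detect a runtime where Unsafe is unusable.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;
import sun.misc.Unsafe;

// Hypothetical helper for illustration; not the class used in the patch.
final class LongReader {
    // null when Unsafe is unusable or the native byte order is not little endian
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long BASE =
            UNSAFE == null ? 0L : UNSAFE.arrayBaseOffset(byte[].class);

    private static Unsafe loadUnsafe() {
        // A raw 8-byte read only matches SipHash's little-endian layout
        // when the hardware itself is little endian.
        if (ByteOrder.nativeOrder() != ByteOrder.LITTLE_ENDIAN) return null;
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            Unsafe u = (Unsafe) f.get(null);
            // Probe once so that a runtime refusing the call is detected here.
            u.getLong(new byte[8], (long) u.arrayBaseOffset(byte[].class));
            return u;
        } catch (Throwable t) {
            // Any failure means we stay on the portable path.
            return null;
        }
    }

    static boolean unsafeAvailable() {
        return UNSAFE != null;
    }

    // Portable path: assemble the little-endian long one byte at a time.
    static long readSafe(byte[] src, int offset) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result |= (src[offset + i] & 0xffL) << (8 * i);
        }
        return result;
    }

    // Fast path when available, otherwise the portable path.
    static long read(byte[] src, int offset) {
        if (UNSAFE == null) return readSafe(src, offset);
        // Unsafe performs no bounds checking, so check explicitly.
        if (offset < 0 || offset > src.length - 8) {
            throw new ArrayIndexOutOfBoundsException(offset);
        }
        return UNSAFE.getLong(src, BASE + offset);
    }
}
```

One way to exercise the fallback in a test might be to call `readSafe` and `read` on the same input and compare the results, independent of whether `unsafeAvailable()` is true on the test machine.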
You can just assign byte value to variable of type long:
byte byteValue = 127;
long longValue = byteValue;
@dkarpenko, while I did not write this code (you can see the same code in the original), I think the reason for the long type conversion is that the shifts should be applied to longs, as they would otherwise produce ints if I'm not mistaken.
@grddev I think you're right about that. Bit twiddling can be a little tweaky with Java's automatic type promotions.
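To make the promotion issue concrete, here is a small sketch (the class and method names are hypothetical). Without the widening to long, the shift is performed on an int and Java uses only the low five bits of the shift distance, so a shift by 40 silently becomes a shift by 8.

```java
// Hypothetical demo class; not part of the patch.
final class ShiftPromotion {
    // The byte is promoted to int before shifting, so the distance
    // is taken modulo 32: this computes b << 8, not b << 40.
    static long shiftAsInt(byte b) {
        return b << 40;
    }

    // Widening to long first makes the shift a 64-bit operation.
    static long shiftAsLong(byte b) {
        return ((long) b) << 40;
    }
}
```

With `b = 1`, `shiftAsInt` yields 256 while `shiftAsLong` yields 1099511627776.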
Hey great find. If I remember right, I did a quick experiment with Unsafe when we put siphash into the codebase but never actually landed this. Your patches look good; we'll review and get it installed. I'm not sure we'll want to make SipHash the default, since as you say PerlHash is still faster for small strings (and I wouldn't expect people to be hashing really big strings) but it will be excellent to get SipHash closer to raw native performance. Thank you!
Here's my try in 2012. I added some uncommitted sources (Murmur, Perl, etc.) so that Benchmark.java should run. I don't remember the details, but the Unsafe version is slower. Mine calls Long.reverseBytes(), so the optimization may not have been enough.
@nahi, I tried the benchmark with an added Long.reverseBytes() in the reader, and here are the results:
jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]
This is indeed not faster than the 'safe' version, thus the unsafe version is only relevant with matching endianness.
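As a rough illustration of the endianness point (a sketch with a hypothetical class name, using ByteBuffer rather than Unsafe so it behaves the same on any machine): the same eight bytes decode to different longs depending on byte order, and Long.reverseBytes() converts one into the other. That extra conversion is the step a raw native read needs when the native order does not match what SipHash expects.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; not part of the patch or the benchmark.
final class EndianDemo {
    // Decode the first eight bytes of src using the given byte order.
    static long readAs(byte[] src, ByteOrder order) {
        return ByteBuffer.wrap(src).order(order).getLong();
    }

    // True when a raw native read would already be little endian,
    // i.e. when no Long.reverseBytes() call would be needed.
    static boolean nativeMatchesSipHash() {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;
    }
}
```

For the bytes 1 through 8, the little-endian decoding is 0x0807060504030201 and the big-endian decoding is 0x0102030405060708; applying Long.reverseBytes() to either gives the other.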
Going for it. Thanks!
Optimize SipHash using sun.misc.Unsafe
As SipHash is designed to consume its input 8 bytes at a time, a key part of the algorithm is reading eight bytes at once. Unfortunately, there is no way to directly read eight bytes at once from a byte array without resorting to Unsafe. This implements the unsafe operation with a fallback to the original slow implementation for cases where Unsafe cannot be loaded (or where the native byte ordering is not little endian, as needed by SipHash).
The following performance comparison, which tries strings of increasing lengths, shows that the unsafe implementation is roughly 25% faster for really long strings, and about the same speed for short strings.
jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]
jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]
Running the same benchmark against PerlHash, we can see that the unsafe implementation of SipHash is about the same speed for strings around length 64, and for longer strings the unsafe implementation of SipHash is significantly faster than PerlHash:
Even though this implementation of SipHash is faster than the previous one, I don't think it is fast enough to replace PerlHash entirely, as the majority of Hash keys are most probably short strings. Keep in mind, though, that the 20% difference in PerlHash's favor represents less than 100ns per invocation, whereas the 20% difference in SipHash's favor represents a much longer time.