Optimize SipHash using sun.misc.Unsafe by grddev · Pull Request #1681 · jruby/jruby

grddev · 2014-05-05T09:56:40Z

As SipHash is designed to consume its input 8 bytes at a time, a key part of the algorithm is reading eight bytes at once. Unfortunately, there is no way to read eight bytes directly from a byte array without resorting to Unsafe. This patch implements the unsafe read, with a fallback to the original slow implementation for cases where Unsafe cannot be loaded (or where the native byte order is not little endian, as SipHash requires).
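
To make the two read paths concrete, here is a minimal sketch under stated assumptions; it is not the code from this patch. The class and method names (`LittleEndianRead`, `readSafe`, `readLong`) are hypothetical, and the lookup through the private `theUnsafe` field is the commonly used reflection idiom rather than a supported API. The fast path is only enabled on a little-endian host and only if a probe read succeeds.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;

final class LittleEndianRead {
    // Null when Unsafe cannot be used; readLong then takes the slow path.
    private static final sun.misc.Unsafe UNSAFE;
    private static final long BYTE_ARRAY_BASE;

    static {
        sun.misc.Unsafe unsafe = null;
        long base = 0;
        try {
            if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
                Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
                field.setAccessible(true);
                unsafe = (sun.misc.Unsafe) field.get(null);
                base = unsafe.arrayBaseOffset(byte[].class);
                unsafe.getLong(new byte[8], base); // probe read; any failure disables the fast path
            }
        } catch (Throwable t) {
            unsafe = null;
        }
        UNSAFE = unsafe;
        BYTE_ARRAY_BASE = base;
    }

    // Slow path: assemble the little-endian word from eight single-byte reads.
    static long readSafe(byte[] data, int off) {
        return (data[off] & 0xffL)
             | (data[off + 1] & 0xffL) << 8
             | (data[off + 2] & 0xffL) << 16
             | (data[off + 3] & 0xffL) << 24
             | (data[off + 4] & 0xffL) << 32
             | (data[off + 5] & 0xffL) << 40
             | (data[off + 6] & 0xffL) << 48
             | (data[off + 7] & 0xffL) << 56;
    }

    // Fast path when available: one 8-byte load, bounds checked by hand because Unsafe does not check.
    static long readLong(byte[] data, int off) {
        if (UNSAFE == null) {
            return readSafe(data, off);
        }
        if (off < 0 || off > data.length - 8) {
            throw new ArrayIndexOutOfBoundsException(off);
        }
        return UNSAFE.getLong(data, BYTE_ARRAY_BASE + off);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(Long.toHexString(readLong(data, 0)));
        System.out.println(readLong(data, 1) == readSafe(data, 1));
    }
}
```

Both paths must return the same value for the same input, which is why the fast path is skipped entirely on a big-endian host in this sketch.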

The following benchmark hashes strings of increasing length. It shows that the unsafe implementation is roughly 25% faster for very long strings and about the same speed for short strings.

require 'benchmark'

N = 30_000_000

puts RUBY_DESCRIPTION
Benchmark.bmbm do |x|
  [0,1,8,16,32,64,1024,65536].each do |len|
    string = 'x'*len
    n = N/(0.25*len+10) + N/20_000
    x.report(len.to_s) { n.to_i.times { string.hash } }
  end
end

jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.653000)
1       0.630000   0.000000   0.630000 (  0.626000)
8       0.600000   0.010000   0.610000 (  0.599000)
16      0.590000   0.000000   0.590000 (  0.584000)
32      0.570000   0.000000   0.570000 (  0.573000)
64      0.550000   0.000000   0.550000 (  0.541000)
1024    0.470000   0.000000   0.470000 (  0.471000)
65536   0.780000   0.000000   0.780000 (  0.783000)

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.643000)
1       0.660000   0.000000   0.660000 (  0.654000)
8       0.610000   0.010000   0.620000 (  0.608000)
16      0.570000   0.000000   0.570000 (  0.559000)
32      0.530000   0.000000   0.530000 (  0.527000)
64      0.440000   0.000000   0.440000 (  0.442000)
1024    0.350000   0.000000   0.350000 (  0.342000)
65536   0.580000   0.000000   0.580000 (  0.583000)

Running the same benchmark against PerlHash, we can see that the unsafe implementation of SipHash is about the same speed for strings around length 64, and for longer strings the unsafe implementation of SipHash is significantly faster than PerlHash:

            user     system      total        real
0       0.460000   0.000000   0.460000 (  0.452000)
1       0.420000   0.000000   0.420000 (  0.416000)
8       0.440000   0.000000   0.440000 (  0.444000)
16      0.430000   0.010000   0.440000 (  0.422000)
32      0.410000   0.000000   0.410000 (  0.414000)
64      0.470000   0.000000   0.470000 (  0.462000)
1024    0.420000   0.000000   0.420000 (  0.425000)
65536   0.750000   0.000000   0.750000 (  0.750000)

Even though this implementation of SipHash is faster than the previous one, I don't think it is fast enough to replace PerlHash entirely, as most Hash keys are probably short strings. Keep in mind, though, that the 20% difference in PerlHash's favor for short strings represents less than 100ns per invocation, whereas the 20% difference in SipHash's favor for long strings represents a much longer time per invocation.
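
The per-invocation figures can be estimated from the tables above by dividing the difference in the "real" column by the number of iterations the benchmark runs for that length. The sketch below does that arithmetic; `PerCallCost` and its method names are made up for illustration, and the result includes whatever loop overhead the benchmark itself adds.

```java
final class PerCallCost {
    static final long N = 30_000_000L;

    // Mirrors the benchmark's n = N/(0.25*len+10) + N/20_000, truncated like Ruby's to_i.
    static long iterations(int len) {
        return (long) (N / (0.25 * len + 10) + N / 20_000);
    }

    // Wall-clock difference spread over the iterations, in nanoseconds per call.
    static double nanosPerCall(double slowerSeconds, double fasterSeconds, int len) {
        return (slowerSeconds - fasterSeconds) * 1e9 / iterations(len);
    }

    public static void main(String[] args) {
        // "real" column: unsafe SipHash vs PerlHash at length 1, PerlHash vs unsafe SipHash at 65536.
        System.out.printf("len 1:     %.0f ns per call in PerlHash's favor%n",
                nanosPerCall(0.654, 0.416, 1));
        System.out.printf("len 65536: %.0f ns per call in SipHash's favor%n",
                nanosPerCall(0.750, 0.583, 65536));
    }
}
```

With the numbers reported here this comes to roughly 81 ns per call at length 1 and roughly 50 µs per call at length 65536.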

As these loops will be executed only once for every #hash invocation, it would make sense to defer the decision to unroll the loops to the runtime.
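
As an illustration of the trade-off, the sketch below packs the 0 to 7 trailing bytes and the length byte into the final SipHash word in two ways: a plain loop, and a manually unrolled switch in the style of the reference C implementation. The class and method names are hypothetical, and this is not necessarily how the patch structures its loops.

```java
final class SipTail {
    // Loop form: leaves any unrolling decision to the JIT.
    static long lastWordLoop(byte[] data, int offset, int length) {
        int tail = length & 7;
        int start = offset + length - tail;
        long m = (long) length << 56; // length byte goes into the top 8 bits
        for (int i = 0; i < tail; i++) {
            m |= (data[start + i] & 0xffL) << (8 * i);
        }
        return m;
    }

    // Manually unrolled form using switch fall-through.
    static long lastWordUnrolled(byte[] data, int offset, int length) {
        int tail = length & 7;
        int start = offset + length - tail;
        long m = (long) length << 56;
        switch (tail) {
            case 7: m |= (data[start + 6] & 0xffL) << 48; // fall through
            case 6: m |= (data[start + 5] & 0xffL) << 40; // fall through
            case 5: m |= (data[start + 4] & 0xffL) << 32; // fall through
            case 4: m |= (data[start + 3] & 0xffL) << 24; // fall through
            case 3: m |= (data[start + 2] & 0xffL) << 16; // fall through
            case 2: m |= (data[start + 1] & 0xffL) << 8;  // fall through
            case 1: m |= (data[start] & 0xffL);
            default: break;
        }
        return m;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(lastWordLoop(data, 0, 3) == lastWordUnrolled(data, 0, 3));
    }
}
```

Since this code runs once per hash call rather than once per 8-byte block, the shorter loop form costs little even if the JIT chooses not to unroll it.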

In principle, this should allow the JIT compiler to remove all range checks within the loop. I haven't had time to verify this though.
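
A minimal sketch of what hoisting the range check can look like, assuming a simplified stand-in for the compression loop (`xorWords` is a hypothetical name and does not compute SipHash). The whole range is validated once before the loop, so every index used inside the loop is provably within the array; whether a given JIT actually drops the per-access checks is not verified here either.

```java
final class HoistedCheck {
    // Hypothetical stand-in for the hot loop: XORs every full 8-byte little-endian word in the range.
    static long xorWords(byte[] data, int offset, int length) {
        // Single up-front validation of the whole range.
        if (offset < 0 || length < 0 || offset > data.length - length) {
            throw new ArrayIndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        long acc = 0;
        int end = offset + (length & ~7); // first index past the last full word
        for (int i = offset; i < end; i += 8) {
            acc ^= (data[i] & 0xffL)
                 | (data[i + 1] & 0xffL) << 8
                 | (data[i + 2] & 0xffL) << 16
                 | (data[i + 3] & 0xffL) << 24
                 | (data[i + 4] & 0xffL) << 32
                 | (data[i + 5] & 0xffL) << 40
                 | (data[i + 6] & 0xffL) << 48
                 | (data[i + 7] & 0xffL) << 56;
        }
        return acc;
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        data[0] = 1;
        data[8] = 2;
        System.out.println(xorWords(data, 0, 16));
    }
}
```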

While one could wish that JIT compilation optimised the eight sequential byte reads into a single long read, it in fact does not. This implementation should fall back to the slow implementation when Unsafe fails to load, but I haven't figured out how to test that properly.

dkarpenko · 2014-06-05T15:06:10Z

core/src/main/java/org/jruby/util/SipHashInline.java

You can just assign byte value to variable of type long:

byte byteValue = 127; long longValue = byteValue;

grddev · 2014-06-08

@dkarpenko, while I did not write this code (you can see the same code in the original), I think the reason for the long type conversion is that the shifts should be applied to longs, as they would otherwise produce ints, if I'm not mistaken.

headius · 2014-11-02

@grddev I think you're right about that. Bit twiddling can be a little tweaky with Java's automatic type promotions.
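
The promotion rules in question can be shown with a small sketch (class and method names are made up for illustration). A byte shifted without a cast is promoted to int, so the shift distance is taken modulo 32; a cast to long fixes the width but still sign-extends negative bytes, which is why masking with 0xffL is the usual form when bytes are ORed together.

```java
final class ShiftPromotion {
    // byte is promoted to int: the shift happens in 32 bits and the distance is masked to 5 bits.
    static long intShift(byte b, int distance) {
        return b << distance;
    }

    // Cast first: a 64-bit shift, but a negative byte is sign-extended before shifting.
    static long castShift(byte b, int distance) {
        return (long) b << distance;
    }

    // Mask to an unsigned 0..255 long value first, then shift.
    static long maskedShift(byte b, int distance) {
        return (b & 0xffL) << distance;
    }

    public static void main(String[] args) {
        byte b = (byte) 0x81;
        System.out.println(Long.toHexString(intShift(b, 40)));
        System.out.println(Long.toHexString(castShift(b, 40)));
        System.out.println(Long.toHexString(maskedShift(b, 40)));
    }
}
```

For a non-negative byte such as 127, the cast and masked forms give the same result; they only differ once the high bit of the byte is set.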

headius · 2014-06-08T09:14:39Z

Hey, great find. If I remember right, I did a quick experiment with Unsafe when we put siphash into the codebase but never actually landed it. Your patches look good; we'll review and get it installed. I'm not sure we'll want to make SipHash the default, since as you say PerlHash is still faster for small strings (and I wouldn't expect people to be hashing really big strings), but it will be excellent to get SipHash closer to raw native performance. Thank you!

nahi · 2014-06-08T13:34:43Z

Here's my attempt from 2012. I added some uncommitted sources (Murmur, Perl, etc.) so that Benchmark.java should run.
https://github.com/nahi/siphash-java-inline/tree/master/perf

I don't remember the details, but the Unsafe version was slower. Mine calls Long.reverseBytes(), so the optimization may not have been enough.
https://github.com/nahi/siphash-java-inline/blob/master/perf/SipHashInlineTry.java#L18-22

grddev · 2014-06-09T08:13:06Z

@nahi, I tried the benchmark with an added Long.reverseBytes() in the reader, and here are the results:

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.680000   0.000000   0.680000 (  0.673000)
1       0.690000   0.000000   0.690000 (  0.687000)
8       0.660000   0.000000   0.660000 (  0.651000)
16      0.620000   0.010000   0.630000 (  0.625000)
32      0.590000   0.000000   0.590000 (  0.582000)
1024    0.470000   0.000000   0.470000 (  0.478000)
65536   0.810000   0.000000   0.810000 (  0.804000)

This is indeed not faster than the 'safe' version, so the unsafe version is only relevant when the native byte order matches the one SipHash needs.
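
The extra step measured here can be sketched as follows. `ByteBuffer` stands in for the raw 8-byte load so that the example behaves the same on any host; `SwapDemo` and its method names are hypothetical. On a host whose native order is big endian, the raw load yields the bytes in the wrong order for SipHash, and `Long.reverseBytes()` is needed to recover the little-endian word.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SwapDemo {
    // The word SipHash expects: first byte in the least significant position.
    static long littleEndian(byte[] data, int off) {
        return ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN).getLong(off);
    }

    // What a raw 8-byte load would return on a big-endian host.
    static long bigEndian(byte[] data, int off) {
        return ByteBuffer.wrap(data).order(ByteOrder.BIG_ENDIAN).getLong(off);
    }

    // Big-endian load followed by a byte swap: same value as the little-endian read, one more operation.
    static long swapped(byte[] data, int off) {
        return Long.reverseBytes(bigEndian(data, off));
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        System.out.println(swapped(data, 0) == littleEndian(data, 0));
    }
}
```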

headius · 2014-11-02T20:54:02Z

Going for it. Thanks!

Optimize SipHash using sun.misc.Unsafe

grddev added 3 commits May 5, 2014 09:00

Avoid manual unroll of non-hot SipHash loops

23570f5

As these loops will be executed only once for every #hash invocation, it would make sense to defer the decision to unroll the loops to the runtime.

Hoist SipHashInline range checks

43b2ef4

In principle, this should allow the JIT compiler to remove all range checks within the loop. I haven't had time to verify this though.

dkarpenko reviewed Jun 5, 2014

headius added a commit that referenced this pull request Nov 2, 2014

Merge pull request #1681 from grddev/unsafe-siphash-opt

cb4581c

Optimize SipHash using sun.misc.Unsafe

headius merged commit cb4581c into jruby:jruby-1_7 Nov 2, 2014

headius added this to the JRuby 1.7.17 milestone Nov 2, 2014

headius added core performance labels Nov 2, 2014

headius self-assigned this Nov 2, 2014
