MINOR: Dry up RubyArray and improve performance in a few spots#4751
MINOR: Dry up RubyArray and improve performance in a few spots#4751kares merged 1 commit intojruby:masterfrom
Conversation
| return new RubyArray(runtime, values); | ||
| } | ||
|
|
||
| public static RubyArray newArray(Ruby runtime, List<? extends IRubyObject> list) { |
There was a problem hiding this comment.
still'd like to have the List version around for binary compatibility - it also does not need toArray for packed
There was a problem hiding this comment.
brought it back here https://github.com/jruby/jruby/pull/4751/files#diff-008115c4ee37b136355736c9aa675fa3R264 with nicely separate packed and not packed paths :)
| IRubyObject hash = TypeConverter.checkHashType(context.runtime, args[args.length - 1]); | ||
| if (!hash.isNil()) { | ||
| IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash, new String[] { "random" }); | ||
| IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash, |
There was a problem hiding this comment.
let's keep it consistently a one liner maybe :) ?
| } | ||
|
|
||
| private static final int optimizedCmp(ThreadContext context, IRubyObject a, IRubyObject b, int token, CachingCallSite op_cmp, CallSite op_gt, CallSite op_lt) { | ||
| if (token == ((RubyBasicObject) a).getMetaClass().generation) { |
There was a problem hiding this comment.
this cast is all over the place - it used to help inlining ... without the JIT wasn't just smart-enough to get it "fast"
There was a problem hiding this comment.
@kares ah I see, phew I'm not sure about this point, but in Java 8 this should not help anything anymore right. I'd rather expect this to potentially be harmful because it increases the size of the method and the type profiler will figure out that this is always a RubyBasicObjecteventually right? (I could investigate this a little with JitWatch I guess :))
There was a problem hiding this comment.
kk, but Java 7 still supported.
so it might be valuable to measure using that as well, although Java 8 first I guess
(9K is already considerably slower on 7)
There was a problem hiding this comment.
@kares wow I stand 100% corrected here, check this out:
Tried to use a slightly deeper (in the sense of inheritance) case and benchmarked this:
@Benchmark
@OperationsPerInvocation(EVENTS_PER_INVOCATION)
public final void castPerformanceImpactWithout(final Blackhole blackhole) throws Exception {
for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
blackhole.consume(random().isTrue());
}
}
@Benchmark
@OperationsPerInvocation(EVENTS_PER_INVOCATION)
public final void castPerformanceImpactWith(final Blackhole blackhole) throws Exception {
for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
blackhole.consume(((RubyBoolean) random()).isTrue());
}
}Results are:
# Run complete. Total time: 00:01:03
Benchmark Mode Cnt Score Error Units
CastExperimentBenchmark.castPerformanceImpactWith thrpt 20 92950.858 ± 1169.586 ops/ms
CastExperimentBenchmark.castPerformanceImpactWithout thrpt 20 69318.454 ± 2354.502 ops/msdidn't expect this at all (same effect for JDK 7 and 8 btw.) => will revert + thanks for the very educational moment :)
There was a problem hiding this comment.
private static IRubyObject random() {
return RubyUtil.RUBY.newBoolean(RANDOM.nextBoolean());
} private static final Random RANDOM = new Random();was my object generator :)
There was a problem hiding this comment.
The overhead in these cases seems to be the keeping of the type profile itself, inlining is not affected as such and method size is even a byte larger from the cast as I expected:
5189 2084 4 org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWith (28 bytes)
@ 9 org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes) inline (hot)
@ 6 java.util.Random::nextBoolean (14 bytes) inline (hot)
@ 2 java.util.Random::next (47 bytes) inline (hot)
@ 8 java.util.concurrent.atomic.AtomicLong::get (5 bytes) inline (hot)
@ 32 java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes) inline (hot)
@ 9 sun.misc.Unsafe::compareAndSwapLong (0 bytes) (intrinsic)
@ 9 org.jruby.Ruby::newBoolean (16 bytes) inline (hot)
@ 15 org.jruby.RubyBasicObject::isTrue (17 bytes) inline (hot)
@ 18 org.openjdk.jmh.infra.Blackhole::consume (28 bytes) disallowed by CompilerOracle
2917 2053 4 org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWithout (27 bytes)
@ 9 org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes) inline (hot)
@ 6 java.util.Random::nextBoolean (14 bytes) inline (hot)
@ 2 java.util.Random::next (47 bytes) inline (hot)
@ 8 java.util.concurrent.atomic.AtomicLong::get (5 bytes) inline (hot)
@ 32 java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes) inline (hot)
@ 9 sun.misc.Unsafe::compareAndSwapLong (0 bytes) (intrinsic)
@ 9 org.jruby.Ruby::newBoolean (16 bytes) inline (hot)
@ 12 org.jruby.RubyBasicObject::isTrue (17 bytes) inline (hot)
@ 12 org.jruby.RubyBasicObject::isTrue (17 bytes) inline (hot)
\-> TypeProfile (140240/280503 counts) = org/jruby/RubyBoolean$False
\-> TypeProfile (140263/280503 counts) = org/jruby/RubyBoolean$True
@ 17 org.openjdk.jmh.infra.Blackhole::consume (28 bytes) disallowed by CompilerOracleThere was a problem hiding this comment.
thanks, and yes it surprising (and IDE gives warnings all over) also did not know about that.
... but I recall @headius lecture on some of the issues along the way of making JRuby.
think he also raised the issue to the HotSpot team - maybe for 9 thay might do smt about it.
or maybe they won't and eventually just push GraalVM's optimizations for all as Java 10 :)
03cd174 to
51cd6a2
Compare
|
@kares thanks for the swift review, addressed all comments I think/hope :) |
51cd6a2 to
5152772
Compare
|
@kares reverted the removed cast and added some benchmarks here #4751 (comment) (just in case GH hides it now that I pushed over the change, kind of interesting imo :)) |
|
no one complained in a week or so, thus let's ship it ... |
Some cleanups I came across debugging
RubyArrayperformance recently :)RubyArrayUSE_PACKED_ARRAYS == truepaths to a singleifbranch to be more JIT friendly hereisPackedArraywill be nicely dead code eliminated now for theCollectioncase and we save an expensivesizecall here and there (+ bytecode becomes smaller :))staticthat had no instance reference anywherelist.toArray(new IRubyObject[list.size()]))with the faster versioncollection.toArray(IRubyObject.NULL_ARRAY)Just a few suggestions, let me know what you think :)