Refactor Java invoker cache logic to allow encapsulation, sync.#3395
Refactor Java invoker cache logic to allow encapsulation, sync.#3395
Conversation
The move to IntHashMap in RubyToJavaInvoker improved throughput for multiple-arity calls, but since IntHashMap is not synchronized this could cause corrupted data under concurrent load as seen in CallableSelector to encapsulate the cache and synchronize access to it. I believe this is a minimum effort to make the cache access safe and fix #3394. I have not measured the impact of this synchronization in a high concurrency setting. It may indeed turn out to be more overhead and a worse bottleneck than using ConcurrentHashMap with Integer objects. Unfortunately there's no middle ground here unless we build or adopt some concurrency-friendly int-based map. Another heavy option would be to maintain two references to IntHashMap in RubyToJavaInvoker, copying from a synchronized writable copy to a read-only copy after every write. This would double the size of the cache structures in memory but would provide a fast, unsynchronized cache for read-mostly signature lookups. This pattern is also easier to support with the refactoring in this commit.
|
FWIW here's the diff that would implement a copy-on-write strategy here to avoid synchronization on reads. I suspect most of these caches are reasonably small, but I have done no measurement of either the performance impact of synchronization nor the memory impact (and cold cache startup impact) of copy-on-write. diff --git a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
index 222636f..29ac1a4 100644
--- a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
+++ b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
@@ -35,7 +35,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
// in case multiple callables (overloaded Java method - same name different args)
// for the invoker exists CallableSelector caches resolution based on args here
- final IntHashMap<T> cache;
+ final IntHashMap<T> writeCache;
+ volatile IntHashMap<T> readCache;
private final Ruby runtime;
@@ -54,7 +55,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
minVarArgsArity = getMemberArity(member) - 1;
}
- cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ readCache = writeCache;
this.javaCallable = callable;
this.javaCallables = null;
@@ -84,7 +86,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
callables = null;
- cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
+ readCache = writeCache;
}
else {
callable = null; maxArity = -1; minArity = Integer.MAX_VALUE;
@@ -139,7 +142,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
varargsCallables = varArgs.toArray( createCallableArray(varArgs.size()) );
}
- cache = newCallableCache();
+ writeCache = newCallableCache();
+ readCache = (IntHashMap<T>)writeCache.clone();
}
this.javaCallable = callable;
@@ -192,14 +196,13 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
public T getSignature(int signatureCode) {
- synchronized (cache) {
- return cache.get(signatureCode);
- }
+ return readCache.get(signatureCode);
}
public void putSignature(int signatureCode, T callable) {
- synchronized (cache) {
- cache.put(signatureCode, callable);
+ synchronized (writeCache) {
+ writeCache.put(signatureCode, callable);
+ readCache = (IntHashMap<T>)writeCache.clone();
}
}
diff --git a/core/src/main/java/org/jruby/util/collections/IntHashMap.java b/core/src/main/java/org/jruby/util/collections/IntHashMap.java
index fb9a16c..1a09297 100644
--- a/core/src/main/java/org/jruby/util/collections/IntHashMap.java
+++ b/core/src/main/java/org/jruby/util/collections/IntHashMap.java
@@ -397,6 +397,19 @@ public class IntHashMap<V> {
}
}
+ @Override
+ public Object clone() {
+ IntHashMap<V> newMap = new IntHashMap<>(table.length, loadFactor);
+ for (int i = 0; i < table.length; i++) {
+ Entry<V> entry = table[i];
+ while (entry != null) {
+ newMap.put(entry.getKey(), entry.getValue());
+ entry = entry.next;
+ }
+ }
+ return newMap;
+ }
+
@SuppressWarnings("unchecked")
public static <U> IntHashMap<U> nullMap() { return NullMap.INSTANCE; }
|
|
👍 looks good. unless the user constantly comes up with a different parameters signature (very unlikely) the write cache won't be used, so it should be good. |
|
For the record, I tried using a read/write lock instead of copy-on-write, and the results indicate that COW is still probably a better option. Based on COW of IntHashMap: Single IntHashMap with read/write lock: So the read/write lock is slightly slower in the uncontended case but considerably slower in the contended case. I think we'll go with the COW version for now. The patch, for future reference: diff --git a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
index 29ac1a4..b80af7a 100644
--- a/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
+++ b/core/src/main/java/org/jruby/java/invokers/RubyToJavaInvoker.java
@@ -6,6 +6,9 @@ import java.lang.reflect.Member;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReadWriteLock;
+import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.jruby.Ruby;
import org.jruby.RubyModule;
@@ -35,8 +38,8 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
// in case multiple callables (overloaded Java method - same name different args)
// for the invoker exists CallableSelector caches resolution based on args here
- final IntHashMap<T> writeCache;
- volatile IntHashMap<T> readCache;
+ final IntHashMap<T> cache;
+ final ReadWriteLock lock = new ReentrantReadWriteLock();
private final Ruby runtime;
@@ -55,8 +58,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
minVarArgsArity = getMemberArity(member) - 1;
}
- writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
- readCache = writeCache;
+ cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
this.javaCallable = callable;
this.javaCallables = null;
@@ -86,8 +88,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
callables = null;
- writeCache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
- readCache = writeCache;
+ cache = NULL_CACHE; // if there's a single callable - matching (and thus the cache) won't be used
}
else {
callable = null; maxArity = -1; minArity = Integer.MAX_VALUE;
@@ -142,8 +143,7 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
varargsCallables = varArgs.toArray( createCallableArray(varArgs.size()) );
}
- writeCache = newCallableCache();
- readCache = (IntHashMap<T>)writeCache.clone();
+ cache = newCallableCache();
}
this.javaCallable = callable;
@@ -196,13 +196,20 @@ public abstract class RubyToJavaInvoker<T extends JavaCallable> extends JavaMeth
}
public T getSignature(int signatureCode) {
- return readCache.get(signatureCode);
+ lock.readLock().lock();
+ try {
+ return cache.get(signatureCode);
+ } finally {
+ lock.readLock().unlock();
+ }
}
public void putSignature(int signatureCode, T callable) {
- synchronized (writeCache) {
- writeCache.put(signatureCode, callable);
- readCache = (IntHashMap<T>)writeCache.clone();
+ lock.writeLock().lock();
+ try {
+ cache.put(signatureCode, callable);
+ } finally {
+ lock.writeLock().unlock();
}
}
|
|
Also FYI, the numbers for COW IntHashMap versus unsynchronized IntHashMap are roughly identical, as expected. I did not do a comparison with ConcurrentHashMap. Another note: CHM is big. Having two copies of the comparatively tiny IntHashMap probably isn't anywhere near as large in memory. |
Refactor Java invoker cache logic to allow encapsulation, sync.
|
very nice - thank you for attempting and sharing this, |
The move to IntHashMap in RubyToJavaInvoker improved throughput
for multiple-arity calls, but since IntHashMap is not synchronized
this could cause corrupted data under concurrent load as seen in
CallableSelector to encapsulate the cache and synchronize access
to it. I believe this is a minimum effort to make the cache access
safe and fix #3394.
I have not measured the impact of this synchronization in a high
concurrency setting. It may indeed turn out to be more overhead
and a worse bottleneck than using ConcurrentHashMap with Integer
objects. Unfortunately there's no middle ground here unless we
build or adopt some concurrency-friendly int-based map.
Another heavy option would be to maintain two references to
IntHashMap in RubyToJavaInvoker, copying from a synchronized
writable copy to a read-only copy after every write. This would
double the size of the cache structures in memory but would provide
a fast, unsynchronized cache for read-mostly signature lookups.
This pattern is also easier to support with the refactoring in
this commit.