Description
Environment Information
Provide at least:
- JRuby version: commit 6f9df83 (pom.xml says 9.4.10.0-SNAPSHOT)
- Operating system: Linux 6.1.0-26-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30) x86_64 GNU/Linux
Test Case
This code creates 10k threads, each of them executing a trivial amount of JRuby code, to simulate a heavily multi-threaded server application churning through threads. It then drops a heap dump for analysis.
Note: This simulates a use case seen in a production Cantaloupe server, which for some reason seems to create and terminate ~4 threads per minute in our case (I don't understand Jetty's automatic thread pool management there). Because it keeps running for weeks at a time, it churns through tens of thousands of threads, accumulating tens of thousands of ~100kB LocalContext objects waiting to be cleaned up in terminate(). It eventually slows to a crawl due to memory starvation, or runs out of memory entirely and crashes.
Expected Behavior
JRuby creates a LocalContext for each thread, holds it in a ThreadLocal, and cleans it up when that particular thread has terminated.
The heap dump is a few MB and its size does not scale with the number of terminated threads.
The synthetic example doesn't actually have any concurrent threads; they all are more or less sequential. Thus it should be capable of running with a very small heap, no matter how high you crank the total number of threads that will be created and destroyed (line 36).
Actual Behavior
JRuby doesn't remove the LocalContext for each thread until the ScriptingContainer itself is disposed. In a long-running server application, that would be "never". In the example, it doesn't happen before the heap dump is written, at which point the JVM is essentially terminating.
The heap dump is hundreds of MB and scales with the number of threads that have been terminated. It is completely dominated by thousands of stale local contexts held by the ScriptingContainer's ConcurrentLocalContextProvider.
Setting the number of threads to a high value causes massive heap consumption, and even the synthetic example will eventually run out of memory.
Probable Cause
ConcurrentLocalContextProvider creates a LocalContext for each thread, and only disposes of it in terminate(). I do not know the lifecycles of most objects involved, but evidently terminate() doesn't get called before JVM termination here. Because these objects are still reachable (via ConcurrentLocalContextProvider's contextRefs member), they cannot be garbage-collected.
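The retention pattern can be reproduced without JRuby. The following is a minimal model, not JRuby's actual code: `LeakModel` and `FakeContext` are hypothetical names, and the sketch assumes only what is described above, namely that each per-thread object is also registered in a shared collection that is emptied solely by terminate().

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical model of the retention pattern; not JRuby's actual code.
class LeakModel {
    // Stand-in for a per-thread LocalContext (payload size is arbitrary).
    static final class FakeContext {
        final byte[] payload = new byte[1024];
    }

    // Strong references to every context ever created, similar in role to contextRefs.
    private final Queue<FakeContext> contextRefs = new ConcurrentLinkedQueue<>();

    private final ThreadLocal<FakeContext> holder = ThreadLocal.withInitial(() -> {
        FakeContext ctx = new FakeContext();
        contextRefs.add(ctx); // registered so that terminate() can find it later
        return ctx;
    });

    FakeContext get() {
        return holder.get();
    }

    int retained() {
        return contextRefs.size();
    }

    // The only place where contexts are released in this model.
    void terminate() {
        contextRefs.clear();
    }

    // Runs short-lived threads strictly one after another and reports
    // how many contexts are still strongly reachable afterwards.
    static int churn(int threads) {
        LeakModel provider = new LeakModel();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(provider::get);
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return provider.retained();
    }

    public static void main(String[] args) {
        System.out.println("contexts retained after churn: " + churn(1000));
    }
}
```

Although no two threads are ever alive at the same time, the retained count equals the number of threads that have run, because the shared collection keeps every context reachable after its thread has died.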
Suggested Fix
Hold a Reference (a PhantomReference should do, actually) on each thread that a LocalContext is created for. When the thread is terminated and becomes unreachable, that Reference will show up in its associated ReferenceQueue. A background service / cleaner thread can then watch that ReferenceQueue and call remove() on the relevant LocalContext objects. terminate() obviously would need to get rid of that service thread, and remove() the remaining LocalContexts.
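The ReferenceQueue approach could look roughly like the sketch below. It uses only standard `java.lang.ref` classes; `ThreadReaper`, `register`, `pending`, and the `remover` callback are hypothetical names standing in for whatever ConcurrentLocalContextProvider would actually do, and the sketch assumes a context never holds a strong reference to its owning thread.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch; class and method names are not part of JRuby.
class ThreadReaper<C> {
    private final ReferenceQueue<Thread> queue = new ReferenceQueue<>();
    // Keeps each PhantomReference strongly reachable (needed for it to be enqueued)
    // and maps it to the context owned by the referenced thread.
    private final Map<Reference<? extends Thread>, C> contexts = new ConcurrentHashMap<>();
    private final Consumer<C> remover;
    private final Thread reaper;

    ThreadReaper(Consumer<C> remover) {
        this.remover = remover;
        this.reaper = new Thread(this::drainQueue, "context-reaper");
        this.reaper.setDaemon(true); // must not keep the JVM alive
        this.reaper.start();
    }

    // Assumption: the context does not strongly reference its owner thread;
    // otherwise the thread would never become unreachable.
    void register(Thread owner, C context) {
        contexts.put(new PhantomReference<>(owner, queue), context);
    }

    int pending() {
        return contexts.size();
    }

    private void drainQueue() {
        try {
            while (true) {
                // Blocks until the GC enqueues a reference to a collected thread.
                Reference<? extends Thread> dead = queue.remove();
                C context = contexts.remove(dead);
                if (context != null) {
                    remover.accept(context);
                }
            }
        } catch (InterruptedException e) {
            // terminate() was called; let the reaper thread end.
        }
    }

    // Stops the reaper thread and removes whatever contexts are left.
    void terminate() {
        reaper.interrupt();
        try {
            reaper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        for (Reference<? extends Thread> ref : contexts.keySet()) {
            C context = contexts.remove(ref);
            if (context != null) {
                remover.accept(context);
            }
        }
    }

    private static void spawnAndRegister(ThreadReaper<String> reaper, int id)
            throws InterruptedException {
        // Each thread registers its own context, as a ThreadLocal initializer would.
        Thread t = new Thread(() -> reaper.register(Thread.currentThread(), "ctx-" + id));
        t.start();
        t.join();
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger removed = new AtomicInteger();
        ThreadReaper<String> reaper = new ThreadReaper<>(ctx -> removed.incrementAndGet());
        for (int i = 0; i < 100; i++) {
            spawnAndRegister(reaper, i);
        }
        System.gc(); // only a hint; when threads are collected is up to the JVM
        Thread.sleep(500);
        System.out.println("removed by reaper: " + removed.get()
                + ", still pending: " + reaper.pending());
        reaper.terminate();
        System.out.println("removed in total: " + removed.get());
    }
}
```

How many contexts the reaper has removed before terminate() depends on when the garbage collector runs, so the first number printed by `main` will vary. The total after terminate() accounts for every registered context either way.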
Alternatively, the service thread could periodically scan the contextRefs array and remove() any contexts whose thread has died. This should clean up the context more quickly if anything is still holding a reference to the terminated thread, keeping it from being garbage collected.
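A sketch of the periodic-scan variant follows. It is an illustration under assumptions, not a patch: `LivenessScanner`, `sweep`, and the `remover` callback are hypothetical, and the sketch assumes each context can be associated with its owning Thread at creation time. Unlike the reference-based approach, it holds the Thread objects strongly and checks their state instead of their reachability.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch; names are illustrative and not part of JRuby.
class LivenessScanner<C> {
    // Strong references to the owner threads are deliberate here:
    // cleanup is driven by thread state, not by garbage collection.
    private final Map<Thread, C> contexts = new ConcurrentHashMap<>();
    private final Consumer<C> remover;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "context-scanner");
                t.setDaemon(true); // must not keep the JVM alive
                return t;
            });

    LivenessScanner(Consumer<C> remover) {
        this.remover = remover;
    }

    void register(Thread owner, C context) {
        contexts.put(owner, context);
    }

    int pending() {
        return contexts.size();
    }

    // Removes the contexts of all terminated threads; returns how many were removed.
    int sweep() {
        int removed = 0;
        for (Thread owner : contexts.keySet()) {
            if (owner.getState() == Thread.State.TERMINATED) {
                C context = contexts.remove(owner);
                if (context != null) {
                    remover.accept(context);
                    removed++;
                }
            }
        }
        return removed;
    }

    // Schedules sweep() to run repeatedly on the background thread.
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::sweep, periodMillis, periodMillis,
                TimeUnit.MILLISECONDS);
    }

    // Stops the background thread and removes whatever contexts are left.
    void terminate() {
        scheduler.shutdownNow();
        for (Thread owner : contexts.keySet()) {
            C context = contexts.remove(owner);
            if (context != null) {
                remover.accept(context);
            }
        }
    }
}
```

The trade-off is that every sweep costs time proportional to the number of registered contexts, while the ReferenceQueue variant does work only when a thread has actually been collected.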
Maybe it is also possible to piggy-back scanning onto some other operation, but I'm not sure what the performance impacts of that would be. Or what to do if that operation never happens.

