Fix memory/disk usage leak when creating many ScriptingContaiers#4747
Fix memory/disk usage leak when creating many ScriptingContaiers#4747jpinsonault wants to merge 3 commits intojruby:masterfrom
Conversation
| return ""; | ||
| } | ||
|
|
||
| public List<URL> getTempUrls(){ |
There was a problem hiding this comment.
would be great to have the naming a bit consistent with existing API e.g. addURL(URL)
... getCachedURLs() or maybe indicate that there's temp file caching (that those URLs are actually local files)
in which case we should rather return paths than URLs, makes sense (getTempPaths or getCachedPaths)?
There was a problem hiding this comment.
Yeah, that makes sense. What about cachedJarPaths and getCachedJarPaths? Or is that too verbose?
There was a problem hiding this comment.
looks much better than the current naming ... go for whatever you feel is appropriate.
getter sounds fine to me esp. if its simply returning a field
Since these are always paths to files, cache the paths instead of URLs Also rename tempUrls to cachedJarPaths
|
I pushed a commit changing the variable name to |
|
looking good, is it still a WiP? |
| in.close(); | ||
| url = f.toURI().toURL(); | ||
|
|
||
| cachedJarPaths.add(url.getPath()); |
There was a problem hiding this comment.
not sure how much its relevant but URL had issues - hopefully not for file: protocol but it might be worth a check
... why org.jruby.util.URLUtil.getPath exists and whether it should be used here
There was a problem hiding this comment.
I looked into the issue, and from what I was able to understand, people recommend using Paths/Path objects if you're on java 7 or higher. Does jruby need to support java 6?
I also saw recommendations to use url.toURI().getSchemeSpecificPart() to get the String path, which is what URLUtil.getPath does, so it might be easier, and safe, to just use that utility function here
There was a problem hiding this comment.
OK, feel free to update to using URLUtil than. also JRuby 9 only supported Java >= 7
|
If it's looking good to you then I guess it's not a WIP. According to https://docs.oracle.com/javase/7/docs/api/java/io/File.html#delete(), |
|
PR looking good, but I do not understand why its only tied to |
|
@jpinsonault @kares This is probably something that would be good to put in 9.1.14, right? Can we land it yet? |
|
cc @enebo |
|
@headius @jpinsonault @kares if this is good on a single provider and solid I say it is better than nothing. Expecting all providers to be done would be great but this will make .14 better... SHIP IT! :) @jpinsonault If this can be done in the other providers are you up for adding it to those too? |
|
I will merge this and extend it to the other providers. |
|
I believe it makes most sense to move this into JRubyClassLoader's close/teardown logic, which is already called by ScriptingContainer.terminate. I'll do that and get everything merged. |
This PR is to resolve #3928, File descriptors from stdlib jars are leaked when ScriptingContainers are terminated
It's very much a work in progress, and I was hoping to get some feedback on
This biggest problem I currently see with it is that I was unable to find a way to cleanup the jars and references without making the
JarResourceclass public. Since I'm not very experienced with this codebase, I'm especially interested in your opinions on how to clean up the jars in a cleaner way.The basic strategy here is:
JRubyClassLoader: When jars are extracted to a temporary folder, keep track of their pathsSingleThreadLocalContextProviderbased context provider goes through each of the temp jars, and cleans up thejarCachereferences and deletes the jars from diskIn the case of a single threaded container, this should be safe. My understanding is that each single threaded container has its own class loader and context provider. Additionally, each time a container causes the class loader to extract a stdlib jar into a temp directory, it creates a uniquely named file for that class loader. So, removing these references will not affect any other scripting containers.
Since our use case in #3928 primarily involves using the single threaded local context scope, I've focused on that. But if this work should involve improvements to the other context providers we'd be happy to implement that as well.
So far I've been testing this manually by just creating scripting containers in a loop, giving them a script that requires a few stdlib jars, and verifying that the jars are created and destroyed on disk as expected. I've also been using yourkit to verify that the jar cache doesn't leak references