
Reuse thread-local jump exceptions to avoid construction #8031

Closed
headius wants to merge 3 commits into jruby:master from
headius:reuse_return_jump

Conversation

headius commented Dec 5, 2023

Non-local branching requires us to raise an exception to unwind the stack past multiple frames. These exceptions should never escape a given thread, and yet we always construct a new object for them. We could improve performance of non-local flow control by using a thread-local exception object (for cases that have state) or a global singleton exception object (for cases that do not).

This PR will attempt to do that.
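As a rough illustration of the idea, here is a minimal Java sketch. The names `ReturnJump`, `BreakSignal`, and `JumpDemo` are hypothetical stand-ins and not JRuby's actual classes, and the static `ThreadLocal` is used only to keep the example short.

```java
// Hypothetical stateful jump: carries a target and a value, so it is cached per thread.
final class ReturnJump extends RuntimeException {
    // One reusable instance per thread (assumption: a static ThreadLocal, for brevity only).
    private static final ThreadLocal<ReturnJump> CACHE =
            ThreadLocal.withInitial(ReturnJump::new);

    Object target; // identifies the frame that should catch this jump
    Object value;  // the value being returned

    private ReturnJump() {
        // No message, no cause, suppression disabled, no stack trace captured.
        super(null, null, false, false);
    }

    // Reuse the current thread's instance instead of constructing a new exception.
    static ReturnJump of(Object target, Object value) {
        ReturnJump jump = CACHE.get();
        jump.target = target;
        jump.value = value;
        return jump;
    }
}

// Hypothetical stateless jump: no fields, so one global singleton is enough.
final class BreakSignal extends RuntimeException {
    static final BreakSignal INSTANCE = new BreakSignal();

    private BreakSignal() {
        super(null, null, false, false);
    }
}

final class JumpDemo {
    // Runs a nested call chain that returns non-locally to this frame.
    static Object runWithReturn(Object target, Object value) {
        try {
            deep(3, target, value);
            return null;
        } catch (ReturnJump jump) {
            if (jump.target != target) throw jump; // not ours, keep unwinding
            return jump.value;
        }
    }

    private static void deep(int depth, Object target, Object value) {
        if (depth == 0) throw ReturnJump.of(target, value);
        deep(depth - 1, target, value);
    }
}
```

Calling `JumpDemo.runWithReturn(target, value)` twice on the same thread throws the same `ReturnJump` instance both times, with only its fields rewritten.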

Relates to performance challenges like #5933

headius added this to the JRuby 9.4.6.0 milestone Dec 5, 2023

headius commented Dec 5, 2023

At least one of the failures shows a problem with this approach: there may be more than one IRReturnJump in flight at a time.

From #1980:

https://github.com/jruby/jruby/blob/6d497bd585c6ce535907205d8285f644235d07c1/spec/regression/GH-1980_multiple_nonlocal_returns_in_flight_spec.rb

In order to avoid reallocating this object, then, we'd need to have some sort of pooling mechanism, or a flag to indicate it is currently in use.
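A minimal sketch of the "in use" flag variant, under the assumption that every catch site releases the jump once it has read the value. `PooledJump`, `acquire`, and `release` are hypothetical names, not JRuby APIs.

```java
// Hypothetical one-element pool: reuse the cached jump unless it is already in flight.
final class PooledJump extends RuntimeException {
    private static final ThreadLocal<PooledJump> CACHE =
            ThreadLocal.withInitial(PooledJump::new);

    private boolean inUse; // true while this instance is being thrown and handled
    Object target;
    Object value;

    private PooledJump() {
        // No message, no cause, suppression disabled, no stack trace captured.
        super(null, null, false, false);
    }

    static PooledJump acquire(Object target, Object value) {
        PooledJump jump = CACHE.get();
        if (jump.inUse) {
            // A second jump is in flight on this thread: fall back to allocation
            // so the first jump's state is not overwritten.
            jump = new PooledJump();
        }
        jump.inUse = true;
        jump.target = target;
        jump.value = value;
        return jump;
    }

    // Called by the handler after it has read the value (assumption: always called).
    void release() {
        inUse = false;
        target = null;
        value = null;
    }
}
```

The cost of this design is the extra bookkeeping on every throw and catch, plus the risk of a missed `release()` silently disabling reuse for that thread.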

This allows multiple IRReturnJump to be in flight at once without
stepping on each other, and cleans up some direct accesses of the
jump fields.

We cannot use a static thread-local because it will root the
entire runtime (at best) and potentially root classloaders and
keep entire applications in memory after they should have been
cleaned up. Moving this to StaticScope has the following
advantages:

* Localize the return jump cache to the point of initiation, the
  scope doing the non-local return.
* Scopes not using return jumps will not use this cache and not
  create any objects.

The lack of synchronization here is intentional; if two threads
race and both create a ThreadLocal for the scope, one will win and
the other will just be collected eventually. There's no particular
need to ensure only one ThreadLocal gets created.
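The per-scope cache described in this commit message could look roughly like the sketch below. `Scope` and `ScopeJump` are hypothetical stand-ins for `StaticScope` and `IRReturnJump`, and the sketch leaves out the handling of multiple jumps in flight.

```java
// Hypothetical jump that remembers which scope initiated it.
final class ScopeJump extends RuntimeException {
    final Scope scope;
    Object value;

    ScopeJump(Scope scope) {
        // No message, no cause, suppression disabled, no stack trace captured.
        super(null, null, false, false);
        this.scope = scope;
    }
}

// Hypothetical scope holding its own lazily created thread-local jump cache.
final class Scope {
    // Intentionally neither volatile nor synchronized: if two threads race,
    // each may create a ThreadLocal, one write wins, and the other is collected.
    private ThreadLocal<ScopeJump> returnJumps;

    ScopeJump returnJump(Object value) {
        ThreadLocal<ScopeJump> local = returnJumps;
        if (local == null) {
            // Created only when this scope first performs a non-local return.
            local = ThreadLocal.withInitial(() -> new ScopeJump(this));
            returnJumps = local;
        }
        ScopeJump jump = local.get();
        jump.value = value;
        return jump;
    }

    // True once this scope has created its cache.
    boolean hasCache() {
        return returnJumps != null;
    }
}
```

Because the cache hangs off the scope instance rather than a static field, it becomes unreachable together with the scope, and scopes that never jump never allocate it.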

headius commented Dec 5, 2023

This was a good experiment but ultimately the overhead of the thread-local appears to match or exceed the overhead of just creating a new object every time. The example in #5933 is no faster, and possibly a bit slower, using the code from this PR. The real culprit in that case is the construction of a useless DynamicScope.

Moving this into a simple ThreadContext field might reduce the overhead of managing a one-element pool, but it would require passing ThreadContext into all the IRReturnJump handling methods, a larger change than I had planned and possibly again more overhead managing the pool than just creating a new object.

headius closed this Dec 5, 2023
headius deleted the reuse_return_jump branch December 5, 2023 19:30