Improve concurrency of $~ adjacent methods#6647
Merged
headius merged 5 commits intojruby:masterfrom Apr 1, 2021
Merged
Conversation
Previously these methods used the matchdata from the frame backref which could lead to races across threads as they all try to read and write this value. Instead, this patch moves all "out of band" passing of match data to a context-local field, only setting the frame backref if requested at the end of the operation. This will further improve the concurrency of $~-adjacent methods by avoiding the use of that shared frame global within the core method logic. The $~ frame global is still a concurrency issue across threads, but this will help ensure it is only an issue for user code that chooses to consume this value.
Modify setBackRef to want a MatchData so it can always set it as in use. Clearing to nil should now be through clearBackRef. All call paths updated to the new regime. The justification here is that once a MatchData has been injected into a frame, it is now visible to other threads. Whether it will actually be read or not, we must say goodbye to that object since it could be in use at any time on another thread that shares the same frame.
This changes all places we read the frame-local backref to use the thread-local match instead. None of these places used the backref for its content; they only used it to reuse the object, which is now handled more correctly by the thread-local match and the setBackRef method that marks matches as in-use. This should eliminate all internal data races against the $~ variable, since no core methods depend upon its value. This does not make $~ thread-safe but it pushes the issue to the edge, where the user will have to decide to use it knowing it may be updated across threads. See jruby#4868 for the original issue that this partially fixes, and see jruby#6640 for an issue caused by these internal data races.
7aa72dd to
e787c6c
Compare
27341a6 to
f879bf4
Compare
I audited all methods that read and write backref or lastline and made sure they reported appropriately through their annotations (or other means).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR seeks to improve the concurrency characteristics of methods that read or write the "frame global" backref variable
$~.As seen in #6640 the concurrency issues surrounding backref (#4868) can in some cases lead to internal race conditions we must avoid. In that issue, the frame backref was used during the execution of the main
splitloop, leading to two threads stepping on each other.The work here will move all internal usage of backref (as an "out of band" data-passing mechanism) to a context-local (essentially thread-local) field, only propagating the resulting match to the frame backref at the end of the method as appropriate. This will essentially eliminate data races surrounding
$~within our own code. The remaining race will be in user code that opts into consuming this variable.