Closed
Description
JRuby 1.7.24 and 9.0.5.0 both have a pretty serious regression when matching against UTF-8 strings from multiple threads at the same time. The bug does not appear to be present in 1.7.23 or 9.0.4.0.
If multiple threads are running code like this, with completely unshared string and regex objects:
str = "foobar"
str.force_encoding("UTF-8")
str.gsub(/foo/i, '')
eventually one of the threads will throw an error like this:
Exception in thread "Ruby-0-Thread-6: ./recreate_utf8_bug.rb:3" java.lang.ArrayIndexOutOfBoundsException: 6
at org.jcodings.specific.BaseUTF8Encoding.mbcCaseFold(BaseUTF8Encoding.java:167)
at org.jcodings.specific.UTF8Encoding.mbcCaseFold(UTF8Encoding.java:24)
at org.joni.SearchAlgorithm$SLOW_IC.lowerCaseMatch(SearchAlgorithm.java:238)
at org.joni.SearchAlgorithm$SLOW_IC.search(SearchAlgorithm.java:206)
at org.joni.Matcher.forwardSearchRange(Matcher.java:140)
at org.joni.Matcher.searchInterruptible(Matcher.java:451)
at org.jruby.RubyRegexp$SearchMatchTask.run(RubyRegexp.java:273)
at org.jruby.RubyThread.executeBlockingTask(RubyThread.java:1066)
at org.jruby.RubyRegexp.matcherSearch(RubyRegexp.java:235)
at org.jruby.RubyString.gsubCommon19(RubyString.java:3123)
at org.jruby.RubyString.gsubCommon19(RubyString.java:3106)
at org.jruby.RubyString.gsub19(RubyString.java:3101)
at org.jruby.RubyString.gsub19(RubyString.java:3069)
at org.jruby.RubyString$INVOKER$i$gsub19.call(RubyString$INVOKER$i$gsub19.gen)
at org.jruby.internal.runtime.methods.JavaMethod$JavaMethodOneOrTwoOrNBlock.call(JavaMethod.java:367)
at org.jruby.runtime.callsite.CachingCallSite.call(CachingCallSite.java:202)
at $_dot_.recreate_utf8_bug.block_2$RUBY$__file__(./recreate_utf8_bug.rb:13)
at $_dot_$recreate_utf8_bug$block_2$RUBY$__file__.call($_dot_$recreate_utf8_bug$block_2$RUBY$__file__)
at org.jruby.runtime.CompiledBlock19.yieldSpecificInternal(CompiledBlock19.java:117)
at org.jruby.runtime.CompiledBlock19.yieldSpecific(CompiledBlock19.java:92)
at org.jruby.runtime.Block.yieldSpecific(Block.java:111)
at org.jruby.RubyFixnum.times(RubyFixnum.java:275)
at org.jruby.RubyFixnum$INVOKER$i$0$0$times.call(RubyFixnum$INVOKER$i$0$0$times.gen)
at org.jruby.runtime.callsite.CachingCallSite.callBlock(CachingCallSite.java:143)
at org.jruby.runtime.callsite.CachingCallSite.callIter(CachingCallSite.java:154)
at $_dot_.recreate_utf8_bug.chained_0_rescue_1$RUBY$SYNTHETIC__file__(./recreate_utf8_bug.rb:10)
at $_dot_.recreate_utf8_bug.block_1$RUBY$__file__(./recreate_utf8_bug.rb:9)
at $_dot_$recreate_utf8_bug$block_1$RUBY$__file__.call($_dot_$recreate_utf8_bug$block_1$RUBY$__file__)
at org.jruby.runtime.CompiledBlock19.yield(CompiledBlock19.java:159)
at org.jruby.runtime.CompiledBlock19.call(CompiledBlock19.java:87)
at org.jruby.runtime.Block.call(Block.java:101)
at org.jruby.RubyProc.call(RubyProc.java:300)
at org.jruby.RubyProc.call(RubyProc.java:230)
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:99)
at java.lang.Thread.run(Thread.java:745)
The error doesn't occur if I remove the call to #force_encoding or if I remove the "i" flag from the regex.
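For illustration, here is a minimal sketch of how the snippet above might be driven from several threads at once. It is not the script from the gist: the helper name `hammer_gsub` and the thread and iteration counts are assumptions chosen for the example, and the counts may need to be raised before the failure shows up on an affected version.

```ruby
# Hypothetical helper (not from the gist): runs the three-line snippet
# from the report in several threads at the same time.
def hammer_gsub(thread_count, iterations)
  threads = Array.new(thread_count) do
    Thread.new do
      result = nil
      iterations.times do
        # A fresh string on every pass, so no string is shared between threads.
        str = "foobar".dup
        str.force_encoding("UTF-8")
        # Case-insensitive match, as in the report.
        result = str.gsub(/foo/i, "")
      end
      result
    end
  end
  # Thread#value waits for each thread and returns its block's last value.
  threads.map(&:value)
end

if __FILE__ == $PROGRAM_NAME
  # Arbitrary counts, assumed for the example.
  p hammer_gsub(8, 50_000)
end
```

On a runtime without the regression, every thread should finish and return "bar". On the affected versions, the expectation from the report is that one of the threads eventually dies with the ArrayIndexOutOfBoundsException shown above.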
See this gist for a full reproducible test case and examples of running it on different JRuby versions:
https://gist.github.com/marshalium/3e62c2affbd2ce95757f