Running specific regex with Regexp::IGNORECASE flag on text starting with specific pipe character results in java.lang.ArrayIndexOutOfBoundsException #7730

@mrckzgl

Environment Information

  • JRuby: jruby 9.4.1.0 (3.1.0) 2023-02-07 237d5fa Java HotSpot(TM) 64-Bit Server VM 15.0.2+7-27 on 15.0.2+7-27 +jit [x86_64-linux] no command line flags

Description

Run this short script in JRuby:

str1 = "asdas│dsfsd"
str2 = "│dsfsd"

regex1 = Regexp.new(Regexp.escape("abc"), Regexp::IGNORECASE)
puts "regex1 on str1 result: " + str1.match?(regex1).to_s
puts "regex1 on str2 result: " + str2.match?(regex1).to_s

regex2 = Regexp.new(Regexp.escape("test"), Regexp::IGNORECASE)
puts "regex2 on str1 result: " + str1.match?(regex2).to_s
puts "regex2 on str2 result: " + str2.match?(regex2).to_s

Expected Behavior
Output:

regex1 on str1 result: false
regex1 on str2 result: false
regex2 on str1 result: false
regex2 on str2 result: false

Actual Behavior

regex1 on str1 result: false
regex1 on str2 result: false
regex2 on str1 result: false
Unhandled Java exception: java.lang.ArrayIndexOutOfBoundsException: Index -2 out of bounds for length 8
java.lang.ArrayIndexOutOfBoundsException: Index -2 out of bounds for length 8
           mbcCaseFold at org/jcodings/specific/BaseUTF8Encoding.java:154
           mbcCaseFold at org/jcodings/specific/UTF8Encoding.java:22
        lowerCaseMatch at org/joni/Search.java:42
            access$000 at org/joni/Search.java:27
                search at org/joni/Search.java:439
    forwardSearchRange at org/joni/Matcher.java:139
          searchCommon at org/joni/Matcher.java:447
   searchInterruptible at org/joni/Matcher.java:318
                   run at org/jruby/RubyRegexp.java:285
                   run at org/jruby/RubyRegexp.java:266
           executeTask at org/jruby/RubyThread.java:1751
           executeTask at org/jruby/RubyThread.java:1729
         matcherSearch at org/jruby/RubyRegexp.java:230
                matchP at org/jruby/RubyRegexp.java:1247
               match_p at org/jruby/RubyRegexp.java:1171
               match_p at org/jruby/RubyString.java:1729
                  call at org/jruby/RubyString$INVOKER$i$match_p.gen:-1
          cacheAndCall at org/jruby/runtime/callsite/CachingCallSite.java:495
                  call at org/jruby/runtime/callsite/CachingCallSite.java:244
  invokeOther18:match? at regex_test.rb:10
           RUBY$script at regex_test.rb:10
                   run at regex_test.rb:-1
   invokeWithArguments at java/lang/invoke/MethodHandle.java:729
                  load at org/jruby/ir/Compiler.java:114
             runScript at org/jruby/Ruby.java:1277
           runNormally at org/jruby/Ruby.java:1194
           runNormally at org/jruby/Ruby.java:1176
           runNormally at org/jruby/Ruby.java:1212
           runFromMain at org/jruby/Ruby.java:991
         doRunFromMain at org/jruby/Main.java:398
           internalRun at org/jruby/Main.java:282
                   run at org/jruby/Main.java:227
                  main at org/jruby/Main.java:199

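So when a string starts with the "│" character (the box-drawing character used in the strings above, not the ASCII pipe "|"), matching some regexes against it raises an exception instead of returning a result. The same example runs fine on standard Ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-linux]. The problem appears only on JRuby, and only when the second regex has the Regexp::IGNORECASE flag set.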
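
Since the stack trace ends in the UTF-8 case-folding code (mbcCaseFold in BaseUTF8Encoding), it may help to see how the leading character differs from an ASCII pipe. The sketch below is only an illustration, not a diagnosis: it assumes the character in the example strings is U+2502 (BOX DRAWINGS LIGHT VERTICAL), and `describe_char` is a hypothetical helper written for this comment.

```ruby
# Assumption: the first character of str2 in the report is U+2502.
# It is written as an escape so the snippet does not depend on how
# the source file was saved.

# Hypothetical helper: summarise a single character's code point and
# its UTF-8 byte sequence.
def describe_char(ch)
  {
    codepoint: format("U+%04X", ch.ord),
    utf8_bytes: ch.encode("UTF-8").bytes.map { |b| format("%02X", b) },
    ascii_only: ch.ascii_only?
  }
end

box_vertical = "\u2502" # the character assumed to be in the report
ascii_pipe   = "|"      # the look-alike ASCII character

[box_vertical, ascii_pipe].each do |ch|
  info = describe_char(ch)
  puts "#{ch.inspect}: #{info[:codepoint]}, " \
       "bytes #{info[:utf8_bytes].join(' ')}, " \
       "ASCII only: #{info[:ascii_only]}"
end
```

Under that assumption the box-drawing character is a three-byte UTF-8 sequence, whereas the ASCII pipe is a single byte, so a reproduction that substitutes "|" for "│" would not exercise the same multi-byte path.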
