Regression for non matching Regexp in 9.3.4 #7484

@andsel

Environment Information

Provide at least:

  • JRuby version (jruby -v) and command line (flags, JRUBY_OPTS, etc)
jruby 9.3.4.0 (2.6.8) 2022-03-23 eff48c1ebf OpenJDK 64-Bit Server VM 11.0.15+10 on 11.0.15+10 +jit [x86_64-linux]
  • Operating system and platform (e.g. uname -a)
Linux kalimera 5.15.0-53-generic #59~20.04.1-Ubuntu SMP Thu Oct 20 15:10:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Other relevant info you may wish to add:
* Installed or activated gems
* Application/framework version (e.g. Rails, Sinatra)
* Environment variables

Expected Behavior

  • A non-matching regexp created with Regexp::MULTILINE should not show a performance regression between 9.2.20 and 9.3.4

  • Provide an executable Ruby script or a link to an example repository.

require "securerandom"

puts ">>> Setup, using JRuby #{JRUBY_VERSION}"

# Single quotes keep the \S escape intact (inside double quotes "\S" collapses to "S").
regexp = Regexp.new('(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)', Regexp::MULTILINE)

puts ">>> Regexp instantiated and start processing"
start_time = Time.now

rand_str = SecureRandom.alphanumeric(5000)
msg = "foo bar blubb cannot login i123456 #{rand_str}"
#10000.times do
100.times do
  regexp.match(msg)
  #msg.match(/(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)/)
end
stop_time = Time.now

execution_time = stop_time - start_time
puts ">>> Regexp terminated, executes in #{execution_time} sec"

Actual Behavior
With JRuby 9.2.20 the script above completes in well under a second:

$ ruby regexp_regression_reproducer.rb
>>> Setup, using JRuby 9.2.20.1
>>> Regexp instantiated and start processing
>>> Regexp terminated, executes in 0.217434 sec

With JRuby 9.3.4.0 it takes about 40 seconds:

$ ruby regexp_regression_reproducer.rb
>>> Setup, using JRuby 9.3.4.0
>>> Regexp instantiated and start processing
>>> Regexp terminated, executes in 38.519028 sec

But if I switch

regexp.match(msg)

to

msg.match(/(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)/)

performance is good again:

$ ruby regexp_regression_reproducer.rb
>>> Setup, using JRuby 9.3.4.0
>>> Regexp instantiated and start processing
>>> Regexp terminated, executes in 0.176172 sec

So I suspect the regression is related to the way the Regexp is instantiated, in particular to the Regexp::MULTILINE flag.
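
The two runs above differ in two ways at once: Regexp.new versus a literal, and Regexp::MULTILINE versus no flag. To tell these apart, a small sketch like the following could time all four combinations against the same non-matching message. The helper names `regexp_variants` and `time_variants` are hypothetical and not part of the original reproducer, and the default sizes are kept small on purpose; raising them to `length: 5000, iterations: 100` would mirror the script above.

```ruby
require "securerandom"

# Pattern copied from the reproducer; single quotes keep \S intact.
PATTERN = '(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)'

# Hypothetical helper: the same pattern built in four different ways.
def regexp_variants
  {
    "Regexp.new + MULTILINE" => Regexp.new(PATTERN, Regexp::MULTILINE),
    "Regexp.new, no flags"   => Regexp.new(PATTERN),
    "literal with /m"        => /(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)/m,
    "literal, no flags"      => /(?:.*)(Cannot|Failed) login (?<user>\S+) (?:.*)/
  }
end

# Hypothetical helper: returns { label => elapsed seconds } for each variant.
# The message uses lowercase "cannot", so none of the variants match it.
def time_variants(length:, iterations:)
  msg = "foo bar blubb cannot login i123456 #{SecureRandom.alphanumeric(length)}"
  regexp_variants.transform_values do |regexp|
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    iterations.times { regexp.match(msg) }
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

if __FILE__ == $PROGRAM_NAME
  puts ">>> Using JRuby #{JRUBY_VERSION}" if defined?(JRUBY_VERSION)
  time_variants(length: 500, iterations: 3).each do |label, seconds|
    puts format("%-24s %.6f sec", label, seconds)
  end
end
```

If only the two MULTILINE rows are slow on 9.3.4.0, the flag is the likely trigger; if only the Regexp.new rows are slow, the construction path is.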
