ArrayIndexOutOfBoundsException when a UTF-8 string is matched against a regexp with word boundary \b #3397

@kml

Description

> p = "x".force_encoding("utf-8"); p.match(/x.*\b/)
Java::JavaLang::ArrayIndexOutOfBoundsException: 1
    from org.jcodings.specific.UTF8Encoding.length(UTF8Encoding.java:35)
    from org.jcodings.specific.BaseUTF8Encoding.mbcToCode(BaseUTF8Encoding.java:91)
    from org.jcodings.specific.UTF8Encoding.mbcToCode(UTF8Encoding.java:24)
    from org.jcodings.Encoding.isMbcWord(Encoding.java:469)
    from org.joni.ByteCodeMachine.opWordBound(ByteCodeMachine.java:1054)
    from org.joni.ByteCodeMachine.matchAt(ByteCodeMachine.java:239)
    from org.joni.Matcher.matchCheck(Matcher.java:304)
?> p = "x".bytes.to_a.pack('c*').force_encoding("utf-8"); p.match(/x.*\b/)
=> nil
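
The two calls above can be wrapped in a small standalone check. This is only a sketch: `boundary_match` is a hypothetical helper name that does not come from JRuby, jcodings, or joni, and it assumes the intended behavior is that `/x.*\b/` matches the single character `x` for both inputs. The strings are built with `dup` so the script also runs where string literals are frozen.

```ruby
# Hypothetical helper (not part of JRuby): runs the regexp from the report
# and returns the matched text, or nil when there is no match.
# Any exception raised by the regexp engine is left to propagate, so the
# ArrayIndexOutOfBoundsException from the report would surface here.
def boundary_match(str)
  m = str.match(/x.*\b/)
  m && m[0]
end

# Same two inputs as in the report, constructed the same way.
literal = "x".dup.force_encoding("utf-8")
packed  = "x".bytes.to_a.pack('c*').force_encoding("utf-8")

[literal, packed].each do |s|
  puts "#{s.encoding} #{s.bytes.inspect} => #{boundary_match(s).inspect}"
end
```

Running it under the affected JRuby build and under another Ruby implementation should show whether the two inputs are treated differently.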
