Fix Symbol#inspect of UTF_16/UTF_32 #4994

Merged
enebo merged 1 commit into jruby:jruby-9.1 from yui-knk:fix_test_ascii_incomat_inspect
Jan 22, 2018

yui-knk (Contributor) commented Jan 21, 2018

Stop appending a byte to the string before inspecting it.
For example, when the encoding of `symbolBytes` is UTF_16LE, "a" is
`[0x61, 0x00]`. If we prepend `":"` (0x3A) to `symbolBytes` before
inspecting it, the bytes are `[0x3A, 0x61, 0x00]` with UTF_16LE encoding,
which is not what we want. This commit changes the order of
inspecting and appending to avoid this.

Ref: https://github.com/ruby/ruby/blob/v2_5_0/string.c#L10402
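
The byte-level problem described above can be reproduced in plain Ruby. This is only a sketch of the idea, not the JRuby code touched by this PR; the variable names (`utf16_a`, `broken`, `fixed`) are made up for illustration.

```ruby
# "a" encoded as UTF-16LE is two bytes: 0x61 0x00.
utf16_a = "a".encode(Encoding::UTF_16LE)

# Old order (the bug): put the single byte 0x3A in front of the raw
# UTF-16LE bytes, then treat the result as UTF-16LE.
broken = (":".b + utf16_a.b).force_encoding(Encoding::UTF_16LE)
broken.bytes           # => [58, 97, 0], i.e. [0x3A, 0x61, 0x00]
broken.valid_encoding? # => false (three bytes cannot be whole UTF-16 code units)

# New order (the fix): inspect first, which yields an ASCII-compatible
# string, and only then add the leading colon.
fixed = ":" + utf16_a.inspect
```

Because `inspect` already returns an ASCII-compatible string, adding a one-byte `":"` afterwards is safe regardless of the symbol's original encoding.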

enebo added this to the JRuby 9.1.16.0 milestone Jan 22, 2018
enebo merged commit 342268d into jruby:jruby-9.1 Jan 22, 2018
enebo (Member) commented Jan 22, 2018

@yui-knk You seem to have fixed a second bug here as well: we were adding `:"` at the front of a symbol but no closing `"`. Now that you have fixed this I might change this code, because we potentially create 3 instances of RubyString depending on the symbol being inspected. I think we can reduce this to just one.

enebo (Member) commented Jan 22, 2018

Actually, I am not planning on changing this: 1) `:sym.inspect` is exceedingly rare in hot code; 2) given the guts of bytelist vs RubyString, the ability to determine CR_7BIT is much simpler if we make a string first. Working around that to defer making the string would involve some new code.

I did glance at MRI, and they avoid some of this cost by using memcpy/memmove to set the ':' and the contents of the string. We could optimize this way if we wanted, but because of 1) above I am not inclined to put in that extra effort :)
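
The shift-and-set idea mentioned above can be sketched in Ruby as follows. It is loosely modelled on the description in this comment, not on MRI's or JRuby's actual source; `prepend_colon` is a hypothetical helper, and the byte loop merely stands in for what a single `memmove` would do in C.

```ruby
# Hypothetical helper: takes an already-inspected (ASCII-compatible) string
# and returns it with a leading ':' by shifting bytes inside one buffer.
def prepend_colon(inspected)
  len = inspected.bytesize
  buf = inspected.b   # binary working copy of the inspected bytes
  buf << "\0"         # grow the buffer by one byte
  # Shift every byte one slot to the right, last byte first,
  # so nothing is overwritten before it has been moved.
  (len - 1).downto(0) { |i| buf.setbyte(i + 1, buf.getbyte(i)) }
  buf.setbyte(0, 0x3A) # write ':' into the slot freed at the front
  buf.force_encoding(inspected.encoding)
end

prepend_colon('"abc"') # => ":\"abc\""
```

In Ruby this saves nothing over `":" + inspected`; the point is only to show the order of operations (grow, shift, set the first byte) that avoids building an extra intermediate string in a lower-level implementation.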

yui-knk (Contributor, Author) commented Jan 23, 2018

> ability to determine CR_7BIT is much simpler if we make a string first

I agree :)

yui-knk deleted the fix_test_ascii_incomat_inspect branch January 23, 2018 00:09