Inconsistent encoding when calling #to_s on arrays containing hashes with non us-ascii characters #6748

@robbavey

Environment Information

  • jruby-9.2.19.0 installed using rvm
  • Linux tester 5.4.0-48-generic #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Darwin macbook.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64

Expected Behavior

This is the behaviour from MRI 2.5.7, where the encoding is UTF-8 regardless of element order:

ubuntu@tester:~$ irb
2.5.7 :001 > extended = [{"non_ascii"=>"ง"}, {"ascii" => "two"}]
 => [{"non_ascii"=>"ง"}, {"ascii"=>"two"}] 
2.5.7 :002 > extended.to_s.encoding
 => #<Encoding:UTF-8> 
2.5.7 :003 > "#{extended}".encoding
 => #<Encoding:UTF-8> 
2.5.7 :004 > extended = [{"ascii" => "two"}, {"non_ascii"=>"ง"}]
 => [{"ascii"=>"two"}, {"non_ascii"=>"ง"}] 
2.5.7 :005 > "#{extended}".encoding
 => #<Encoding:UTF-8> 
2.5.7 :006 > extended.to_s.encoding
 => #<Encoding:UTF-8> 
2.5.7 :007 > exit

Actual Behavior

ubuntu@tester:~$ irb
jruby-9.2.19.0 :001 > extended = [{"non_ascii"=>"ง"}, {"ascii" => "two"}]
 => [{"non_ascii"=>"ง"}, {"ascii"=>"two"}] 
jruby-9.2.19.0 :002 > extended.to_s.encoding
 => #<Encoding:US-ASCII> 
jruby-9.2.19.0 :003 > "#{extended}".encoding
 => #<Encoding:US-ASCII> 
jruby-9.2.19.0 :004 > extended = [{"ascii" => "two"}, {"non_ascii"=>"ง"}]
 => [{"ascii"=>"two"}, {"non_ascii"=>"ง"}] 
jruby-9.2.19.0 :005 > extended.to_s.encoding
 => #<Encoding:UTF-8> 
jruby-9.2.19.0 :006 > "#{extended}".encoding
 => #<Encoding:UTF-8> 
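
The two irb sessions above can be condensed into a standalone script that runs unchanged under both interpreters. This is only a minimal sketch and not part of the original report: the helper `inspect_encodings` and the constants `NON_ASCII_FIRST`, `ASCII_FIRST` and `RESULTS` are hypothetical names chosen for illustration. It assumes the file is saved as UTF-8, and the encodings it prints may also depend on the locale in effect (`Encoding.default_external`).

```ruby
# Reproduction sketch: report the encoding of Array#to_s and of string
# interpolation for both element orderings used in the report.
# All names below are hypothetical; only the sample data comes from the report.

def inspect_encodings(array)
  {
    to_s: array.to_s.encoding,           # same call as `extended.to_s.encoding`
    interpolation: "#{array}".encoding   # same call as `"#{extended}".encoding`
  }
end

NON_ASCII_FIRST = [{ "non_ascii" => "ง" }, { "ascii" => "two" }].freeze
ASCII_FIRST     = [{ "ascii" => "two" }, { "non_ascii" => "ง" }].freeze

RESULTS = {
  non_ascii_first: inspect_encodings(NON_ASCII_FIRST),
  ascii_first: inspect_encodings(ASCII_FIRST)
}.freeze

RESULTS.each do |label, encodings|
  puts "#{label}: to_s=#{encodings[:to_s]} interpolation=#{encodings[:interpolation]}"
end

# The two orderings hold the same data, so their encodings should match.
consistent = RESULTS[:non_ascii_first] == RESULTS[:ascii_first]
puts "consistent across orderings: #{consistent}"
```

Going by the sessions above, the final line would be expected to report `false` on jruby-9.2.19.0, where the non-ASCII-first ordering produced US-ASCII and the ASCII-first ordering produced UTF-8, and `true` on MRI 2.5.7.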
