Marshal load loses correct encoding for string subclass #939

@jkl1337

I am getting unexpected behavior when using Marshal.load on an instance of a String subclass that has an instance variable set: under JRuby the loaded string comes back with encoding ASCII-8BIT instead of UTF-8. In the example below the instance variable is an integer, but the same thing seems to happen with any value other than nil.

Note that Marshal.dump seems to yield identical output in JRuby and MRI (the encoding of the dumped string is ASCII-8BIT in both cases, as expected), so the difference appears to be on the load side.

This problem was encountered in a Rails app after attempting to cache an object that contains an instance of a String subclass.

The workaround I am using is to define marshal_dump on the class so that it returns an array with the string data as one element and the ivars as the other.
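For anyone hitting the same thing, the workaround might look roughly like the sketch below. This is an illustration, not the exact code from the app: the hash layout for the ivars and the matching marshal_load (which Marshal requires whenever marshal_dump is defined) are my assumptions about how to fill in the details, and I have not verified it against every JRuby version.

```ruby
class StringSubclass < String
  attr_accessor :oops

  # Dump the character data as a plain String (which carries its own
  # encoding) plus a hash of instance variables, instead of relying on
  # the default String-subclass format.
  def marshal_dump
    ivars = instance_variables.each_with_object({}) do |name, acc|
      acc[name] = instance_variable_get(name)
    end
    [String.new(self), ivars]
  end

  # Marshal.load calls this on a freshly allocated (empty) instance.
  def marshal_load(data)
    str, ivars = data
    replace(str) # copies both the bytes and the encoding
    ivars.each { |name, value| instance_variable_set(name, value) }
  end
end

original = StringSubclass.new('what')
original.oops = 1
restored = Marshal.load(Marshal.dump(original))
puts restored.encoding
```

Because the string data travels as an ordinary String inside the array, its encoding is restored by the normal String load path rather than the subclass-with-ivars path that misbehaves.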

jruby 1.7.5.dev (1.9.3p392) 2013-08-02 c672591 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_25-b15 [linux-amd64]

[1] pry(main)> class StringSubclass < String
[1] pry(main)*   attr_accessor :oops
[1] pry(main)* end  
=> nil
[2] pry(main)> s_ok = StringSubclass.new('what')
=> "what"
[3] pry(main)> s_ok.encoding
=> #<Encoding:UTF-8>
[4] pry(main)> Marshal.dump(s_ok)
=> "\x04\bIC:\x13StringSubclass\"\twhat\x06:\x06ET"
[5] pry(main)> Marshal.load(Marshal.dump(s_ok)).encoding
=> #<Encoding:UTF-8>
[6] pry(main)> 
[7] pry(main)> s_oops = StringSubclass.new('what').tap { |s| s.oops = 1; s }
=> "what"
[8] pry(main)> s_oops.encoding
=> #<Encoding:UTF-8>
[9] pry(main)> Marshal.dump(s_oops)
=> "\x04\bIC:\x13StringSubclass\"\twhat\a:\x06ET:\n@oopsi\x06"
[10] pry(main)> Marshal.load(Marshal.dump(s_oops)).encoding
=> #<Encoding:ASCII-8BIT>

ruby 2.0.0p195 (2013-05-14 revision 40734) [x86_64-linux]

[1] pry(main)> class StringSubclass < String
[1] pry(main)*   attr_accessor :oops
[1] pry(main)* end  
=> nil
[2] pry(main)> s_ok = StringSubclass.new('what')
=> "what"
[3] pry(main)> s_ok.encoding
=> #<Encoding:UTF-8>
[4] pry(main)> Marshal.dump(s_ok)
=> "\x04\bIC:\x13StringSubclass\"\twhat\x06:\x06ET"
[5] pry(main)> Marshal.load(Marshal.dump(s_ok)).encoding
=> #<Encoding:UTF-8>
[6] pry(main)> 
[7] pry(main)> s_oops = StringSubclass.new('what').tap { |s| s.oops = 1; s }
=> "what"
[8] pry(main)> s_oops.encoding
=> #<Encoding:UTF-8>
[9] pry(main)> Marshal.dump(s_oops)
=> "\x04\bIC:\x13StringSubclass\"\twhat\a:\x06ET:\n@oopsi\x06"
[10] pry(main)> Marshal.load(Marshal.dump(s_oops)).encoding
=> #<Encoding:UTF-8>
