Skip to content

Hash keys should only be deduplicated for direct literal forms #3472

@headius

Description

@headius

In #3405 @tagomoris discovered that our Hash key deduplication for String was losing or modifying encodings. The base problem is that we were not considering encoding in our deduplicated String cache, sometimes returning a differently-encoded String if their contents match. I've fixed that in f6553cf but there's a further issue: to match MRI we should only be deduplicating String keys in a Hash when they are the direct, literal form, as shown below:

{"str" => 1} # "str" should be frozen and deduplicated
str2 = "str2"
{str2 => 1} # "str2" should be frozen and duplicated, not deduplicated

Other forms of simple inserting a String key into a Hash should also not deduplicate; our doing so is a bit overeager and led to discovering the bug in #3405.

We need to modify the IR compiler to make literal String keys to a literal Hash into FrozenString and modify the Hash-creation logic to simply use those deduplicated (and cached) frozen Strings as-is without re-attempting to dup them.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions