Project

General

Profile

Actions

Bug #21993

open

`rb_gc_update_tbl_refs` is incorrectly documented as the dcompact pair for `rb_mark_tbl_no_pin`, and is unsafe for `st_table`s with non-VALUE keys

Bug #21993: `rb_gc_update_tbl_refs` is incorrectly documented as the dcompact pair for `rb_mark_tbl_no_pin`, and is unsafe for `st_table`s with non-VALUE keys

Added by ivoanjo (Ivo Anjo) 3 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:125238]

Description

Hey!

I work for Datadog on the Ruby profiler and I've been exploring the TypedData GC API to understand the correct patterns for writing compaction-aware extensions.

While reading through the API, I noticed a mismatch between the documented pairing of rb_mark_tbl_no_pin + rb_gc_update_tbl_refs and what the implementation actually does that seems like it can cause crashes and even corruption for extensions.

The problem

The public header documents rb_gc_update_tbl_refs as the dcompact counterpart for rb_mark_tbl_no_pin:

 * Updates  references  inside  of  tables.   After  you  marked  values  using
 * rb_mark_tbl_no_pin(), the  objects inside  of the table  could of  course be
 * moved.  This function is to fixup  those references.

However, rb_mark_tbl_no_pin's implementation only marks values, not keys:

// gc.c / gc/gc.h
static int
gc_mark_tbl_no_pin_i(st_data_t key, st_data_t value, st_data_t data)
{
    rb_gc_mark_movable((VALUE)value);  // key is never touched
    return ST_CONTINUE;
}

Thus, tables used with rb_mark_tbl_no_pin can have non-VALUE keys (e.g. IDs, raw integers, C pointers, etc). The problem is that rb_gc_update_tbl_refs (via gc_update_table_refs) calls rb_gc_location on both keys and values:

static int
hash_foreach_replace(st_data_t key, st_data_t value, ...) {
    if (rb_gc_location((VALUE)key) != (VALUE)key)   // <- called on key
        return ST_REPLACE;
    if (rb_gc_location((VALUE)value) != (VALUE)value)
        return ST_REPLACE;
    ...
}

Calling rb_gc_location on something that's not a VALUE and even sometimes replacing it is probably going to cause crashes and heap corruption.

Suggested fixes

I can think of a few ways to address this:

a) Deprecate rb_gc_update_tbl_refs

Maybe just deprecate it? I'm not even sure if this gets used very often.

b) Introduce rb_mark_hash_no_pin

The mark_keyvalue iterator already exists internally and is used by mark_hash() for Ruby's own Hash objects. Exposing a public rb_mark_hash_no_pin would give rb_gc_update_tbl_refs a safe, correct dmark counterpart for tables where both keys and values are Ruby objects.

c) Fix the docs and document the precondition for rb_gc_update_tbl_refs

Update the docs to make explicit that rb_gc_update_tbl_refs assumes a VALUE => VALUE table, and that all keys must have been marked (e.g. via rb_mark_set or a manual iteration with rb_gc_mark_movable) in addition to calling rb_mark_tbl_no_pin for the values. Weird and ugly, but correct?


Happy to help with a patch if useful.

No data to display

Actions

Also available in: PDF Atom