Bug #21993
open`rb_gc_update_tbl_refs` is incorrectly documented as the dcompact pair for `rb_mark_tbl_no_pin`, and is unsafe for `st_table`s with non-VALUE keys
Description
Hey!
I work for Datadog on the Ruby profiler and I've been exploring the TypedData GC API to understand the correct patterns for writing compaction-aware extensions.
While reading through the API, I noticed a mismatch between the documented pairing of rb_mark_tbl_no_pin + rb_gc_update_tbl_refs and what the implementation actually does that seems like it can cause crashes and even corruption for extensions.
The problem¶
The public header documents rb_gc_update_tbl_refs as the dcompact counterpart for rb_mark_tbl_no_pin:
* Updates references inside of tables. After you marked values using
* rb_mark_tbl_no_pin(), the objects inside of the table could of course be
* moved. This function is to fixup those references.
However, rb_mark_tbl_no_pin's implementation only marks values, not keys:
// gc.c / gc/gc.h
static int
gc_mark_tbl_no_pin_i(st_data_t key, st_data_t value, st_data_t data)
{
rb_gc_mark_movable((VALUE)value); // key is never touched
return ST_CONTINUE;
}
Thus, tables used with rb_mark_tbl_no_pin can have non-VALUE keys (e.g. IDs, raw integers, C pointers, etc). The problem is that rb_gc_update_tbl_refs (via gc_update_table_refs) calls rb_gc_location on both keys and values:
static int
hash_foreach_replace(st_data_t key, st_data_t value, ...) {
if (rb_gc_location((VALUE)key) != (VALUE)key) // <- called on key
return ST_REPLACE;
if (rb_gc_location((VALUE)value) != (VALUE)value)
return ST_REPLACE;
...
}
Calling rb_gc_location on something that's not a VALUE and even sometimes replacing it is probably going to cause crashes and heap corruption.
Suggested fixes¶
I can think of a few ways to address this:
a) Deprecate rb_gc_update_tbl_refs
Maybe just deprecate it? I'm not even sure if this gets used very often.
b) Introduce rb_mark_hash_no_pin
The mark_keyvalue iterator already exists internally and is used by mark_hash() for Ruby's own Hash objects. Exposing a public rb_mark_hash_no_pin would give rb_gc_update_tbl_refs a safe, correct dmark counterpart for tables where both keys and values are Ruby objects.
c) Fix the docs and document the precondition for rb_gc_update_tbl_refs
Update the docs to make explicit that rb_gc_update_tbl_refs assumes a VALUE => VALUE table, and that all keys must have been marked (e.g. via rb_mark_set or a manual iteration with rb_gc_mark_movable) in addition to calling rb_mark_tbl_no_pin for the values. Weird and ugly, but correct?
Happy to help with a patch if useful.
No data to display