Skip to content

Conversation

@jhawthorn
Copy link
Member

I was realizing with the new TYPED_DATA_EMBEDDED it should be possible to store an RData as a proper valid RTypedData. We do this by marking the T_DATA as embedded, and having void *data be the first element of the struct (so that it's in the same location) after that pointer we can store dmark and dfree.

I think there is still usage of the old API in the wild and this should remove most of our maintenance burden internally.

cc @byroot @nobu What do you think? Is this worth doing? I can do more testing and clean this up for merging if you think so.

There is one potential incompatibility with this: RTYPEDDATA_P will return true for all T_DATA. (I suppose we could remove all references to RTYPEDDATA_P internally and add a new method to check whether it's this special type of TypedData, but that seems unnecessary)

CApiWrappedTypedStruct RTYPEDDATA_P returns false for an untyped data object FAILED
Expected true == false
to be truthy but was false
/Users/jhawthorn/src/ruby/spec/ruby/optional/capi/typed_data_spec.rb:97:in 'block (3 levels) in <top (required)>'
/Users/jhawthorn/src/ruby/spec/ruby/optional/capi/typed_data_spec.rb:21:in '<top (required)>'

@launchable-app

This comment has been minimized.

Comment on lines +1070 to +1074
struct rb_legacy_rdata {
void *data; /* must be first for RDATA(obj)->data compatibility */
RUBY_DATA_FUNC dmark;
RUBY_DATA_FUNC dfree;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible to keep the order as previous and make extra embed size 0.
You can obtain the container pointer from the member pointer with ccan_container_of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need somewhere to store dmark and dfree and RTypedData needs the location those used to be in for RData

a = @s.untyped_wrap_struct(1024)
@s.RTYPEDDATA_P(a).should == false
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should enclose in ruby_version_is ""..."4.0" do / end.

@nobu
Copy link
Member

nobu commented Dec 9, 2025

Apart from this PR, Always treat encoding as TYPEDDATA should be merged, I think.

Copy link
Member

@byroot byroot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been meaning to retrofit RData as RTypedData since about two years, but your solution is much better than the one I was toying with.

I believe this is very valuable as RData is very rare now, officially deprecated, and removing many conditionals in hotspots should help with performance (as demonstrated in #15387). And I think you now can remove the RTYPEDDATA_P(obj) condition in rbimpl_check_typeddata.

My only worry is that we're perhaps a bit late in the release cycle for such a change, but I'd really really like that change for 4.0 though.

Comment on lines +1077 to +1079
STATIC_ASSERT(rdata_data, offsetof(struct RData, data) == offsetof(struct RTypedData, data) + offsetof(struct rb_legacy_rdata, data));
STATIC_ASSERT(rdata_dmark, offsetof(struct RData, dmark) == offsetof(struct RTypedData, data) + offsetof(struct rb_legacy_rdata, dmark));
STATIC_ASSERT(rdata_dfree, offsetof(struct RData, dfree) == offsetof(struct RTypedData, data) + offsetof(struct rb_legacy_rdata, dfree));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even need this anymore? If legacy RData is retrofitted inside RTypedData, we don't care about offsets. AFAIK these were only necessary to be able to distinguish the two.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or do you want to keep supporting RDATA(obj)->dmark and RDATA(obj)->dfree ? I can't imagine how an extension would legitimately need to do that, to me that always was private API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had intended to keep dmark/dfree just in case they were in use, but if we're aiming for Ruby 4.1 I can see removing them. We should probably keep RDATA(obj)->data for now

}

static const rb_data_type_t rb_legacy_rdata_type = {
.wrap_struct_name = "legacy RData",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.wrap_struct_name = "legacy RData",
.wrap_struct_name = "deprecated RData",

@byroot
Copy link
Member

byroot commented Dec 9, 2025

There is one potential incompatibility with this: RTYPEDDATA_P will return true for all T_DATA.

I think after this change, RTYPEDDATA_P has almost no more usefulness internally, so we can just remove about all references of it. If you are worried about 3rd party usage, you can define it to be something like:

RTYPEDDATA_TYPE(obj) != &deprecated_rdata_type

But we probably deprecate this as well as part of https://bugs.ruby-lang.org/issues/19998

@byroot byroot requested a review from jeremyevans December 9, 2025 07:11
@byroot
Copy link
Member

byroot commented Dec 9, 2025

You should also be able to get rid of RUBY_TYPED_FL_IS_TYPED_DATA.

Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is good change. I worry it may be a little late in the dev cycle to introduce this change, though. We are only a little over two weeks from release.

I think we may want a developer with more GC knowledge that I have to review the commented out code in gc.c. We may want to wait to commit this until we have a better understanding about why the existing code was wrong, and what the correct code would be. If we already have that understanding, we should remove or replace the commented out code and include the reasoning in the commit message.

@jhawthorn
Copy link
Member Author

I agree it's probably too late in the cycle for this. I'll wait until January.

I think we may want a developer with more GC knowledge that I have to review the commented out code in gc.c. We may want to wait to commit this until we have a better understanding about why the existing code was wrong, and what the correct code would be. If we already have that understanding, we should remove or replace the commented out code and include the reasoning in the commit message.

The rb_gc_shutdown_call_finalizer_p change? The function tells us whether we need to call the finalizer for an object during VM shutdown. For T_DATA we generally have to, this was trying to skip them for RData/RTypedData with no data or no dfree (where we would not call the free function normally). However it was "wrong" for TypedData because RDATA(obj)->dfree is going to accidentally read from RTYPEDDATA(obj)->type which will never be 0.

There's no reason to test this. data being NULL is rare, and we can just let the free code later treat that as a no-op. I'll do this in a separate PR.

@jhawthorn jhawthorn changed the title Declare legacy RData as TypedData [Ruby 4.1] Declare legacy RData as TypedData Dec 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants