Skip to content

Allow reified classes to store instance variables in raw fields#7012

Merged
headius merged 1 commit intojruby:jruby-9.3from
byteit101:field_ivar_storage
Mar 16, 2022
Merged

Allow reified classes to store instance variables in raw fields#7012
headius merged 1 commit intojruby:jruby-9.3from
byteit101:field_ivar_storage

Conversation

@byteit101
Copy link
Member

@byteit101 byteit101 commented Jan 18, 2022

This is principally for making Java integration produce more idomatic Ruby. This allows users to configure a class to store specific ivars in a field of a reified class. This is principally for JRubyFX at this point in time, as that framwork sets java fields on a provided class.

Still a WIP. Notable things that need to be done:

  • Find a good configuration name
  • pick exception types
  • deduplicate some code
  • Write tests
  • Document weird nonsense like @foo.equals?(@foo) being false for some types (ex: j.l.String)
  • Document weird nonsense like @foo= being frozen, but self.foo= not
  • Deal with clones and parents

Quick working demo:

require 'java'
require 'jruby/core_ext'

class BackedClass
   java_field "java.lang.String testfield" # define the field to be reified
   headius_definitely_approved_this_ivar_configuration_method_name :@testfield # now mark it used for ivar storage

	def testread
		puts "Ruby R #{@testfield.inspect}"
	end
	def testwrite(v)
		puts "Ruby W #{v}"
		@testfield = v
	end
	def testjavaread
		puts "Java R #{self.class.java_class.get_field("testfield").get(self).inspect}"
	end
	def testjavawrite(v)
		self.class.java_class.get_field("testfield").set(self, v)
		puts "Java W #{v}"
	end
	become_java!
end

o = BackedClass.new
o.testjavaread
o.testread
o.testjavawrite("Java says hi")
o.testjavaread
o.testread
o.testwrite("Ruby strings are better!")
o.testjavaread
o.testread

Now outputs:

Java R nil
Ruby R nil
Java W Java says hi
Java R "Java says hi"
Ruby R "Java says hi"
Ruby W Ruby strings are better!
Java R "Ruby strings are better!"
Ruby R "Ruby strings are better!"

@headius
Copy link
Member

headius commented Jan 24, 2022

A couple general comments...

  • Perhaps we should merge the new configuration method into java_field as an option or additional argument? Something like java_field, "java.lang.String => testfield", variable: :@testfield or perhaps variable: true to just use the same name for the instance variable (with @ of course).
  • This also lets us punt on coming up with a good name 😀
  • The lambdafication of the "variable accessor builder" is a good step. We may want to just formalize an interface, though, because there's no reason we'd want to mix and match, e.g. a different type of accessor for set and get.

Nicely done reusing the plumbing for FieldVariableAccessor! I can give this a run through JVM assembly dumping to confirm it is inlining the access to the field directly.

@enebo
Copy link
Member

enebo commented Jan 24, 2022

Some thoughts from chat:

java_field, "java.lang.String => testfield", variable: :@testfield
I think :@testfield could possibly be variable: true if it is same name and variable: :test_field or whatnot if not
truthy eliminates needing to write the name twice
removing the @, to me at least, doesn't make it look like you can do other weird stuff like :$testfield

@byteit101
Copy link
Member Author

My current configuration thought is java_field "java.lang.String foo", variable: true for "basic" usage, and java_field "java.lang.Object foo", variable: {unwrap: true, type: com.MyClass} for "advanced" usage though I'm also debating bind_variable and bind instead of variable

@kares
Copy link
Member

kares commented Jan 25, 2022

variable: {unwrap: true, type: com.MyClass}

does it make sense to unwrap: false, type: com.MyClass ? if not than there should be just one option ...

@headius
Copy link
Member

headius commented Jan 25, 2022

java_field "java.lang.Object foo", variable: {unwrap: true, type: com.MyClass}

A similar question to @kares: why would you declare the variable as java.lang.Object but then also as com.MyClass? The latter is just for unwrapping, but isn't it going to determine that type anyway when it unwraps?

@byteit101
Copy link
Member Author

but isn't it going to determine that type anyway when it unwraps?
That's the argument to pass into toJava. The example above is kinda dumb, but short vs int vs long may be more useful.

I've updated it to be:

  • Default: java_field "java.lang.Object foo" # bind_variable: false
  • Auto Unwrap: java_field "java.lang.Object foo", bind_variable: true
  • No Unwrap: java_field "java.lang.Object foo", bind_variable: {to_java: false}
  • Unwrap with type: java_field "java.lang.Object foo", bind_variable: {to_java: com.MyClass}

I've also added some more tests for concrete vs ruby and unwrap vs no unwrap

As part of the concrete testing I had to expand the methodhandle stuff much more, but I think I'm ready for review now

@byteit101 byteit101 marked this pull request as ready for review January 29, 2022 08:06
@headius
Copy link
Member

headius commented Jan 31, 2022

A few general comments/questions as I start reviewing...

Default: java_field "java.lang.Object foo" # bind_variable: false

So this is like current behavior on a Java proxy type, where the field will be presented as a foo method that converts Ruby/Java both ways, yes (specifically things like String and numerics)?

Auto Unwrap: java_field "java.lang.Object foo", bind_variable: true

Something about this naming doesn't work for me. If we want to be explicit about it being an instance variable, why not "instance_variable" instead of "bind_variable"? This keyword is really about adjusting the behavior of the instance variable accesses, right?

If this is a more general purpose bind configuration then bind: true or binding: true or even variable_bind: true read better to me. Bikeshedding mostly.

@byteit101
Copy link
Member Author

So this is like current behavior on a Java proxy type, where the field will be presented as a foo method that converts Ruby/Java both ways, yes (specifically things like String and numerics)?

Correct, and those methods are always present (the configuration doesn't change that)

Copy link
Member

@headius headius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but we should chat through my very minor naming/encapsulation concerns.

@byteit101
Copy link
Member Author

Auto Unwrap: java_field "java.lang.Object foo", bind_variable: true

Something about this naming doesn't work for me. If we want to be explicit about it being an instance variable, why not "instance_variable" instead of "bind_variable"? This keyword is really about adjusting the behavior of the instance variable accesses, right?

If this is a more general purpose bind configuration then bind: true or binding: true or even variable_bind: true read better to me. Bikeshedding mostly.

I think we should get the naming right on the first go, so here is my reasoning:

  • Should have something to indicate this is to do with Ruby instance variables
  • Should indicate that this is for more advanced java integration of fields with ruby class syntax
    • We are constrained lexically by the positioning of this argument on a thing about java fields, whereas we want to tell the user what will happen to variables.
  • Should indicate that this may change the semantics of fields (see the notes on the list above)
  • Ergonomics should be reasonably short, though not too short. If the ergonomics are at an optimized level of clunky, they are enough to remember the basics, but not enough to remember the details, sending people to avoid unless more necessary, and sending people to the docs where we can warn them of the semantic issues above.
    It's that last point that made me favor bind, but link, map, or any other similar modifier works for me. I like instance_variable but I think it may be too unwieldy in the name department when combined with a modifier, but I'm not sure.

A few I thought of:

  • expose_as_variable
  • link_to_variable
  • variable_link
  • integrate_variable
  • as_variable
  • instance_variable_bind
  • variable_bind # I like this suggestion of yours better than my bind_variable

@headius
Copy link
Member

headius commented Feb 7, 2022

I am leaning toward instance_variable: because it reads well to me. "declare a Java field with type X and name foo, and make it accessible as an instance variable with these characteristics". I think "instance_variable" also makes sense since it's used in instance_variable_set, instance_variable_get and so on.

I'm also having a bit of trouble with the nested hash for the instance variable options. It makes sense semantically but it feels cumbersome.

Using your examples:

  • Default: java_field "java.lang.Object foo" # instance_variable: false ...variable will not be bound
  • Auto Unwrap: java_field "java.lang.Object foo", instance_variable: true ...variable bound with default conversion logic

Those two forms feel pretty good. With the additional options, it feels cleaner to just have them as additional keywords. "to_java" is pretty good, but "convert" might be more obvious? One problem with "to_java" to me is that it's really bi-directional, so it's both "to_java" and "to_ruby" depending on whether you set or get.

  • No Unwrap: java_field "java.lang.Object foo", instance_variable: true, convert: false ...variable will not convert to/from Java types
  • Unwrap with type: java_field "java.lang.Object foo", instance_variable: true, convert: com.MyClass ...variable will convert to target type com.MyClass.

@byteit101
Copy link
Member Author

I do like the parity with instance_variable_get and instance_variable_set, but I don't think it necessarily reads as "and make it accessible" to me without one of the action words I proposed above (bind, link, expose, etc).

Because this is rather advanced usage and does weird things to semantics, I'd definitely like something a) easily understandable without googling in case a someone runs into it and hasn't seen this form before, and b) easily googleable. Due to the infrequency of use by end users expected, I think we should optimize understanding for the novice reader over the expert writer here. That's one reason (in addition to those listed above) why I have preferred "verb-y" names thus far.

Taking this into account, I definitely would prefer something like instance_variable_bind: true or expose_via_instance_variable: true

I notice you are preferring verbless names so far, what's the reasoning behind that preference?

As for the other options, convert is fine by me, as is inline (non-nested hash). Another thought I had was store_as, but I think convert is probably fine. My only concern is that now they may read as an independent option.

@enebo
Copy link
Member

enebo commented Feb 15, 2022

I think adding a verb will become harder to remember than the word 'instance_variable'. java_field is a declarative set of statements about the field. One adjective here is it is also an instance_variable (or so it will appear so).

I cannot think of a better name that convert. I would like to see a concrete example of why convert needs to be anything other than true or false. I believe there is one but it would be nice to see a good example (all I am coming up with is field is a base type yet we decide to say it is a specific type with convert). We need this example if for no other reason then we need to document the feature.

@byteit101
Copy link
Member Author

I would like to see a concrete example of why convert needs to be anything other than true or false. I believe there is one but it would be nice to see a good example (all I am coming up with is field is a base type yet we decide to say it is a specific type with convert). We need this example if for no other reason then we need to document the feature.

The need for a specific type comes from toJava requiring an argument, and me not knowing any better way to say "make this a java thing, whatever it is". Under the hood, convert is only used as self.my_field = value.to_java(convert_argument_here)

Examples:

java_field "java.lang.Object myfield", ..., convert: false # default for object, stores the IRubyObject in myfield
java_field "my.Class myfield", ..., convert: my.Class # default for non-Object, stores the raw java instance in myfield

java_field "java.lang.Object myfield", ..., convert: my.Class # stores the raw java instance in myfield
java_field "java.lang.Numeric myfield", ..., convert: j.l.Float # store a float in the number, not an integer, long, or short

Without convert, any type of Object either a) won't be able to store a java instance or b) won't be able to store a ruby instance. by adding convert, it's configurable.
As a more concrete example, imagine a poorly written java JSON library, that reads fields of type object, but expects only one of String or HashMap. To interface with that, you want to store a string, and thus have to use java_field "java.lang.Object myfield", ..., convert: j.l.String`

Another important note: the convert argument only really matters if java (or any jvm bits) will be reading this field after Ruby had written to it. It doesn't matter for writing to it from the JVM. JavaFX (via JRubyFX) only writes, never reads fields, so convert is unnecessary there.

A summary of when convert: is useful:

  • When Java is reading the field after Ruby writes, and
  • The field type is j.l.Object, or
  • The field type is the parent of a class hierarchy of differing-semantic classes (j.l.Numeric), or
  • if the JRuby to_java itself has special logic (unsure)

I think that is all the reasons you would use convert:

As for the primary name, I'm not clear on which name you prefer? instance_variable plain? one of the verbed forms?

@enebo
Copy link
Member

enebo commented Feb 16, 2022

I guess I see this as three possible forms:

No conversion (you are just passing around the natural Java type of the field:

java_field "my.Address address", instance_variable: true, convert: false

Conversion [default and can be omitted] (we are passing a Ruby type around which knows how to convert to the Java type [rubyToJava]):

java_field "java.lang.String address", instance_variable: true, convert: true

Conversion but we want to specify what type we want to convert too (this seems like it could only be useful for a Ruby type which can be multiple Java classes/interfaces):

java_field "java.swing.JComponent component", instance_variable: true, convert: java.swing.JButton

This last example is what I don't understand. It is also possible I am missing something in the original proposal. I largely just see a specified type to convert but I do not really understand why convert would ever be different than the type declaration. If they are different then I am wondering what the behavior is if the Java access of this same type puts the base type but it will not match our convert type.

This is principally for making Java integration produce
more idomatic Ruby. This allows users to configure a class
to store specific ivars in a field of a reified class.
@byteit101 byteit101 changed the base branch from master to jruby-9.3 February 17, 2022 04:45
@byteit101
Copy link
Member Author

byteit101 commented Feb 17, 2022

I have updated the syntax to "intance_variable:" and "to_java", as per the recent matrix chat. I have also squashed and rebased this to 9.3 as per the same matrix chat. (original commits on orig_field_ivar_storage on byteit101 fork) That's all I have.

@enebo enebo added this to the JRuby 9.3.4.0 milestone Mar 16, 2022
@headius headius merged commit ed82382 into jruby:jruby-9.3 Mar 16, 2022
@headius
Copy link
Member

headius commented Mar 16, 2022

Merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants