Skip to content

Use pure-Ruby Fiber-based Enumerator#next#5672

Merged
headius merged 9 commits intojruby:masterfrom
headius:fiber_enumerator_next
Apr 12, 2019
Merged

Use pure-Ruby Fiber-based Enumerator#next#5672
headius merged 9 commits intojruby:masterfrom
headius:fiber_enumerator_next

Conversation

@headius
Copy link
Member

@headius headius commented Mar 30, 2019

This code has been passed around multiple implementations as a
pure-Ruby version of Enumerator#next that just uses fibers. We
have had many issues with the separate fiber-like impl we used
for Enumerator#next, and the logical path forward is to unify that
with Fiber via this code.

There are a few caveats to this migration:

  • Three new introduced failures appear to be shared with other
    impls using this logic (rbx, truffleruby).
  • This commit breaks the fast-path Array logic for Enumerator#next
    so it's always going through a thread. However the protocol here
    would allow us to implement #to_generator on any object, which
    should make it possible to short-circuit more easily (in Ruby!).
  • I have not measured performance. However the logic surrounding
    the Fiber calls should not be particularly heavy.
  • I have not vetted the cleanup side of this, except to confirm
    that this implementation seems about as good at Thread leak with the biz gem #5671
    in that it does appear to GC, but for a high-churn benchmark it
    eventually collapses without explicit GCs. So...no worse?

This code is modified from TruffleRuby's copy of Rubinius's code,
with Rubinius's license being very liberal and TruffleRuby's being
identical to JRuby's.

I am proposing this for 9.2.7 because it seems like mostly a lateral move, although there were some tag changes. There are at least two todos before release:

  • Green build obviously.
  • Implement at least a few "fast" generators for core collections like Array and Hash.

This code has been passed around multiple implementations as a
pure-Ruby version of Enumerator#next that just uses fibers. We
have had many issues with the separate fiber-like impl we used
for Enumerator#next, and the logical path forward is to unify that
with Fiber via this code.

There are a few caveats to this migration:

* Three new introduced failures appear to be shared with other
  impls using this logic (rbx, truffleruby).
* This commit breaks the fast-path Array logic for Enumerator#next
  so it's always going through a thread. However the protocol here
  would allow us to implement #to_generator on any object, which
  should make it possible to short-circuit more easily (in Ruby!).
* I have not measured performance. However the logic surrounding
  the Fiber calls should not be particularly heavy.
* I have not vetted the cleanup side of this, except to confirm
  that this implementation seems about as good at jruby#5671
  in that it does appear to GC, but for a high-churn benchmark it
  eventually collapses without explicit GCs. So...no worse?

This code is modified from TruffleRuby's copy of Rubinius's code,
with Rubinius's license being very liberal and TruffleRuby's being
identical to JRuby's.
@headius headius added this to the JRuby 9.2.7.0 milestone Mar 30, 2019
@headius headius requested review from enebo, kares and lopex March 30, 2019 00:49
guizmaii referenced this pull request in Colisweb/AdoptJRuby Apr 3, 2019
 - Use my repo instead of the headius one.
 - build Jruby with `./mvnw` instead of `mvn package -Pbootstrap`
 - clean
guizmaii referenced this pull request in Colisweb/AdoptJRuby Apr 3, 2019
guizmaii referenced this pull request in Colisweb/AdoptJRuby Apr 3, 2019
guizmaii referenced this pull request in Colisweb/AdoptJRuby Apr 3, 2019
headius added 4 commits April 11, 2019 18:27
The original implementation was based on Enumerator code from
Rubinius. However in Rubinius, the Fiber thread does not root
objects any deeper than to the Fiber object itself. In order to
make this not leak threads, all state shared between the Fiber
and the Enumerator is passed in a separate State object.

This also includes minor tweaks, like fixing the to_generator
logic (needed for efficient non-fiber generators), and using the
Fiber.yield method directly as the body of the iteration, so no
intermediate block state is necessary. This last change should
reduce the cost of the iteration loop close to what it was in the
Java-based Enumerator#next.
* Fix generator result
* Fix peek_values to return a new array each time
* Call finalize on still-alive fibers when rewinding, to attempt
  to eagerly kill them.
@headius headius merged commit 8e1b5e4 into jruby:master Apr 12, 2019
@headius headius deleted the fiber_enumerator_next branch April 12, 2019 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants