Skip to content

Provide direct byte[] read and write for RubyIO#6590

Merged
headius merged 12 commits intojruby:masterfrom
headius:io_bytes_readwrite
Mar 2, 2021
Merged

Provide direct byte[] read and write for RubyIO#6590
headius merged 12 commits intojruby:masterfrom
headius:io_bytes_readwrite

Conversation

@headius
Copy link
Member

@headius headius commented Mar 2, 2021

This PR tracks work to address #6589 by adding allocation-free RubyIO read and write paths.

Several Java-based consumers of RubyIO write to it without having
a RubyString in hand, by wrapping incoming byte[] or ByteList.
This patch adds a write path that can accept unwrapped byte[] plus
encoding to reduce allocation and follow a fast path.

Users that hit this logic, mostly via IOOutputStream:

* The Psych ext when dumping to an IO
* For stdout and stderr streams provided by Ruby.getError/OutputStream
* By Marshal for dumping to a target IO or IO-like
* By GzipWriter for writing to a stream
* Anyone that calls to_outputstream on an IO

Part of work for jruby#6589
@headius headius added this to the JRuby 9.3.0.0 milestone Mar 2, 2021
@headius headius requested review from enebo and kares March 2, 2021 03:34
@headius
Copy link
Member Author

headius commented Mar 2, 2021

Note that due to the complexity of passing more than one "out" parameter, IO writes that require transcoding will still allocate a ByteList, but the initial wasted wrapper is not created in any case.

headius added 4 commits March 1, 2021 21:51
None of the consumers use the return value, and this path is
intended to simulate OutputStream.write, which does not return the
number of bytes written, so skip the creation or retrieval of a
Fixnum and just return the int.
Copy link
Member

@kares kares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇 these will be very useful.
have couple overload suggestions that would be nice to support as a Java API

@headius
Copy link
Member Author

headius commented Mar 2, 2021

@kares Thank you for the review! Yeah I think there are more places we could be using these, including in extensions. @jonathanswenson commented on Matrix that he had attempted to modify Puma to use more direct IO calls but without this PR it could only get so far. This also came to my attention because Psych uses this stream abstraction in order to let SnakeYAML dump directly to an IO or IO-like, the former of which should be much more efficient now (and the latter may be worth special-casing common IO-likes like StringIO).

I will add the suggested API enhancements and work on adding read support today. Plus benchmarks.

headius added 3 commits March 2, 2021 10:54
Status is set heavily for any blocking operations so make this
lighter weight.
This eliminates allocation of the struct-like object for every
write operation, which should improve the performance of all IO
writes.
These mimic OutputStream except for requiring an Encoding be
passed in. Encoding could be made optional depending on how we
want to handle jruby#6588.
@headius
Copy link
Member Author

headius commented Mar 2, 2021

With this latest push, IO implements context-free versions of the write methods (four overloads added with recommendations to call the context-receiving versions) and there should be zero allocation along the fast path write (with no transcoding).

@headius
Copy link
Member Author

headius commented Mar 2, 2021

Getting performance metrics on this from Ruby is difficult, because most of the key consumers (Psych, Marshal, GzipWriter) have more overhead of their own. However an allocation profile is interesting. Note the drastic reduction in byte[] and ByteList allocation overall.

Script:

require 'tempfile'

io = Tempfile.new('bench_marshal_dump')

ary = ["foobar"] * 10000

n = 30
while n > 0
  Marshal.dump(ary, io)
  io.seek(IO::SEEK_SET, 0)
  n-=1
end

Before:

          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes  objs trace name
    1 28.47% 28.47%  14650408 366248  26851968 671287 300000 org.jruby.util.ByteList
    2 27.88% 56.35%  14346288 358645  24680088 616990 300000 org.jruby.RubyString
    3 18.23% 74.58%   9382496 367510  22223920 681015 300000 byte[]
    4  2.33% 76.91%   1198472 18274   8555048 117271 300000 char[]
    5  1.02% 77.93%    523600 16347    621648 19411 300000 java.util.concurrent.ConcurrentHashMap$Node
    6  0.85% 78.78%    437616 18213   1695936 70643 300000 java.lang.String
    7  0.67% 79.45%    344384 6956   1605896 31868 300000 java.lang.Object[]
    8  0.53% 79.98%    274680 1558   1248408 24813 300000 int[]
    9  0.44% 80.43%    227632 7098    234128  7301 300000 java.lang.ref.WeakReference
   10  0.39% 80.82%    201744 6289    501104 15644 300000 java.util.HashMap$Node

After:

          percent          live          alloc'ed  stack class
 rank   self  accum     bytes objs     bytes  objs trace name
    1  8.14%  8.14%   1192440 18217   8565008 117414 300000 char[]
    2  6.05% 14.19%    885136 13972   7830192 80777 300000 byte[]
    3  3.58% 17.77%    524016 16360    621808 19416 300000 java.util.concurrent.ConcurrentHashMap$Node
    4  3.47% 21.24%    508448 12699   2837568 70927 300000 org.jruby.util.ByteList
    5  2.99% 24.23%    437232 18197   1698336 70743 300000 java.lang.String
    6  2.32% 26.55%    339344 6936   1607816 31922 300000 java.lang.Object[]
    7  1.87% 28.42%    274440 1551   1249416 24832 300000 int[]
    8  1.55% 29.97%    227504 7094    234000  7297 300000 java.lang.ref.WeakReference
    9  1.40% 31.37%    204288 5095  24676488 616900 300000 org.jruby.RubyString
   10  1.37% 32.74%    201104 6269    500848 15636 300000 java.util.HashMap$Node

Copy link
Member

@enebo enebo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not compare the new methods you ported but I trust you nailed it :)

It is for 9.3 too so it does not have the same requirements as a backport (like changing a field type).

headius added 4 commits March 2, 2021 14:52
* Factor out real IO calc in IOOutputStream constructor
* Revert to RubyThread.getStatus to preserve native status
* Clean up some C-style declarations
@headius headius marked this pull request as ready for review March 2, 2021 21:49
@headius headius merged commit 80777de into jruby:master Mar 2, 2021
@headius headius deleted the io_bytes_readwrite branch March 2, 2021 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants