jruby-openssl spins endlessly trying to read from a non-blocking SocketChannel while no data is available. #1280

@xb

Under jruby-1.7.6, I observed multiple threads spinning in this state:

   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:197)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
        - locked <0x00000000e495c940> (a java.lang.Object)
        at org.jruby.ext.openssl.SSLSocket.readAndUnwrap(SSLSocket.java:516)
        at org.jruby.ext.openssl.SSLSocket.read(SSLSocket.java:504)
        at org.jruby.ext.openssl.SSLSocket.do_sysread(SSLSocket.java:616)
        at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:634)

Using a Java debugger, I found that the SocketChannelImpl object is in non-blocking mode (its "blocking" flag is false), while the method org.jruby.ext.openssl.SSLSocket.readAndUnwrap() is called with the blocking parameter set to true.

When considering the current implementation of readAndUnwrap()

    private int readAndUnwrap(boolean blocking) throws IOException {
        int bytesRead = getSocketChannel().read(peerNetData);
        if (bytesRead == -1) {
            if (!peerNetData.hasRemaining() || (status == SSLEngineResult.Status.BUFFER_UNDERFLOW)) {
                closeInbound();
                return -1;
            }
            // inbound channel has been already closed but closeInbound() must
            // be defered till the last engine.unwrap() call.
            // peerNetData could not be empty.
        }

it is clear that bytesRead == 0 when no data is available (because the SocketChannel is in non-blocking mode), but this case is not treated as an error or handled in any other way. Instead, readAndUnwrap() returns 0, which in turn results in rr == 0 in this code (in org.jruby.ext.openssl.SSLSocket.do_sysread()):

            // ensure >0 bytes read; sysread is blocking read.
            while (rr <= 0) {
                if (engine == null) {
                    rr = getSocketChannel().read(dst);
                } else {
                    rr = read(dst, blocking);
                }
                if (rr == -1) {
                    throw getRuntime().newEOFError();
                }
            }

Thus, this while loop busy-spins for as long as no data arrives. The comment says that sysread is a blocking read, but on a non-blocking channel it is not.
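To illustrate why the loop never makes progress, here is a minimal standalone sketch (not jruby-openssl code; the class and method names are made up for this example). It connects to a local server that never writes anything and reads once from the client in non-blocking mode. Under these assumptions the read returns 0 immediately instead of waiting, which is exactly the value the while loop above keeps retrying on.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class, not part of jruby-openssl.
final class NonBlockingReadDemo {

    // Connects to a loopback server that never sends data, switches the
    // client channel to non-blocking mode, and performs a single read.
    static int readOnceNonBlocking() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                client.configureBlocking(false);
                ByteBuffer buf = ByteBuffer.allocate(1024);
                // No data is pending and the peer is still open,
                // so this returns 0 rather than blocking or returning -1.
                return client.read(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytesRead = " + readOnceNonBlocking());
    }
}
```

A caller that loops on `rr <= 0` around such a read burns CPU until the peer sends something or closes the connection.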

So either a SocketChannel that was blocking is switched to non-blocking mode at some point, or it is non-blocking from the start.

I'd recommend catching these bugs by checking whether the SocketChannel is actually blocking when it is expected to be, and throwing an exception if the check fails. A quick workaround would be to force the SocketChannel into blocking mode whenever blocking is expected. However, this hides the bug and may not solve it completely, since there may be some interfering code which actually sets the SocketChannel to non-blocking mode.
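The kind of check I have in mind could look roughly like the sketch below. `BlockingModeGuard` and `requireBlocking` are hypothetical names, not existing jruby-openssl API, and where exactly such a check would be called from SSLSocket is left open. It only relies on `SelectableChannel.isBlocking()` and the standard `IllegalBlockingModeException`.

```java
import java.io.IOException;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.SocketChannel;

// Hypothetical helper, not part of jruby-openssl.
final class BlockingModeGuard {

    // Fails fast if the caller expects a blocking read but the channel
    // has been put into non-blocking mode.
    static void requireBlocking(SocketChannel channel, boolean blockingExpected) {
        if (blockingExpected && !channel.isBlocking()) {
            throw new IllegalBlockingModeException();
        }
    }

    public static void main(String[] args) throws IOException {
        try (SocketChannel ch = SocketChannel.open()) {
            requireBlocking(ch, true);   // new channels are blocking, so this passes
            ch.configureBlocking(false);
            try {
                requireBlocking(ch, true);
            } catch (IllegalBlockingModeException e) {
                System.out.println("mismatch detected: channel is non-blocking");
            }
        }
    }
}
```

Throwing here turns the silent busy loop into a visible failure, which should make it easier to find whatever code switches the channel mode.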
