Description
Under jruby-1.7.6, I observed multiple threads spinning in this state:
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
- locked <0x00000000e495c940> (a java.lang.Object)
at org.jruby.ext.openssl.SSLSocket.readAndUnwrap(SSLSocket.java:516)
at org.jruby.ext.openssl.SSLSocket.read(SSLSocket.java:504)
at org.jruby.ext.openssl.SSLSocket.do_sysread(SSLSocket.java:616)
at org.jruby.ext.openssl.SSLSocket.sysread(SSLSocket.java:634)
Using a Java debugger, I found that the SocketChannelImpl object is in non-blocking mode (its "blocking" flag is false), while org.jruby.ext.openssl.SSLSocket.readAndUnwrap() is called with its blocking parameter set to true.
Consider the current implementation of readAndUnwrap():
private int readAndUnwrap(boolean blocking) throws IOException {
    int bytesRead = getSocketChannel().read(peerNetData);
    if (bytesRead == -1) {
        if (!peerNetData.hasRemaining() || (status == SSLEngineResult.Status.BUFFER_UNDERFLOW)) {
            closeInbound();
            return -1;
        }
        // inbound channel has been already closed but closeInbound() must
        // be defered till the last engine.unwrap() call.
        // peerNetData could not be empty.
    }
Because the SocketChannel is in non-blocking mode, the read returns immediately with bytesRead == 0 when no data is available, yet processing does not fail in any way. Instead, readAndUnwrap() returns 0, which in turn leaves rr == 0 in this code in org.jruby.ext.openssl.SSLSocket.do_sysread():
// ensure >0 bytes read; sysread is blocking read.
while (rr <= 0) {
    if (engine == null) {
        rr = getSocketChannel().read(dst);
    } else {
        rr = read(dst, blocking);
    }
    if (rr == -1) {
        throw getRuntime().newEOFError();
    }
}
Thus, this while loop spins forever, since every iteration returns 0 without waiting for data. The comment says that sysread is a blocking read, but in this state it is not.
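The underlying behaviour can be reproduced outside JRuby. The following is a minimal sketch, not code from jruby-openssl: the class name NonBlockingReadDemo and the method countEmptyReads are made up for illustration. It connects a SocketChannel to a loopback server that never writes, switches the channel to non-blocking mode, and counts how many reads return 0. The loop is bounded here; an unbounded `while (rr <= 0)` loop over the same channel would never terminate.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical demo class; not part of JRuby or jruby-openssl.
class NonBlockingReadDemo {

    // Returns how many of `attempts` reads on an idle, non-blocking
    // channel returned 0 (no data available, but no error either).
    static int countEmptyReads(int attempts) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Bind to an ephemeral port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel peer = server.accept()) {
                // The peer never writes, so the client has nothing to read.
                client.configureBlocking(false);
                ByteBuffer dst = ByteBuffer.allocate(16);
                int emptyReads = 0;
                for (int i = 0; i < attempts; i++) {
                    // In blocking mode this call would wait for data;
                    // in non-blocking mode it returns 0 immediately.
                    if (client.read(dst) == 0) {
                        emptyReads++;
                    }
                }
                return emptyReads;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("reads that returned 0: " + countEmptyReads(1000));
    }
}
```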
So either a blocking SocketChannel becomes non-blocking under some circumstances, or it is non-blocking from the start.
I'd recommend catching these bugs by checking whether the SocketChannel is actually blocking when it is expected to be blocking, and throwing an exception if the check fails. A quick workaround would be to force the SocketChannel into blocking mode whenever it is expected to be blocking. However, this hides the bug and may not solve it completely, since there may be some interfering code that actually sets the SocketChannel to non-blocking mode.
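A rough sketch of the suggested check, assuming it would sit in front of the blocking read path. The class BlockingReadGuard and its methods are hypothetical names, not existing jruby-openssl code. It uses only the standard SelectableChannel.isBlocking() query and the standard IllegalBlockingModeException.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.SocketChannel;

// Hypothetical helper; not part of JRuby or jruby-openssl.
class BlockingReadGuard {

    // Fails fast if a channel that is expected to block is non-blocking.
    static void requireBlocking(SocketChannel channel) {
        if (!channel.isBlocking()) {
            throw new IllegalBlockingModeException();
        }
    }

    // Performs a read that is only allowed on a blocking channel, so a
    // return value of 0 cannot come from non-blocking mode.
    static int blockingRead(SocketChannel channel, ByteBuffer dst) throws IOException {
        requireBlocking(channel);
        return channel.read(dst);
    }

    public static void main(String[] args) throws IOException {
        try (SocketChannel channel = SocketChannel.open()) {
            requireBlocking(channel); // passes: new channels start in blocking mode
            channel.configureBlocking(false);
            try {
                requireBlocking(channel);
            } catch (IllegalBlockingModeException e) {
                System.out.println("non-blocking channel rejected");
            }
        }
    }
}
```

The workaround variant would call channel.configureBlocking(true) instead of throwing, with the caveat already noted that this only masks whatever code switched the mode.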