Skip to content

Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv()#1476

Merged
bsideup merged 7 commits intodocker-java:masterfrom
tejksat:alexander.koshevoy/preventFromBlockingOnReadFromUDS
Nov 16, 2020
Merged

Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv()#1476
bsideup merged 7 commits intodocker-java:masterfrom
tejksat:alexander.koshevoy/preventFromBlockingOnReadFromUDS

Conversation

@tejksat
Copy link
Copy Markdown
Contributor

@tejksat tejksat commented Oct 26, 2020

The fix is trivial and it is located in com.github.dockerjava.okhttp.UnixDomainSocket.UnixSocketInputStream.read(byte[], int, int).

I have also fixed AttachContainerCmdIT, which used to fail for me from time to time because of the race condition. I believe the canonical way is to: (1) create the container, (2) attach to it, and then (3) start it. The last two actions have been previously shifted. This led to the race condition: the container might have finished its execution before the attach actually happened.

For me (with Docker for Mac 2.4.0.0), the issue is reproduced in com.github.dockerjava.cmd.AttachContainerCmdIT.attachContainerWithTTY test for OkHttp transport:

  1. Sporadically in master branch.

  2. Steadily after applying the PR Fix sporadic failures of AttachContainerCmdIT tests #1483 with the patch for AttachContainerCmdIT:
    The main thread waits for AttachContainerTestCallback to complete.

    "main@1" prio=5 tid=0x1 nid=NA waiting
      java.lang.Thread.State: WAITING
    	  at jdk.internal.misc.Unsafe.park(Unsafe.java:-1)
    	  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
    	  at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1079)
    	  at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1369)
    	  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:278)
    	  at com.github.dockerjava.api.async.ResultCallbackTemplate.awaitCompletion(ResultCallbackTemplate.java:111)
    	  at com.github.dockerjava.cmd.AttachContainerCmdIT.attachContainerWithTTY(AttachContainerCmdIT.java:169)
    	  ...
    	  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
    

    And docker-java-stream thread is stuck on recv() when trying to process HTTP response headers of attach command (while it should read the standard streams of the Docker container):

    "docker-java-stream-1410725528" #28 daemon prio=5 os_prio=31 cpu=0.82ms elapsed=1.92s tid=0x00007f8fb7115000 nid=0x9f07 runnable  [0x000070000c28b000]
       java.lang.Thread.State: RUNNABLE
    	at com.github.dockerjava.okhttp.UnixDomainSocket.recv(Native Method)
    	at com.github.dockerjava.okhttp.UnixDomainSocket$UnixSocketInputStream.read(UnixDomainSocket.java:240)
    	at java.io.FilterInputStream.read(java.base@11.0.8/FilterInputStream.java:133)
    	at com.github.dockerjava.okhttp.UnixSocketFactory$1$1.read(UnixSocketFactory.java:45)
    	at okio.Okio$2.read(Okio.java:140)
    	at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
    	at okio.RealBufferedSource.indexOf(RealBufferedSource.java:358)
    	at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:230)
    	at okhttp3.internal.http1.Http1ExchangeCodec.readHeaderLine(Http1ExchangeCodec.java:242)
    	at okhttp3.internal.http1.Http1ExchangeCodec.readHeaders(Http1ExchangeCodec.java:251)
    	at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.java:219)
    	at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.java:115)
    	at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:94)
    	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
    	at com.github.dockerjava.okhttp.HijackingInterceptor.intercept(HijackingInterceptor.java:20)
    	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
    	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:43)
    	...
    	at okhttp3.RealCall.execute(RealCall.java:81)
    	at com.github.dockerjava.okhttp.OkDockerHttpClient$OkResponse.<init>(OkDockerHttpClient.java:251)
    	at com.github.dockerjava.okhttp.OkDockerHttpClient.execute(OkDockerHttpClient.java:225)
    	at com.github.dockerjava.cmd.TrackingDockerHttpClient.execute(TrackingDockerHttpClient.java:27)
    	at com.github.dockerjava.core.DefaultInvocationBuilder.execute(DefaultInvocationBuilder.java:228)
    	at com.github.dockerjava.core.DefaultInvocationBuilder.lambda$executeAndStream$1(DefaultInvocationBuilder.java:269)
    	at com.github.dockerjava.core.DefaultInvocationBuilder$$Lambda$97/0x00000008002c1040.run(Unknown Source)
    	at java.lang.Thread.run(java.base@11.0.8/Thread.java:834)
    
    

@tejksat tejksat requested a review from bsideup October 27, 2020 14:34
@tejksat
Copy link
Copy Markdown
Contributor Author

tejksat commented Oct 30, 2020

@bsideup is anything significant missing in the PR? Should I improve it?

…ttp transport) (docker-java#1475)

Prevent unnecessary recurring calls of `recv()` on UNIX socket, which sometimes results in blocking read and causes hanging of "docker-java-stream" thread.
…docker-java#1475)

Prevent unnecessary recurring calls of `read()` on UNIX socket, which may result in blocking read and cause hanging of "docker-java-stream" thread.
@tejksat tejksat force-pushed the alexander.koshevoy/preventFromBlockingOnReadFromUDS branch from e137293 to f648421 Compare November 3, 2020 08:31
@tejksat
Copy link
Copy Markdown
Contributor Author

tejksat commented Nov 3, 2020

I've extracted the changes for AttachContainerCmdIT in the separate PR: #1483. I've added the fix for the similar method in HttpClient transport com.github.dockerjava.httpclient5.UnixDomainSocket.UnixSocketInputStream.read(byte[], int, int).

Due to different implementations of processing HTTP responses in OkHttp and HttpClient transport (different flows when working with the buffer — using okio.RealBufferedSource vs non-cyclic buffer new byte[1024] in FramedInputStreamConsumer.accept(DockerHttpClient.Response)) the issue is not reproduced for HttpClient on the same tests in AttachContainerCmdIT.

@tejksat tejksat changed the title Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv() (OkHttp transport) Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv() (OkHttp and HttpClient transports) Nov 3, 2020
@bsideup bsideup added this to the next milestone Nov 10, 2020
@wendigo
Copy link
Copy Markdown

wendigo commented Nov 13, 2020

@bsideup I believe that we (@prestosql) were hit by this bug after updating to testcontainers 1.15.0. Is there a chance that it will be merged soon and release will follow?

@bsideup bsideup changed the title Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv() (OkHttp and HttpClient transports) Fix "docker-java-stream" thread stuck at UnixDomainSocket.recv() Nov 16, 2020
@bsideup bsideup merged commit d85ea79 into docker-java:master Nov 16, 2020
@bsideup
Copy link
Copy Markdown
Member

bsideup commented Nov 16, 2020

@wendigo it is merged now and will be released soon. After that, we will update our version in TC and release it as well (since we shade docker-java (except docker-java-api), it needs to be released and cannot be solved with a transitive dependency update

@wendigo
Copy link
Copy Markdown

wendigo commented Nov 16, 2020

@bsideup if you'd be able to release SNAPSHOT version of TC I can test if that solves our problem as well

@cdancy
Copy link
Copy Markdown
Contributor

cdancy commented Nov 16, 2020

@bsideup can you comment back here when a new version has been released? Seeing similar issues and we think this might solve the problem.

@HenningCash
Copy link
Copy Markdown

You can use jitpack.io to test the master branch:
https://jitpack.io/#docker-java/docker-java/master-SNAPSHOT

@tejksat
Copy link
Copy Markdown
Contributor Author

tejksat commented Nov 25, 2020

We have switched to the latest 3.2.6 version. So far so good.

@bsideup thank you for the release!

wendigo added a commit to wendigo/trino that referenced this pull request Dec 22, 2020
This version includes fix for "docker-java-stream" thread stuck at UnixDomainSocket.recv():
docker-java/docker-java#1476
losipiuk pushed a commit to trinodb/trino that referenced this pull request Dec 22, 2020
This version includes fix for "docker-java-stream" thread stuck at UnixDomainSocket.recv():
docker-java/docker-java#1476
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants