gh-90622: Do not spawn ProcessPool workers on demand via fork method.#91598
Conversation
…spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process.
|
cc @tomMoral for information and opinions |
|
When looking at the original bug report, I don't see why the deadlock would be linked to bad interaction between fork and threads. When I played with Maybe there is something that I am missing in the context of using That being said, when looking at the commit raised by bisect, I would be prone to suspect that something goes wrong with the feature introduced in #19453 and not with the from concurrent import futures
import faulthandler
import sys
def work(iteration, item):
sys.stdout.write(f'working: iteration={iteration}, item={item}\n')
sys.stdout.flush()
for i in range(0, 10000):
faulthandler.dump_traceback_later(20, exit=True)
with futures.ProcessPoolExecutor(max_workers=2) as executor:
executor.submit(work, i, 1)
executor.submit(work, i, 2)
faulthandler.cancel_dump_traceback_later()This should give a traceback that could help understand if the deadlock is really due to |
tcmalloc probably uses a background thread and doesn't check that the thread disappeared. This is a common problem with C libraries. |
If this was such a mechanism that is causing the deadlock, this should be deterministic as the C-level threads are never propagated. As the deadlock occurs randomly, I think this should have to do with some concurrency issue. |
|
Ah, then it's possible that the fork happens while a lock is held by another thread in the parent... |
Yep. This is the fundamental reason One reason some people wrongly assume "I use fork() in a threaded program and I never have problems" implies that it works is that glibc's malloc intentionally does not use locks (or if it does, it actively manages them during fork) because they aim for a level of legacy compatibility that includes attempting to support existing programs doing bad things that POSIX specs do not actually allow. glibc has a relatively poor malloc implementation for threaded process performance as a result. Thus tcmalloc, jemalloc, mimalloc, etc. all existing and being adopted by people serious about performance. So whenever you see threads knowingly existing and fork being used, there is a problem. Regardless of if you can reproduce it yourself. </preacher mode> |
Adds a comment describing why the lack of assertion means we still have a potential bug.
|
Thanks @gpshead for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10, 3.11. |
|
Sorry, @gpshead, I could not cleanly backport this to |
…ethod. (pythonGH-91598) Do not spawn ProcessPool workers on demand when they spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process. (cherry picked from commit ebb37fc) Co-authored-by: Gregory P. Smith <greg@krypto.org>
|
GH-92495 is a backport of this pull request to the 3.11 branch. |
|
Sorry @gpshead, I had trouble checking out the |
… fork method. (pythonGH-91598) Do not spawn ProcessPool workers on demand when they spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process.. (cherry picked from commit ebb37fc) Co-authored-by: Gregory P. Smith <greg@krypto.org>
|
GH-92497 is a backport of this pull request to the 3.10 branch. |
… fork method. (pythonGH-91598) (pythonGH-92497) Do not spawn ProcessPool workers on demand when they spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process.. (cherry picked from commit ebb37fc) Co-authored-by: Gregory P. Smith <greg@krypto.org> (cherry picked from commit b795376) Co-authored-by: Gregory P. Smith <greg@krypto.org>
…method. (GH-91598) (GH-92497) (#92499) Do not spawn ProcessPool workers on demand when they spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process.. (cherry picked from commit ebb37fc) Co-authored-by: Gregory P. Smith <greg@krypto.org> (cherry picked from commit b795376) Co-authored-by: Gregory P. Smith <greg@krypto.org> Co-authored-by: Gregory P. Smith <greg@krypto.org>
… fork method. (pythonGH-91598) (pythonGH-92497) (python#92499) Do not spawn ProcessPool workers on demand when they spawn via fork. This avoids potential deadlocks in the child processes due to forking from a multithreaded process.. (cherry picked from commit ebb37fc) Co-authored-by: Gregory P. Smith <greg@krypto.org> (cherry picked from commit b795376) Co-authored-by: Gregory P. Smith <greg@krypto.org> Co-authored-by: Gregory P. Smith <greg@krypto.org>
This avoids potential deadlocks in the child processes due to forking from
a multithreaded process.
This is the bugfix only PR suitable for backporting (manual intervention likely still required). See #91587 for the PR fixing the new 3.11 feature that builds on this. I expect to merge that one first and rebase this on top of it.