Bug report
Bug description:
Currently, all blocking operations in concurrent.interpreters.Queue work via polling. In general, this approach is acceptable for an initial implementation (unfortunately, polling is a chronic affliction of the asynchronous world), but it suffers from one very unpleasant performance issue. Namely, the maximum number of operations per second is effectively bounded by the chosen polling delay:
>>> # `concurrent.interpreters.Queue`: a simple echo benchmark
>>> from concurrent import interpreters
>>> from timeit import timeit
>>> iterations = 100
>>> in_q = interpreters.create_queue()
>>> out_q = interpreters.create_queue()
>>> out_q.put(None)
>>> def echo(in_q, out_q):
...     while True:
...         out_q.put(in_q.get())
>>> interpreters.create().call_in_thread(echo, out_q, in_q)
>>> iterations / timeit(lambda: out_q.put(in_q.get()), number=iterations)
61.944790966792546  # OPS, >10 milliseconds per operation (`_delay=10 / 1000`)

The delay is set by the undocumented `_delay` parameter and is essentially a compromise between CPU load and response latency (which I also described in a comment on one PR). Neither the `queue` module's queues nor `multiprocessing` queues use polling, and they therefore achieve much higher throughput:
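To make the latency bound concrete, here is a rough sketch of what a polling-based blocking get looks like. This is an illustration, not CPython's actual implementation; the name `polling_get` and the use of `queue.Empty` are mine. The point is that every wait costs up to `_delay` of pure latency, capping throughput at roughly `1 / _delay` operations per second regardless of how fast the peer responds:

```python
import queue
import time


def polling_get(q, timeout=None, _delay=10 / 1000):
    # Hypothetical polling loop: repeatedly attempt a non-blocking
    # get, sleeping `_delay` seconds between attempts. If the item
    # arrives right after an attempt, we still sleep the full delay
    # before noticing it -- hence the ~1 / _delay OPS ceiling.
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return q.get_nowait()
        except queue.Empty:
            pass
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError
        time.sleep(_delay)
```

With the default delay of 10 ms, two peers ping-ponging through such a loop cannot exceed about 100 round trips per second, which matches the ~62 OPS measured above.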
>>> # `queue.SimpleQueue`: a simple echo benchmark
>>> from queue import SimpleQueue
>>> from threading import Thread
>>> from timeit import timeit
>>> iterations = 100_000
>>> in_q = SimpleQueue()
>>> out_q = SimpleQueue()
>>> out_q.put(None)
>>> def echo(in_q, out_q):
...     while True:
...         out_q.put(in_q.get())
>>> Thread(target=echo, args=[out_q, in_q]).start()
>>> iterations / timeit(lambda: out_q.put(in_q.get()), number=iterations)
154248.64387142067  # OPS, <10 microseconds per operation

>>> # `queue.Queue`: a simple echo benchmark
>>> from queue import Queue
>>> from threading import Thread
>>> from timeit import timeit
>>> iterations = 100_000
>>> in_q = Queue()
>>> out_q = Queue()
>>> out_q.put(None)
>>> def echo(in_q, out_q):
...     while True:
...         out_q.put(in_q.get())
>>> Thread(target=echo, args=[out_q, in_q]).start()
>>> iterations / timeit(lambda: out_q.put(in_q.get()), number=iterations)
37413.69358798779  # OPS, <50 microseconds per operation

>>> # `multiprocessing.Queue`: a simple echo benchmark
>>> from multiprocessing import Process, Queue, set_start_method
>>> from timeit import timeit
>>> set_start_method("fork")
>>> iterations = 10_000
>>> in_q = Queue()
>>> out_q = Queue()
>>> out_q.put(None)
>>> def echo(in_q, out_q):
...     while True:
...         out_q.put(in_q.get())
>>> Process(target=echo, args=[out_q, in_q]).start()
>>> iterations / timeit(lambda: out_q.put(in_q.get()), number=iterations)
13411.797245613465  # OPS, <100 microseconds per operation

I marked this as a bug, since such low performance can hardly be considered expected behavior for a lightweight alternative to multiprocessing. It would be much better to implement truly blocking behavior.
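For comparison, this is the kind of event-driven blocking the faster queues rely on: a condition variable puts the getter to sleep until a putter notifies it, so wakeup latency is bounded only by the scheduler, not by a fixed delay. This is a minimal single-interpreter sketch of the idea (the class name is mine); a real fix for interpreters.Queue would additionally need a cross-interpreter wakeup primitive:

```python
import threading
from collections import deque


class BlockingQueue:
    """Sketch of a non-polling queue: get() sleeps on a condition
    variable and is woken directly by put(), with no fixed delay."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one waiting getter immediately

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()  # no polling: blocked until notify()
            return self._items.popleft()
```

This is essentially what queue.Queue does internally, which is why its echo benchmark runs orders of magnitude faster than the polling-based one.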
CPython versions tested on:
3.14
Operating systems tested on:
Linux