bpo-31540: Add context management for concurrent.futures.ProcessPoolExecutor #3682
pitrou merged 9 commits into python:master from
Conversation
tomMoral/loky#48
- Add context argument to allow non-forking ProcessPoolExecutor
- Do some cleaning (pep8 + unused code + naming)
- Liberate the resource earlier in `_worker_process`
f2f41e0 to 6376291
    result_queue.put(_ResultItem(call_item.work_id,
                                 result=r))

    # Liberate the resource as soon as possible, to avoid holding onto
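The early-free behavior discussed in this hunk can be illustrated with a small standalone sketch (the `Payload` class and `worker_step` helper are hypothetical, not code from this patch): dropping the last reference to a task's payload lets CPython reclaim it immediately, instead of keeping it alive while the worker waits for its next task.

```python
import weakref

class Payload:
    """Stand-in for a (potentially large) argument sent to a worker."""

def worker_step(item):
    # Compute a result, then drop the local reference right away so the
    # payload does not stay alive while the worker idles between tasks.
    result = id(item)
    del item
    return result

payload = Payload()
ref = weakref.ref(payload)
worker_step(payload)
del payload  # drop the caller's reference too
print(ref() is None)  # True: with CPython refcounting the payload is gone
```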
Would it be easy to add a test for this?
I added a test for this behavior.
Lib/concurrent/futures/process.py
Outdated
    max_workers: The maximum number of processes that can be used to
        execute the given calls. If None or not given then as many
        worker processes will be created as the machine has processors.
    context: A multiprocessing context to launch the workers. This
As a nit, I think calling this parameter `mp_context` would be a bit more explicit.
Lib/test/test_concurrent_futures.py
Outdated
    t = {executor_type}(5)
    t.submit(sleep_and_print, 1.0, "apple")
    if __name__ == "__main__":
        t = {executor_type}(5)
Perhaps we want to pass the right context argument here?
    script:
    # Skip tests that re-run the entire test suite.
    - ./venv/bin/python -m coverage run --pylib -m test --fail-env-changed -uall,-cpu -x test_multiprocessing_fork -x test_multiprocessing_forkserver -x test_multiprocessing_spawn
    - ./venv/bin/python -m coverage run --pylib -m test --fail-env-changed -uall,-cpu -x test_multiprocessing_fork -x test_multiprocessing_forkserver -x test_multiprocessing_spawn -x test_concurrent_futures
When coverage is used with multiprocessing, spawning a new interpreter launches a new test session, resulting in a mess. I think it was previously working with the fork context, but when I launch the tests with the three backends, they also launch new test sessions.
Thus, I disabled the coverage run for test_concurrent_futures. I am not sure what I should do if this is not acceptable. The duplicated test sessions could result from some command-line argument parsing in either the semaphore tracker or the forkserver, but I do not think it is linked to this PR.
Let me know if this makes sense.
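The duplicated-session problem described above comes from how the non-fork start methods bootstrap children: the child process re-imports the `__main__` module, so any unguarded top-level code (such as a test runner started by coverage) executes again. This standalone sketch (not code from the PR) writes a throwaway script and counts how many times its top level runs under "spawn":

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A script whose top level prints a marker; under "spawn" the child
# re-imports __main__, so the marker is printed a second time.
script = textwrap.dedent("""
    import multiprocessing as mp
    print("top-level code executed")
    if __name__ == "__main__":
        mp.set_start_method("spawn", force=True)
        p = mp.Process(target=print, args=("child task ran",))
        p.start()
        p.join()
""")

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(script)
    path = f.name
try:
    out = subprocess.run([sys.executable, path],
                         capture_output=True, text=True).stdout
finally:
    os.remove(path)

count = out.count("top-level code executed")
print(count)  # 2: once in the parent, once when the child re-imports __main__
```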
Thank you! This looks fine in principle (apart from a couple of small things mentioned in the review). You'll need to update the docs in
- Rename context to mp_context in ProcessPoolExecutor constructor
- Fix the context used in test_interpreter_shutdown
- Ensure that the job arguments passed are freed as soon as possible
Doc/library/concurrent.futures.rst
Outdated
    given, it will default to the number of processors on the machine.
    If *max_workers* is lower or equal to ``0``, then a :exc:`ValueError`
    will be raised.
    *mp_context* can be a multiprocessing context or any object providing a
I'd rather restrict it to a multiprocessing context. Perhaps we'll use other multiprocessing APIs in the future.
    class EventfulGCObj():
        def __init__(self, ctx):
            mgr = get_context(ctx).Manager()
What is the rationale for using a manager here? Since the object is instantiated in the parent, it should be inheritable by the child anyway.
The executor is launched in the setUp method of the TestCase. Thus, there is no possibility to pass the Event object through inheritance, and the job (id, obj) is passed via pickle, which requires the Manager.
Oh, right. I've never used managers and I was surprised to see this...
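The pickling point can be seen in a small standalone sketch (the `_set_event` helper is illustrative, not part of the patch): a Manager-backed Event is a picklable proxy, so it can be shipped to a pool worker inside a submitted job, unlike a raw `multiprocessing.Event` created after the pool has started.

```python
import concurrent.futures
from multiprocessing import get_context

def _set_event(event):
    # Runs in the worker process: the Event proxy arrived there via pickle.
    event.set()
    return True

# "fork" is assumed here so the example runs guard-free on POSIX; "spawn"
# and "forkserver" behave the same inside an `if __name__ == "__main__"` guard.
ctx = get_context("fork")
mgr = ctx.Manager()
ev = mgr.Event()
with concurrent.futures.ProcessPoolExecutor(max_workers=1, mp_context=ctx) as ex:
    ok = ex.submit(_set_event, ev).result()
was_set = ev.is_set()
mgr.shutdown()
print(ok and was_set)  # True
```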
Lib/test/test_concurrent_futures.py
Outdated
    future = self.executor.submit(id, obj)
    future.result()

    assert obj.event.wait(timeout=1)
By convention, we'd use `self.assertTrue` (also so that assertions are still checked when running with `-O`).
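The `-O` point is easy to demonstrate from the command line (`python3` is assumed as the interpreter name): plain `assert` statements are compiled away entirely under `-O`, so a failing assertion passes silently, while `unittest` assertion methods keep working.

```shell
python3 -c "assert False"       # raises AssertionError, non-zero exit status
python3 -O -c "assert False"    # assert is optimized away, exit status 0
```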
Thank you @tomMoral ! |
The ProcessPoolExecutor processes start method can currently only be changed by changing the global default context with set_start_method at the beginning of a script. We propose to allow passing a context argument in the constructor, for more flexible control of the executor. Doing so, we also add some tests for all the available contexts, to make sure the executor works correctly.

In addition, we made the following changes, which can be put in another PR if necessary:

- Rename _shutdown to _global_shutdown to make its function in the code clearer.
- Free the resources as soon as possible in _worker_process. Indeed, with the current behavior, the resources are not freed before the worker receives a new task or shuts down.

This work was done as part of the loky project in collaboration with @ogrisel. See #1013 for the details.
https://bugs.python.org/issue31540
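With this change merged (Python 3.7+), the new constructor argument lets each executor pick its own start method; a minimal usage sketch (the `square` helper is just for illustration):

```python
import concurrent.futures
import multiprocessing

def square(x):
    return x * x

# Choose a start method for this executor only, without touching the
# global default ("fork" is assumed here so the sketch runs guard-free
# on POSIX; "spawn" and "forkserver" work identically inside an
# `if __name__ == "__main__"` guard).
ctx = multiprocessing.get_context("fork")
with concurrent.futures.ProcessPoolExecutor(max_workers=2, mp_context=ctx) as ex:
    results = list(ex.map(square, range(5)))
print(results)  # [0, 1, 4, 9, 16]
```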