bpo-36888: Add multiprocessing.parent_process() #13247
Conversation
Force-pushed from 249fa8d to 8839770
tomMoral left a comment:
Some comments
Lib/multiprocessing/spawn.py (outdated)

      if parent_pid is not None:
          source_process = _winapi.OpenProcess(
-             _winapi.PROCESS_DUP_HANDLE, False, parent_pid)
+             _winapi.PROCESS_ALL_ACCESS, False, parent_pid)
Nice!
Out of curiosity, is it possible to narrow the permissions? (for example PROCESS_ALL_ACCESS + SYNCHRONIZE?)
You mean PROCESS_DUP_HANDLE and SYNCHRONIZE? I tried a few days ago and I got an access denied.
Ok. Perhaps @zooba has more insight about this.
Update: after a good night's sleep, PROCESS_DUP_HANDLE | SYNCHRONIZE does work.
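The narrowed call can be sketched as follows. This is a hypothetical helper, not the PR's code: `_winapi` is a CPython-internal, Windows-only module, and the `parent_pid` handling here is an assumption for illustration.

```python
import sys

def open_parent_process(parent_pid):
    # Hypothetical helper: open the parent process with only the access
    # rights this code actually needs (duplicating handles and waiting
    # for process exit) rather than the broad PROCESS_ALL_ACCESS.
    if sys.platform != "win32":
        return None  # _winapi is only available on Windows
    import _winapi
    return _winapi.OpenProcess(
        _winapi.SYNCHRONIZE | _winapi.PROCESS_DUP_HANDLE,
        False, parent_pid)
```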
@pitrou want to take a look? ;)
Same as in #13276, this code will not always be functional if several Python processes manipulate the same
Why is it not functional? The parent process handle should uniquely be open in the parent process, no?
@tomMoral Pardon me, I realized my last message was not very clear.
Thus technically, a
Is this still WIP?

Not anymore.
pitrou left a comment:
I have reservations about the implementation. Also, a documentation addition will be required.
Lib/multiprocessing/popen_fork.py (outdated)

      try:
          os.close(parent_r)
-         code = process_obj._bootstrap()
+         code = process_obj._bootstrap(parent_sentinel=child_w)
I don't understand. child_w will always be "ready" (writable) except if the pipe buffer is full, so how can you use it as a sentinel?
The idea is to use _ParentProcess.is_alive, which itself calls multiprocessing.connection.wait on the sentinel. Does that answer your question?
Not really, because it doesn't answer how it works concretely on the write end of the pipe.
>>> import os, select
>>> r, w = os.pipe()
>>> select.select([r], [r], [r])
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> select.select([w], [w], [w])
([], [4], [])
Though, at least on Linux:
>>> select.select([w], [], [w])
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> os.close(r)
>>> select.select([w], [], [w])
([4], [], [])
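The interactive sessions above can be checked non-interactively as well (Linux semantics assumed, as in the session above): the write end of a pipe only shows up in select()'s "readable" set once the read end has been closed.

```python
import os
import select

r, w = os.pipe()

# While the read end is open, the write end is writable but never
# appears in the "readable" set: a zero-timeout select() on the
# rlist reports nothing.
readable, _, _ = select.select([w], [], [w], 0)
assert readable == []

# Once the read end is closed, the write end enters an error state
# (writing would raise BrokenPipeError) and select() now reports it
# as "readable" -- the condition the sentinel mechanism relies on.
os.close(r)
readable, _, _ = select.select([w], [], [w], 0)
assert readable == [w]
os.close(w)
```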
I am not a frequent user of select, but it does seem odd to put the write end of a pipe into the rlist of select (which is the behavior we rely on when using multiprocessing.connection.wait). But actually, I did not write this line, @tomMoral did. Maybe he has other insights on this.
@pitrou would you be more comfortable if we used the read end of a new dedicated pipe as a sentinel instead, to mimic the sentinel implementation of the parent process for fork?
Yes, that would seem like a more strictly POSIX-compliant solution.
Yes indeed, it seems weird that way. I did not realize this was different compared to spawn_posix, where you have access to child_r. (But there seems to be a leak of parent_w with this implementation, cf. the other comment.)
That being said, the goal of the sentinel is simply to know whether this pipe is still open or not. That is why it feels wasteful to create a new pipe. But there might not be a better answer.
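The dedicated-pipe idea discussed here can be sketched like this (a POSIX-only sketch using a raw os.fork() for brevity; the real implementation lives in the Popen classes): the parent keeps the write end open for its lifetime, and the child blocks on the read end, which only becomes ready (EOF) once every write end is closed.

```python
import os
import select

# Parent keeps the write end open for as long as it lives; the child
# keeps the read end as its "parent sentinel".
parent_r, parent_w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: drop our copy of the write end, then block until the
    # parent's write end is closed (EOF makes parent_r readable).
    os.close(parent_w)
    select.select([parent_r], [], [])
    os._exit(0)
else:
    # Parent: closing parent_w here simulates the parent going away.
    os.close(parent_r)
    os.close(parent_w)
    _, status = os.waitpid(pid, 0)
```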
Lib/multiprocessing/spawn.py (outdated)

-     def _main(fd):
-         with os.fdopen(fd, 'rb', closefd=True) as from_parent:
+     def _main(fd, parent_sentinel):
+         with os.fdopen(fd, 'rb', closefd=False) as from_parent:
Not very nice if fd and parent_sentinel are different fds... I think the caller should instead ensure those are different fds, possibly using os.dup.
Woops, agreed.
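A minimal illustration of the os.dup() approach suggested above (hypothetical fds; the point is that closing the fdopen()'d descriptor leaves the duplicated sentinel valid):

```python
import os

r, w = os.pipe()

# Duplicate the descriptor so the data channel and the sentinel are
# distinct fds; fdopen() can then keep closefd=True without
# invalidating the sentinel.
parent_sentinel = os.dup(r)

with os.fdopen(r, 'rb', closefd=True) as from_parent:
    pass  # r is closed when the context manager exits

# The duplicated sentinel is still a valid, open descriptor.
sentinel_valid = True
try:
    os.fstat(parent_sentinel)
except OSError:
    sentinel_valid = False

os.close(parent_sentinel)
os.close(w)
```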
      self.sentinel, w = forkserver.connect_to_new_process(self._fds)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
-     with open(w, 'wb', closefd=True) as f:
+     with open(w, 'wb', closefd=False) as f:
Doesn't seem right. See comment about using os.dup instead where necessary.
Lib/test/_test_multiprocessing.py (outdated)

      parent_pid, parent_name = rconn.recv()
-     self.assertEqual(parent_pid, current_process().pid)
+     self.assertEqual(parent_pid, os.getpid())
      self.assertEqual(parent_name, current_process().name)
Please also test parent_process() from the parent process.
OK (Should return None)
Lib/test/_test_multiprocessing.py (outdated)

      def test_parent_process_attributes(self):
          if self.TYPE == "threads":
              self.skipTest('test not appropriate for {}'.format(self.TYPE))
          from multiprocessing.process import current_process
Shouldn't this be at the top-level?
I often see import statements inside tests, and I have to admit I do not know what rule is used to determine if an import should be top-level or not.
Generally, it's better to put imports at the top level, except if the module might not be available, or if importing it may have undesirable side-effects.
By the way, multiprocessing.current_process() works, and perhaps multiprocessing.parent_process() should work, too.
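Once exposed at the top level, the API under discussion behaves roughly like this (a sketch, not the PR's test code; the fork start method is assumed here to keep the example self-contained, and parent_process() returns None in the main process):

```python
import multiprocessing as mp
import os

def report_parent(queue):
    # In a child, parent_process() returns a Process-like object
    # describing the process that started us.
    queue.put(mp.parent_process().pid)

def demo():
    # In the main process there is no parent process object.
    assert mp.parent_process() is None
    ctx = mp.get_context("fork")  # POSIX-only assumption
    queue = ctx.Queue()
    child = ctx.Process(target=report_parent, args=(queue,))
    child.start()
    parent_pid = queue.get()
    child.join()
    return parent_pid
```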
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again".
I have made the requested changes; please review again.
pitrou left a comment:
Thanks for the update. There seems to be a couple of issues still.
      self.sentinel, w = forkserver.connect_to_new_process(self._fds)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
+     parent_sentinel = os.dup(w)
This variable isn't used anywhere. Is it a bug?
It leaves the write end opened, as w is closed at the end of the following context manager (upon your request). But now that I have fully understood #13247 (comment), I guess it is better to have it as a private attribute of the process object and register a Finalizer.
      env = None
-     with open(wfd, 'wb', closefd=True) as to_child:
+     with open(wfd, 'wb', closefd=False) as to_child:
Why is that? Is it a leftover from previous attempts?
Good catch, sorry about that.
Lib/multiprocessing/process.py (outdated)

      self._sentinel = sentinel
      self._config = {}

+     def close(self):
It doesn't look necessary to override this method.
Lib/test/_test_multiprocessing.py (outdated)

      wconn.send("alive" if parent_process().is_alive() else "not alive")

      start_time = time.monotonic()
      while (parent_process().is_alive() and
How about calling join(timeout=5)? With the sentinel, it seems like it should work.
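The suggestion amounts to replacing the polling loop with a blocking wait on the parent's sentinel, which join() performs. A hedged sketch (the helper name is illustrative, not from the PR):

```python
import multiprocessing as mp

def wait_for_parent(timeout=5):
    # In a child process this blocks on the parent's sentinel for up to
    # `timeout` seconds instead of polling is_alive() in a loop.
    parent = mp.parent_process()
    if parent is None:
        return None  # main process: there is no parent to wait for
    parent.join(timeout=timeout)
    return parent.is_alive()
```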
      # parent process used by the child process.
      self._parent_w = os.dup(w)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
      self.finalizer = util.Finalize(self, os.close, (self._parent_w,))
_parent_w is not the best name, but I did not find a convincing, short alternative. I thought _child_sentinel was somewhat unclear. Any ideas, anyone?
I think _parent_w is ok. Also you don't need to make it an attribute (see other comment).
However, you're also clobbering the previous finalizer here.
@pitrou thanks for the review.
pitrou left a comment:
Just a couple more comments.
Lib/multiprocessing/popen_fork.py (outdated)

      self.finalizer = util.Finalize(self, os.close, (parent_r,))
      self.finalizer = util.Finalize(self, os.close, (parent_w,))
      self.sentinel = parent_r
      self._parent_w = parent_w
This is not required (but it doesn't hurt either).
Lib/test/_test_multiprocessing.py (outdated)

      from multiprocessing.process import parent_process
      wconn.send("alive" if parent_process().is_alive() else "not alive")

      start_time = time.monotonic()
This isn't used anymore.
Lib/multiprocessing/popen_fork.py (outdated)

      os.close(child_w)
      os.close(child_r)
      self.finalizer = util.Finalize(self, os.close, (parent_r,))
      self.finalizer = util.Finalize(self, os.close, (parent_w,))
Hmm, this is clobbering the previous finalizer. You should use a single callback that closes both fds.
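A single-callback fix might look like this. util.Finalize is the real multiprocessing helper; the _Popen stand-in and the close_fds function here are illustrative, not the PR's code:

```python
import os
from multiprocessing.util import Finalize

def close_fds(*fds):
    # One callback closing every fd, so a single Finalize can own both
    # pipe ends and nothing clobbers a previously registered finalizer.
    for fd in fds:
        os.close(fd)

class _Popen:  # minimal stand-in for the real Popen object
    pass

popen = _Popen()
parent_r, parent_w = os.pipe()
popen.finalizer = Finalize(popen, close_fds, (parent_r, parent_w))

# Running the finalizer (normally done at GC or interpreter exit)
# closes both descriptors in one go.
popen.finalizer()
```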
I have made the requested changes; please review again.

Thank you :-)
In the std lib, the semaphore_tracker and the Manager rely on daemonized processes that are launched with server-like loops. The cleanup of such processes is made complicated by the fact that there is no canonical way to check whether the parent process is alive.

I propose to add in context a parent_process function that would give access to a Process object representing the parent process. This way, we could benefit from its sentinel to improve the cleanup of these processes, which can be left dangling in case of a hard stop of the main interpreter.

https://bugs.python.org/issue36888