bpo-36888: Add multiprocessing.parent_process() #13247
Conversation
Force-pushed from 249fa8d to 8839770
tomMoral left a comment:
Some comments
Lib/multiprocessing/spawn.py (outdated)

      if parent_pid is not None:
          source_process = _winapi.OpenProcess(
-             _winapi.PROCESS_DUP_HANDLE, False, parent_pid)
+             _winapi.PROCESS_ALL_ACCESS, False, parent_pid)
Nice!
Out of curiosity, is it possible to narrow the permissions? (for example PROCESS_ALL_ACCESS + SYNCHRONIZE?)
You mean PROCESS_DUP_HANDLE and SYNCHRONIZE? I tried a few days ago and I got an access denied.
Ok. Perhaps @zooba has more insight about this.
Update: after a good night's sleep, PROCESS_DUP_HANDLE | SYNCHRONIZE does work.
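The narrowed call can be sketched as follows. This is a hypothetical helper, not the PR's code: `_winapi` is a CPython-internal, Windows-only module, and the `parent_pid` handling here is an assumption for illustration.

```python
import sys

def open_parent_process(parent_pid):
    # Hypothetical helper: open the parent process with only the access
    # rights this code actually needs (duplicating handles and waiting
    # for process exit) rather than the broad PROCESS_ALL_ACCESS.
    if sys.platform != "win32":
        return None  # _winapi is only available on Windows
    import _winapi
    return _winapi.OpenProcess(
        _winapi.SYNCHRONIZE | _winapi.PROCESS_DUP_HANDLE,
        False, parent_pid)
```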
@pitrou want to take a look? ;)
Same as in #13276, this code will not always be functional if several Python processes manipulate the same
Why is it not functional? The parent process handle should uniquely be open in the parent process, no?
@tomMoral Pardon me, I realized my last message was not very clear.
Thus technically, a
Is this still WIP?

Not anymore.
pitrou left a comment:
I have reservations about the implementation. Also, a documentation addition will be required.
Lib/multiprocessing/popen_fork.py (outdated)

      try:
          os.close(parent_r)
-         code = process_obj._bootstrap()
+         code = process_obj._bootstrap(parent_sentinel=child_w)
I don't understand. child_w will always be "ready" (writable) except if the pipe buffer is full, so how can you use it as a sentinel?
The idea is to use _ParentProcess.is_alive, which itself calls multiprocessing.connection.wait on the sentinel. Does that answer your question?
Not really, because it doesn't answer how it works concretely on the write end of the pipe.
>>> import os, select
>>> r, w = os.pipe()
>>> select.select([r], [r], [r])
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> select.select([w], [w], [w])
([], [4], [])
Though, at least on Linux:
>>> select.select([w], [], [w])
^CTraceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyboardInterrupt
>>> os.close(r)
>>> select.select([w], [], [w])
([4], [], [])
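The interactive sessions above can be checked non-interactively as well (Linux semantics assumed, as in the session above): the write end of a pipe only shows up in select()'s "readable" set once the read end has been closed.

```python
import os
import select

r, w = os.pipe()

# While the read end is open, the write end is writable but never
# appears in the "readable" set: a zero-timeout select() on the
# rlist reports nothing.
readable, _, _ = select.select([w], [], [w], 0)
assert readable == []

# Once the read end is closed, the write end enters an error state
# (writing would raise BrokenPipeError) and select() now reports it
# as "readable" -- the condition the sentinel mechanism relies on.
os.close(r)
readable, _, _ = select.select([w], [], [w], 0)
assert readable == [w]
os.close(w)
```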
I am not a frequent user of select, but it does seem odd to put the write end of a pipe into the rlist of select (which is the behavior we rely on when using multiprocessing.connection.wait). But actually, I did not write this line, @tomMoral did. Maybe he has other insights on this.
@pitrou would you be more comfortable if we used the read end of a new dedicated pipe as a sentinel instead, to mimic the sentinel implementation of the parent process for fork?
Yes, that would seem like a more strictly POSIX-compliant solution.
Yes indeed, it seems weird that way. I did not realize this was different compared to spawn_posix, where you have access to child_r. (But there seems to be a leak of parent_w with this implementation, cf. the other comment.)
That being said, the goal of the sentinel is simply to know whether this pipe is still open or not. That is why it feels wasteful to create a new pipe. But there might not be a better answer.
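The dedicated-pipe idea discussed here can be sketched like this (a POSIX-only sketch using a raw os.fork() for brevity; the real implementation lives in the Popen classes): the parent keeps the write end open for its lifetime, and the child blocks on the read end, which only becomes ready (EOF) once every write end is closed.

```python
import os
import select

# Parent keeps the write end open for as long as it lives; the child
# keeps the read end as its "parent sentinel".
parent_r, parent_w = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: drop our copy of the write end, then block until the
    # parent's write end is closed (EOF makes parent_r readable).
    os.close(parent_w)
    select.select([parent_r], [], [])
    os._exit(0)
else:
    # Parent: closing parent_w here simulates the parent going away.
    os.close(parent_r)
    os.close(parent_w)
    _, status = os.waitpid(pid, 0)
```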
Lib/multiprocessing/spawn.py (outdated)

-     def _main(fd):
-         with os.fdopen(fd, 'rb', closefd=True) as from_parent:
+     def _main(fd, parent_sentinel):
+         with os.fdopen(fd, 'rb', closefd=False) as from_parent:
Not very nice if fd and parent_sentinel are different fds... I think the caller should instead ensure those are different fds, possibly using os.dup.
Woops, agreed.
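A minimal illustration of the os.dup() approach suggested above (hypothetical fds; the point is that closing the fdopen()'d descriptor leaves the duplicated sentinel valid):

```python
import os

r, w = os.pipe()

# Duplicate the descriptor so the data channel and the sentinel are
# distinct fds; fdopen() can then keep closefd=True without
# invalidating the sentinel.
parent_sentinel = os.dup(r)

with os.fdopen(r, 'rb', closefd=True) as from_parent:
    pass  # r is closed when the context manager exits

# The duplicated sentinel is still a valid, open descriptor.
sentinel_valid = True
try:
    os.fstat(parent_sentinel)
except OSError:
    sentinel_valid = False

os.close(parent_sentinel)
os.close(w)
```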
      self.sentinel, w = forkserver.connect_to_new_process(self._fds)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
-     with open(w, 'wb', closefd=True) as f:
+     with open(w, 'wb', closefd=False) as f:
Doesn't seem right. See comment about using os.dup instead where necessary.
Lib/test/_test_multiprocessing.py (outdated)

      parent_pid, parent_name = rconn.recv()
-     self.assertEqual(parent_pid, current_process().pid)
+     self.assertEqual(parent_pid, os.getpid())
      self.assertEqual(parent_name, current_process().name)
Please also test parent_process() from the parent process.
OK (Should return None)
Lib/test/_test_multiprocessing.py (outdated)

      def test_parent_process_attributes(self):
          if self.TYPE == "threads":
              self.skipTest('test not appropriate for {}'.format(self.TYPE))
          from multiprocessing.process import current_process
Shouldn't this be at the top-level?
I often see import statements inside tests, and I have to admit I do not know what rule is used to determine if an import should be top-level or not.
Generally, it's better to put imports at the top level, except if the module might not be available, or if importing it may have undesirable side-effects.
By the way, multiprocessing.current_process() works, and perhaps multiprocessing.parent_process() should work, too.
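Once exposed at the top level, the API under discussion behaves roughly like this (a sketch, not the PR's test code; the fork start method is assumed here to keep the example self-contained, and parent_process() returns None in the main process):

```python
import multiprocessing as mp
import os

def report_parent(queue):
    # In a child, parent_process() returns a Process-like object
    # describing the process that started us.
    queue.put(mp.parent_process().pid)

def demo():
    # In the main process there is no parent process object.
    assert mp.parent_process() is None
    ctx = mp.get_context("fork")  # POSIX-only assumption
    queue = ctx.Queue()
    child = ctx.Process(target=report_parent, args=(queue,))
    child.start()
    parent_pid = queue.get()
    child.join()
    return parent_pid
```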
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers, that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again".
I have made the requested changes; please review again.
pitrou left a comment:
Thanks for the update. There seems to be a couple of issues still.
      self.sentinel, w = forkserver.connect_to_new_process(self._fds)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
+     parent_sentinel = os.dup(w)
This variable isn't used anywhere. Is it a bug?
It leaves the write end opened, as w is closed at the end of the following context manager (upon your request). But now that I have fully understood #13247 (comment), I guess it is better to have it as a private attribute of the process object and register a Finalizer.
      env = None
-     with open(wfd, 'wb', closefd=True) as to_child:
+     with open(wfd, 'wb', closefd=False) as to_child:
Why is that? Is it a leftover from previous attempts?
Good catch, sorry about that.
Lib/multiprocessing/process.py (outdated)

      self._sentinel = sentinel
      self._config = {}

+     def close(self):
It doesn't look necessary to override this method.
Lib/test/_test_multiprocessing.py (outdated)

      wconn.send("alive" if parent_process().is_alive() else "not alive")

      start_time = time.monotonic()
      while (parent_process().is_alive() and
How about calling join(timeout=5)? With the sentinel, it seems like it should work.
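The suggestion amounts to replacing the polling loop with a blocking wait on the parent's sentinel, which join() performs. A hedged sketch (the helper name is illustrative, not from the PR):

```python
import multiprocessing as mp

def wait_for_parent(timeout=5):
    # In a child process this blocks on the parent's sentinel for up to
    # `timeout` seconds instead of polling is_alive() in a loop.
    parent = mp.parent_process()
    if parent is None:
        return None  # main process: there is no parent to wait for
    parent.join(timeout=timeout)
    return parent.is_alive()
```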
      # parent process used by the child process.
      self._parent_w = os.dup(w)
      self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
      self.finalizer = util.Finalize(self, os.close, (self._parent_w,))
_parent_w is not the best name, but I did not find a convincing, short alternative. I thought _child_sentinel was somewhat unclear. Any ideas, anyone?
I think _parent_w is ok. Also you don't need to make it an attribute (see other comment).
However, you're also clobbering the previous finalizer here.
@pitrou thanks for the review.
pitrou left a comment:
Just a couple more comments.
Lib/multiprocessing/popen_fork.py (outdated)

      self.finalizer = util.Finalize(self, os.close, (parent_r,))
      self.finalizer = util.Finalize(self, os.close, (parent_w,))
      self.sentinel = parent_r
      self._parent_w = parent_w
This is not required (but it doesn't hurt either).
Lib/test/_test_multiprocessing.py (outdated)

      from multiprocessing.process import parent_process
      wconn.send("alive" if parent_process().is_alive() else "not alive")

      start_time = time.monotonic()
This isn't used anymore.
Lib/multiprocessing/popen_fork.py (outdated)

      os.close(child_w)
      os.close(child_r)
      self.finalizer = util.Finalize(self, os.close, (parent_r,))
      self.finalizer = util.Finalize(self, os.close, (parent_w,))
Hmm, this is clobbering the previous finalizer. You should use a single callback that closes both fds.
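A single-callback fix might look like this. util.Finalize is the real multiprocessing helper; the _Popen stand-in and the close_fds function here are illustrative, not the PR's code:

```python
import os
from multiprocessing.util import Finalize

def close_fds(*fds):
    # One callback closing every fd, so a single Finalize can own both
    # pipe ends and nothing clobbers a previously registered finalizer.
    for fd in fds:
        os.close(fd)

class _Popen:  # minimal stand-in for the real Popen object
    pass

popen = _Popen()
parent_r, parent_w = os.pipe()
popen.finalizer = Finalize(popen, close_fds, (parent_r, parent_w))

# Running the finalizer (normally done at GC or interpreter exit)
# closes both descriptors in one go.
popen.finalizer()
```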
I have made the requested changes; please review again.

Thank you :-)
In the std lib, the semaphore_tracker and the Manager rely on daemonized processes that are launched with server-like loops. The cleanup of such processes is made complicated by the fact that there is no canonical way to check whether the parent process is alive.

I propose to add in context a parent_process function that would give access to a Process object representing the parent process. This way, we could benefit from its sentinel to improve the cleanup of these processes, which can be left dangling in case of a hard stop of the main interpreter.

https://bugs.python.org/issue36888