Fix several MacOS memory management bugs#31769
Conversation
Before calling [NSApp run], we need to drain the previous pool as the windows have been added to it during enumeration.
There exists a (likely rare) scenario in which new NSEvents can be added to the queue indefinitely, and thus the outermost pool will never be drained. Hence, create a new pool and drain it once per iteration.
Now that we are setting up and draining pools, our MatplotlibAppDelegate is short lived, as NSApplication.delegate is a weak reference.
The alloc/init of the NSFileHandle needs a -release The object returned by -addObserverForName:object:queue:usingBlock: is retained by the system and needs a corresponding -removeObserver
The call to -[NSApp windows] needs to be inside of an @autoreleasepool block, else there exists the possibility that the windows array will go into a lower pool that won't be drained.
Using setReleasedWhenClosed:NO has been the standard for programmatically-created windows since the introduction of ARC. Also add calls to setDelegate:nil as NSWindow.delegate was assign until macOS 10.13.
|
Thank you for opening your first PR into Matplotlib! If you have not heard from us in a week or so, please leave a new comment below and that should bring it to our attention. Most of our reviewers are volunteers and sometimes things fall through the cracks. We also ask that you please finish addressing any review comments on this PR and wait for it to be merged (or closed) before opening a new one, as it can be a valuable learning experience to go through the review process. You can also join us on discourse chat for real-time discussion. For details on testing, writing docs, and our review process, please see the developer guide. We strive to be a welcoming and open project. Please follow our Code of Conduct. |
|
I did some more thinking about the "Pools at Boundaries" case. If we used preprocessor macros at the start and end of the function, it would:
For example, the #define BEGIN_PYTHON_TO_OBJC_SECTION \
@autoreleasepool { @try {
#define END_PYTHON_TO_OBJC_SECTION \
} @catch (NSException *exception) { \
PyErr_SetString(PyExc_RuntimeError, [[exception reason] UTF8String]); \
} }
…
static PyObject*
show(PyObject* self)
{
BEGIN_PYTHON_TO_OBJC_SECTION
// -objectEnumerator will add all windows to the topmost pool,
// wrap in @autoreleasepool as -[NSApp run] is long-running.
@autoreleasepool {
[NSApp activateIgnoringOtherApps: YES];
NSArray *windowsArray = [NSApp windows];
NSEnumerator *enumerator = [windowsArray objectEnumerator];
NSWindow *window;
while ((window = [enumerator nextObject])) {
[window orderFront:nil];
}
}
Py_BEGIN_ALLOW_THREADS
[NSApp run];
Py_END_ALLOW_THREADS
END_PYTHON_TO_OBJC_SECTION
Py_RETURN_NONE;
} |
I don't know much about the internals or best naming, but the macros do have some value, as otherwise the |
Yeah, that's one of my big hesitations with my patch as-is, I really dislike the idea of clobbering so much of the blame history! |
Rather than using @autoreleasepool and indenting several lines of code (which will break `git blame`), use BEGIN_OBJC_ENTRY and END_OBJC_ENTRY.
|
The test failures above seem to be related to some issue with time zones. While I don't think it's related to my changes, please let me know if I need to rerun the tests somehow or address this on my end. |
|
This is great! Thank you for all the great explanations and walkthrough of the details you've updated, it really helps. I unfortunately don't have a mac as easily at hand these days, but I'll try and test this out later this weekend if I can. I read through the code and had AI do a quick sanity check as well and am leaving my human written comments inline.
I am interested in moving this project to ARC! I did attempt that a while ago, but we ended up having a segfault when using the subplot tool icon to spawn a new window and then close that, so we reverted that commit. See these references. If you have any ideas of what needs to be done for the ARC work, that would be a huge help. I was just doing that with attempting to read documentation and listening to the compilers, but it didn't seem like it would be a huge lift... I would make sure you test the subplot tool icon to make sure that works as expected still even with these changes. I would test by closing the windows in different orders and closing/reopening the subplots etc. just to make sure it is keeping the count as expected. |
|
@greglucas - Note that Objective-C exceptions are regarded to be fragile and are not to be used like exceptions in other programming languages. From Exceptions and Cocoa:
Apple Objective-C frameworks instead use a (verbose) pattern of: When Apple made Swift, they automatically translated this pattern to Swift Thus, it makes sense to have our |
I agree with this. I think that we need to |
Sorry for the noise, my text editor automatically keeps trying to re-indent these when I save.
greglucas
left a comment
There was a problem hiding this comment.
Thanks for addressing all of those comments. I think this looks good now! We can keep improving in follow-up PRs as I think this lays a nice groundwork.
|
@greglucas - Do I need to do anything else on my end to merge this (or is that something you do)? In the meantime, I'm attempting to install macOS Sierra 10.12 on a test machine and plan to dig into the Window<->FigureManager reference cycle. I'll likely address that in my ARC work. |
# Conflicts: # src/_macosx.m
| } | ||
| } | ||
|
|
||
| static int wait_for_stdin() { |
There was a problem hiding this comment.
There's already an autoreleasepool in most of this function; do we need a second one?
There was a problem hiding this comment.
Yes. The first pool covers entry, the second pool is due to being inside of an while loop. In general: pools are cheap to push/pop and better than leaking objects.
There was a problem hiding this comment.
There's no while loop in this function, though?
There was a problem hiding this comment.
My mistake, I was using my phone's web browser before and your comment looked like it was related to a different function.
In wait_for_stdin(), we do need multiple autorelease pools, but the code as written is incorrect.
static int wait_for_stdin() {
BEGIN_OBJC_ENTRY
…
if (![[NSApp windows] count]) {
flushEvents();
return 1;
}
At the end of this code snippet, the call to [NSApp windows] has made a copy of the windows array and then added it to the autorelease pool set up in BEGIN_OBJC_ENTRY. This pool isn't drained until the end of the function.
However, we then go on to do a call to [NSApp run], which may run for a very long time. Our windows are retained and in the outermost pool and may not be released for a while.
The if block needs to be in its own @autoreleasepool, and the rest of the function can simply use the pool set up by BEGIN_OBJC_ENTRY.
Good catch!
| NSEvent* event = [NSApp nextEventMatchingMask: NSEventMaskAny | ||
| untilDate: date | ||
| inMode: NSDefaultRunLoopMode | ||
| dequeue: YES]; |
There was a problem hiding this comment.
Note, these were (perhaps wrongly?) aligned on the colon.
There was a problem hiding this comment.
Let me double check this when I get back to a desktop. Colons should be aligned to each other, but the merge may have messed up alignment.
There was a problem hiding this comment.
Yep, that got misaligned with the merge. I'll fix it. Thanks!
| } | ||
| @try { | ||
|
|
||
| NSApplication* app = [NSApplication sharedApplication]; |
There was a problem hiding this comment.
We can probably dedent these bits at least.
QuLogic
left a comment
There was a problem hiding this comment.
I guess we should squash this, given the indent/dedent back-and-forth.
|
Thanks @iccir! We hope to see you again. I'm definitely interested in reviewing the other updates you mentioned if you're able to get around to them. |
Thanks for merging! I'll be working on the ARC changes next. Expect a PR once I get some more testing done! |
|
@meeseeksdev backport to v3.11.x |
…769-on-v3.11.x Backport PR #31769 on branch v3.11.x (Fix several MacOS memory management bugs)
PR summary
This pull request fixes several memory management issues in
_macosx.m. It allows for windows to be properly dealloc'd, which should fix #31106.I realize that I'm a stranger and first-time contributor, please feel free to push back and be honest! I use to do these types of object ownership audits when I worked at Apple, but it's been almost 15 years since I've looked at manual-reference-counted code :)
Autorelease Pools
One of the root causes of #31106 is that each NSWindow allocated in
FigureManager_new()was added to an NSAutoreleasePool that wasn't drained by the time-[NSApplication run]was called. macOS performed several-retain/-autoreleasepairs on the windows. By the time ourshow()function was called, the retain count of each windows was over 600 (See #31751).Pools at Boundaries
Although messy, a good guideline is to setup an autorelease pool each time you cross from the Python layer into the Objective-C layer. This prevents objects from going into a pool that you have no control over, and may never be drained.
I have added
@autoreleasepoolblocks to every static function exposed to Python. Some of these aren't needed since there are no Obj-C calls (event_loop_is_running(),Timer_dealloc(), etc). I can removed the unneeded ones; however, I felt like it was better to be consistent and guard against the possibility of a future contributor using Obj-C fromevent_loop_is_running()/etc and then re-introducing this issue.I followed the existing coding style and indented each
@autoreleasepoolblock. This resulted in a messy diff due to the number of lines changed. I could instead use preprocessor macros similar toPy_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADSwithout indentation. Perhaps something likeBEGIN_OBJC_CALLS?Pools in Loops
When a loop can iterate indefinitely, it's a good idea to create a pool for each iteration. I did so in
FigureCanvas__start_event_loop()andflushEvents(). There existed a possibility that the call to-sendEvent:could add additional events to the event queue and thus spin the loop indefinitely without pool drainage.Pools before -run
The structure of
show()andwait_for_stdin()is a bit different. Here, we need to use two pools. The first pool will catch the autoreleased value of-[NSApp windows]and will be drained prior to the call to[NSApp run]. Without two pools, the windows will have a +1 retain count and will be waiting to be drained while[NSApp run]is running.NSTrackingArea Leak
#31754, also fixed by PR #31661 . I can back this out if desired.
The
NSTrackingAreacreated inFigureCanvas_init()was leaking. The+alloc/-initneeds a balancing-release.Technically,
NSTrackingArea.ownerwas an unsafeassignreference until macOS 10.13 when it was turned into a saferweakreference. There now exists the possibility of a crash on macOS 10.12 if the owning NSView is deallocated before the NSTrackingArea. I'm not sure if this is worth addressing, however. I can fix it if desired.wake_on_fd_write leaks
#31755
There are two leaks in
wake_on_fd_write():NSFileHandleis never-releaseed.-addObserverForName:object:queue:usingBlock:is retained by the system and not released until-removeObserver:.I fixed both of these by adding cleanup code to the end of the block.
MatplotlibAppDelegate
With pools in place and draining, the
MatplotlibAppDelegateobject created inlazy_init()was short-lived. AsNSApplication.delegateis a weak property, there was no strong reference to it.Per ARC conventions, I made a new
staticvariable to point to the object. While this isn't necessary under manually reference counting, it's typically used to indicate that an+alloc/-initwithout a balanced-releaseis intentional. It also saves a step if a migration to ARC is ever desired.isReleasedWhenClosed
Setting
isReleasedWhenClosedtoNOhas been the standard for programmatically-created windows since the introduction of ARC in 2011. Parts of AppKit, such as NSWindowController, will automatically call[theWindow setReleasedWhenClosed:NO]on adopted windows.While the existing code should correctly release the window per historic API documentation, I'm not sure what kind of test coverage
isReleasedWhenClosed=YESwindows have inside of Apple, as the recommendation is to use NSWindowController.I changed the window to
isReleasedWhenClosed=NOand modified the calls to-closeas follows:The
setDelegate:nilcall is needed asNSWindow.delegatedidn't becomeweakuntil macOS 10.13. I believe that matplotlib officially still supports 10.12.Testing
Checking for Over-releases
Historically, fixing leaks in one section of code can expose over-releases/crashes in other sections. Prior to this PR, we are "leaking" several Objective-C objects due to them going into a pool that is never drained.
To mitigate this, I ran several built-in examples with permutations of the following environmental variables:
I also checked using Instruments with the Zombies template.
That said, the amount of "leaked" objects was large, so this PR may expose existing over-releases as a crash.
Checking for Leaks
I ran the following example and then attached with Instruments with the Allocations template.
I performed several iterations of clicking "Mark Generation" and then closing the window. I proceeded to let the loop run 20+ more times in order to let Python and malloc clear/recycle any memory regions. I found three leaks (the
NSTrackingAreainFigureCanvas_init()and the two inwake_on_fd_write()).AI Disclosure
assigntoweak. Apple doesn't make this information easily accessible.PR checklist
Note
I ran
pytestand got:8858 passed, 1555 skipped, 28 xfailed, 2 xpassedAs a first time contributor, I'm not sure how concerned I should be with "28 xfailed". I don't believe my change could affect those tests.