Skip to content

Conversation

@tearfur
Copy link
Member

@tearfur tearfur commented Dec 17, 2025

Fixes #5853

The gist of this PR:

  1. Take the session lock in tr_torrent::stat(). This fixes the race condition.
  2. Slow down the macOS UI update loop timer from 1 second to 3 seconds to match the Qt app.
  3. In the macOS UI update loop, take the session lock before calling tr_torrentStat() for all torrents to avoid waiting for the lock multiple times.

The Problem

Last time we tried implementing [1] (#4571), it caused jank/beachballing in the macOS app and the fix was reverted (#4936).

OTOH, the Qt app takes the session lock for every API call it makes towards the core, but there were never any complaint about UI stuttering. So I went and took a look at how they are doing things differently.

The starkest difference I noticed is, in contrast to the macOS app, which takes and releases the lock in every tr_torrentStat() call, the Qt app's session update timer loop takes the session lock once and does not release it until it has done all tr_torrentStat() calls it needed to make. My hunch is this saves a lot of time waiting on the lock. This is possible for the Qt app because the it uses the RPC API instead of the C API for most API calls towards the core.

Another minor difference is that the RPC API supports the notion of "recently active" torrents, while the C API does not.

The Solution

For now, what I did is expose a function in the C API that returns the RAII object that owns the session lock, then use it in the macOS app's session update loop.

I realise this is ugly and exposes internal implementation details, but I don't feel comfortable taking it a step further because I'm basically editing the macOS code on Notepad with no compiler.

Next steps

As an intermediate step, I think we can expose a new tr_torrentStat() variant that takes the lock once and returns the stat object pointers for multiple torrents and refactor the macOS app around it. I can do that libtransmission part, but someone else will have to do the work in the macOS app.

The end goal is of course to convert the macOS (and the GTK app) into a standalone RPC client.

@tearfur tearfur marked this pull request as ready for review December 17, 2025 08:57
@tearfur tearfur added scope:mac scope:core type:fix A bug fix needs testers Call for someone to see if they can reproduce an issue needs notes PR title needs a one-sentence paragraph starting with 'Notes:' to summarize for the release notes labels Dec 17, 2025
@ckerr
Copy link
Member

ckerr commented Dec 18, 2025

Just to make sure I'm understanding your analysis right -- transmission-qt never calls tr_torrentStat() explicitly. When you say it just gets the lock once and then performs everything in a batch before releasing, are you referring to the lock call in tr_rpc_request_exec()?

auto const lock = tr_sessionLock(self.sessionHandle);
for (Torrent* torrent in self.fTorrents)
{
[torrent update];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't inspected the code in depth, so this is more of a question than a change request -- how much risk is there in calling [torrent update] inside of the lock? it looks like update does enqueue a notification. Is that dispatched immediately, or later when the event loop is idle? (ie after we've released the lock)

Copy link
Member Author

@tearfur tearfur Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid we'll need one of the macOS contributors to help us confirm this. I'm not joking when I say I've got obj-c++ literacy of a kneecap. 🥶

Tried my best to read the docs for NSNotificationQueue. Sounds like there is an event loop similar to the core's libevent loop, and the notification will be dispatched after this updateUI invocation?

Whereas a notification center distributes notifications when posted, notifications placed into the queue can be delayed until the end of the current pass through the run loop or until the run loop is idle.

CC @Coeur @nevack

Copy link
Collaborator

@Coeur Coeur Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The notification enqueuing here is very safe: it's NSNotificationCoalescingOnName which means that if you send 1 thousand of them at the same time, it will only produce a small number of notifications. I don't know exactly how many, but I suspect around 30/sec or something adjusted for the refresh rate of the display.

In other words, you quickly enqueue 1000 times the same notification within 0.1 sec, it will only be posted 1-3 times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's helpful, thanks!

Do those notifications that get posted execute immediately (ie while we still have the mutex lock) or do they get invoked later when the event loop is idle?

To avoid possible deadlocks, we should minimize the amount of code that's invoked inside the lock. That's my concern and why I'm asking about the notifications, since that's the only place I see that might be risky if the notifications are executed inside the lock.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's posted async, potentially during the lifetime of the lock.

@tearfur
Copy link
Member Author

tearfur commented Dec 18, 2025

Just to make sure I'm understanding your analysis right -- transmission-qt never calls tr_torrentStat() explicitly. When you say it just gets the lock once and then performs everything in a batch before releasing, are you referring to the lock call in tr_rpc_request_exec()?

Exactly.

@ckerr
Copy link
Member

ckerr commented Dec 18, 2025

The code looks reasonable to me. I'd like one of the @transmission/contributors to test this & see how it affects the jank

static CGFloat const kBottomBarHeight = 24.0;

static NSTimeInterval const kUpdateUISeconds = 1.0;
static NSTimeInterval const kUpdateUISeconds = 3.0;
Copy link
Collaborator

@Coeur Coeur Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep it every 1 second.
The sensitivity to this visual rhythm is important for many people. Slower would look frozen/broken.

Think of the progress bars for anything: if it stops moving for more than a second, you get anxious, and lose time figuring out if some problem happened or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but does 1 second cause UI stuttering? And more importantly, has the crashes stopped?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped the commit for 1s => 3s, see how it goes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether we're at 1s or 3s, the big question is @Coeur when you test this, does this stop the crashes w/o adding jank?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will try more thoroughly during this weekend.
It takes a few hours to reproduce a crash on main branch, so in order to assert that this fix doesn't crash, I need many hours.
I'm now on a very powerful computer with only a dozen torrents, so I won't be able to assert jankiness. I don't know if @nevack @DevilDimon @sweetppro or @livings124 have some older hardware or a tons of torrents to test with harsher conditions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also while we're at it, pinging @ResearchLabAssistant to let them test if it fixes their crash.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs notes PR title needs a one-sentence paragraph starting with 'Notes:' to summarize for the release notes needs testers Call for someone to see if they can reproduce an issue scope:core scope:mac type:fix A bug fix

Development

Successfully merging this pull request may close these issues.

race condition in tr_torrentStat()

3 participants