Skip to content

Conversation

@stsewd
Copy link
Member

@stsewd stsewd commented Aug 26, 2025

  • Take into consideration active users from the last 30 days only.
  • Ignore GitHub App, so we don't re-sync installations over and over again, as users can share the same installation, and repos from the GH app are kept in sync via a webhook.

This task now completes in 42 minutes.

Closes #12406

# Triggering a sync per-user, will re-sync the same installation
# multiple times.
sync_remote_repositories(
user.pk,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could probably have a wrapper around this function, so it accepts a user object, instead of re-fetching the object from the DB each time.

@stsewd stsewd requested a review from Copilot August 26, 2025 21:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR re-enables the daily sync task for active user repositories with optimizations to prevent performance issues. The task was previously disabled due to causing celery instances to freeze up from excessive re-syncing.

Key changes include:

  • Reducing the active user window from 90 days to 30 days to limit scope
  • Adding logic to skip GitHub App services during sync to avoid redundant processing
  • Using iterator() for more memory-efficient user processing

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
readthedocs/settings/base.py Re-enables the commented-out daily repository sync task
readthedocs/oauth/tasks.py Updates sync logic to skip GitHub App services and reduces active user window to 30 days

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@stsewd stsewd marked this pull request as ready for review August 26, 2025 21:28
@stsewd stsewd requested a review from a team as a code owner August 26, 2025 21:28
@stsewd stsewd requested a review from agjohnson August 26, 2025 21:28
@stsewd stsewd requested a review from ericholscher August 26, 2025 21:29
one_month_ago = timezone.now() - datetime.timedelta(days=30)
users = User.objects.annotate(weekday=ExtractIsoWeekDay("last_login")).filter(
last_login__gt=three_months_ago,
last_login__gt=one_month_ago,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like we shouldn't consider the user inactive until after longest possible user session duration, plus a buffer of some days. last_login is updated on log in, and so won't update again until the user's session is invalidated -- likely the session cookie aging out after 30 days. After that period, last_login doesn't really signal an inactive user until maybe 2 additional weeks have passed (44 days total) and they haven't logged in again.

So, maybe say 45 days is probably the minimum and 90 days does seem like a lot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With 90 days the task takes 1h 40min to complete (2.4K users), and with 45 days it takes 44min (1.3K users).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to 45 days.

@humitos
Copy link
Member

humitos commented Aug 27, 2025

This task now completes in 42 minutes

In another comment you said it took ~3m. I'm confused. What's the correct value here?

42 minutes is a lot! 😔

@stsewd
Copy link
Member Author

stsewd commented Aug 27, 2025

In another comment you said it took ~3m. I'm confused. What's the correct value here?

This is the time it takes to sync all active users. If you are referring to my message from slack, that was about the time of a syncing the repos of a single user (from 12min to 12 seconds).

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a good step forward. Do we think the GH app is what caused this task to massively increase in size?

In general, we should probably try and find a way to lock these tasks so they don't pile up. We have this logic other places, but should probably apply it to all daily tasks:

def memcache_lock(lock_id, lock_expire, app_identifier):

@stsewd
Copy link
Member Author

stsewd commented Aug 28, 2025

Do we think the GH app is what caused this task to massively increase in size?

I think it was more about the users from organizations with lots of repositories that recently logged in.

In general, we should probably try and find a way to lock these tasks so they don't pile up. We have this logic other places, but should probably apply it to all daily tasks:

I think we should just make sure the task isn't retried if it times out or gets killed. Since this task is run daily, there is no way the task will run over 24+ hours in order to trigger a new task for the lock to be useful.

@stsewd
Copy link
Member Author

stsewd commented Aug 28, 2025

I played around this locally and wasn't able to replicate a retry after the task times out. My other guess would be the task killing the server (OOM), but that I can't test locally...

@ericholscher
Copy link
Member

In general, we should probably try and find a way to lock these tasks so they don't pile up. We have this logic other places, but should probably apply it to all daily tasks:

I think we should just make sure the task isn't retried if it times out or gets killed. Since this task is run daily, there is no way the task will run over 24+ hours in order to trigger a new task for the lock to be useful.

Yea, but the symptom of all these issues is that the tasks try to run 2x, so if we can easily lock it and log a warning in Sentry, it will help us catch situations like this with an explicit guarantee.

@read-the-docs-community
Copy link

read-the-docs-community bot commented Sep 2, 2025

Documentation build overview

📚 docs | 🛠️ Build #29440276 | 📁 Comparing 1b79c5a against latest (c7c96de)


🔍 Preview build

Show files changed (1 files in total): 📝 1 modified | ➕ 0 added | ➖ 0 deleted
File Status
reference/git-integration.html 📝 modified

@stsewd stsewd requested a review from ericholscher September 2, 2025 18:52
@stsewd stsewd merged commit 9ffd0c7 into main Sep 4, 2025
4 of 5 checks passed
@stsewd stsewd deleted the re-enable-sync-user-repos branch September 4, 2025 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Re-enable sync_active_users_remote_repositories

5 participants