Page MenuHomePhabricator

πŸ“ˆ Add metrics: monthly casual users and active users
Closed, ResolvedPublic

Description

Task

Take daily snapshots of the following new metrics:

  • monthly_casual_users (users who had >=1 and <5 actions in the last month)
  • monthly_active_users (users who had >=5 actions in the last month)

Acceptance Criteria

  1. The metrics are added to the table with daily snapshots (implemented in T383421).
  2. They have NULL value for all rows already existing in the table.
  3. The metrics calculate how many users had a certain amount of recent changes in the instance in the last month, excluding the actions from the PlatformReservedUser.

Example of SQL

This SQL returns the expected values for one specific MW database:

SELECT
    COUNT(CASE WHEN activity_count >= 1 AND activity_count < 5 THEN 1 END) AS monthly_casual_users,
    COUNT(CASE WHEN activity_count >= 5 THEN 1 END) AS monthly_active_users
FROM (
    SELECT
        rc.rc_actor,
        COUNT(*) AS activity_count
    FROM
        mwdb_7b97892aeb.mwt_d192403ea5_recentchanges rc
        INNER JOIN mwdb_7b97892aeb.mwt_d192403ea5_actor a ON rc.rc_actor = a.actor_id
	WHERE rc.rc_timestamp >= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 MONTH), '%Y%m%d%H%i%S') -- monthly
		/*
		Conditions below added for consistency with Wikidata: https://phabricator.wikimedia.org/diffusion/ADES/browse/master/src/wikidata/site_stats/sql/active_user_changes.sql
		*/
    	       AND a.actor_user != 0
    	       AND rc.rc_bot = 0
    	       AND (rc.rc_log_type != 'newusers' OR rc.rc_log_type IS NULL)
               GROUP BY rc.rc_actor
      ) AS actor_activity;

Notes

  • We accept that this logic will not consider actions like import of entities from the dashboard as activity, since those are currently also performed by PlatformReservedUser. We will handle this in a separate ticket that assigns such actions to the manager user of the MediaWiki: T383686
  • As previously implemented in T383421, a new row should only appear for the instance and a date if at least one value of any metric in it would be different from the latest available one.

Event Timeline

I removed this condition from the example SQL, as it will now be handled by making PlatformReservedUser a bot in T391479

a.actor_name <> 'PlatformReservedUser' -- exclude main page creation
Tarrow changed the task status from Open to Stalled.May 26 2025, 8:29 AM