Task
Take daily snapshots of the following new metrics:
- monthly_casual_users (users who had >=1 and <5 actions in the last month)
- monthly_active_users (users who had >=5 actions in the last month)
Acceptance Criteria
- The metrics are added to the table with daily snapshots (implemented in T383421).
- They have NULL value for all rows already existing in the table.
- The metrics calculate how many users had a certain amount of recent changes in the instance in the last month, excluding the actions from the PlatformReservedUser.
Example of SQL
This SQL returns the expected values for one specific MW database:
SELECT
COUNT(CASE WHEN activity_count >= 1 AND activity_count < 5 THEN 1 END) AS monthly_casual_users,
COUNT(CASE WHEN activity_count >= 5 THEN 1 END) AS monthly_active_users
FROM (
SELECT
rc.rc_actor,
COUNT(*) AS activity_count
FROM
mwdb_7b97892aeb.mwt_d192403ea5_recentchanges rc
INNER JOIN mwdb_7b97892aeb.mwt_d192403ea5_actor a ON rc.rc_actor = a.actor_id
WHERE rc.rc_timestamp >= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 MONTH), '%Y%m%d%H%i%S') -- monthly
/*
Conditions below added for consistency with Wikidata: https://phabricator.wikimedia.org/diffusion/ADES/browse/master/src/wikidata/site_stats/sql/active_user_changes.sql
*/
AND a.actor_user != 0
AND rc.rc_bot = 0
AND (rc.rc_log_type != 'newusers' OR rc.rc_log_type IS NULL)
GROUP BY rc.rc_actor
) AS actor_activity;Notes
- We accept that this logic will not consider actions like import of entities from the dashboard as activity, since those are currently also performed by PlatformReservedUser. We will handle this in a separate ticket that assigns such actions to the manager user of the MediaWiki: T383686
- As previously implemented in T383421, a new row should only appear for the instance and a date if at least one value of any metric in it would be different from the latest available one.