Coordinate the updates of IP-using AbuseFilter filters to use `user_unnamed_ip`
Open, Low, Public

Description

Background

When temporary accounts are enabled on a wiki, filters that read IP addresses from user_name will stop working, because user_name will no longer contain an IP. Those filters need to be updated to use user_unnamed_ip instead. The update should be made before temporary accounts are enabled so that filters keep working through the rollout: user_unnamed_ip supports anonymous users and works as expected on wikis without temporary accounts enabled.

Migration steps
  • Ensure someone has access to protected variables (see T369610)
  • Communicate with maintainers about the changes needed to the filters and request that they make them (this is an opportunity to ensure everything works as expected before temporary accounts roll out on the wiki)
  • Filters that haven't been updated will be updated by TSP before temporary accounts are enabled on the wiki
Information for AbuseFilter maintainers
  • Here are the lists of filters needing an update/verification: P77148 (ip_in_range) and P77198 (user_age > 0 and user_age == 0). If you don't have access, reach out to SGrabarczuk (WMF) (talk | email) and share your Phabricator username.
  • See the instructions on how to update the filters. Please mark each wiki where the work is done. If you have any questions, reach out to @STran (limited availability in the week of June 9) or @sgrabarczuk.
  • Ideally before June 11:
    • metawiki
    • cswiki (We had feedback that no actively used filters needed updating)
    • trwiki
    • kowiki
  • Ideally before June 18:
    • frwiki
    • zhwiki
    • fawiki
    • idwiki
    • arwiki
    • viwiki
    • hiwiki
    • eswiki
    • itwiki
    • nlwiki
    • ukwiki
    • ruwiki
    • ptwiki
    • plwiki
    • hewiki
    • jawiki
Deployment stages
  • Minor pilot wikis
  • Major pilot wikis
  • All wikis

Details

Other Assignee
STran

Event Timeline

Tchanders subscribed.

Moving out of the sprint, since we're not expecting to do the major pilots communication this sprint.

As of today, there are 812 filters using ip_in_range.

Today it is 809 (mwscript ./extensions/AbuseFilter/maintenance/SearchFilters.php --wiki=testwiki --pattern=ip_in_range).

Clarification:

[kharlan@deploy2002 ~]$ mwscript ./extensions/AbuseFilter/maintenance/SearchFilters.php --wiki=testwiki --pattern="ip_in_range\(user_name" | wc -l
591
[kharlan@deploy2002 ~]$ mwscript ./extensions/AbuseFilter/maintenance/SearchFilters.php --wiki=testwiki --pattern="ip_in_range\( user_name" | wc -l
13

New condition:

[kharlan@deploy2002 ~]$ mwscript ./extensions/AbuseFilter/maintenance/SearchFilters.php --wiki=testwiki --pattern="ip_in_range\(user_unnamed_ip" | wc -l
2
[kharlan@deploy2002 ~]$ mwscript ./extensions/AbuseFilter/maintenance/SearchFilters.php --wiki=testwiki --pattern="ip_in_range\( user_unnamed_ip"
0
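The two spacing variants counted above can also be covered by a single pattern. The following is only a sketch for checking filter text locally in Python; it makes no claim about which regex features the --pattern option of SearchFilters.php accepts, and the sample strings and the name uses_legacy_ip_check are made up for illustration.

```python
import re

# Optional whitespace after the opening parenthesis covers both spacing variants.
LEGACY_IP_CHECK = re.compile(r"ip_in_range\(\s*user_name\b")


def uses_legacy_ip_check(filter_text: str) -> bool:
    """Return True if the filter text passes user_name to ip_in_range."""
    return LEGACY_IP_CHECK.search(filter_text) is not None


# Hypothetical filter snippets, not taken from any real filter.
samples = [
    'ip_in_range(user_name, "192.0.2.0/24")',
    'ip_in_range( user_name, "192.0.2.0/24")',
    'ip_in_range(user_unnamed_ip, "192.0.2.0/24")',
]

for text in samples:
    print(uses_legacy_ip_check(text), text)
```

The first two samples are reported as needing an update; the third already uses user_unnamed_ip and is not.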

@sgrabarczuk have communications gone out to AbuseFilter maintainers about the need to update filters? Per this line in the task description:

Communicate with maintainers about the changes needed to the filters and request that they make them (this is an opportunity to ensure everything works as expected before temporary accounts roll out on the wiki)

Not yet - I believe we will send the request in English this week, and in other languages next week.

Restricted Application added a subscriber: alaa. (Apr 5 2025, 8:28 PM)

I'd also like to point out that filters using user_age > 0 and user_age == 0 will need to be updated as well. These don't appear to be included in the list given to us.

No, the original query focused only on usages of ip_in_range. The user_age comparisons are trickier because they act as classifiers for account types. I've run a query to find them, omitting any that the original queries should already have caught, and let Szymon know, as I think he's coordinating sharing the list. Thanks!

In case someone was wondering, the replacement for user_age comparisons is user_type (test using ==, != or equals_to_any).
For example, user_age == 0 would match only IP addresses and possibly the first edit attempt of a temporary account (but this is unreliable). If it should match temporary accounts unconditionally, use equals_to_any(user_type, 'ip', 'temp') for backward compatibility, or user_type == 'temp' as soon as temporary accounts are definitely established.

By the way, filters using user_editcount could be affected, too.

Finished updating filters on jawiki. I've shared the following note with the local community. This may be helpful for other maintainers as well.


user_name, user_unnamed_ip

Previously, the user_name variable returned either an IP address or a registered user's account name. After the deployment of Temporary Accounts, it will return either a temporary account's or a registered user's account name. As a result, ip_in_range(user_name, "...") will no longer function properly, since user_name will no longer be an IP, and the return value will always be false. We’ll need to use user_unnamed_ip instead, like ip_in_range(user_unnamed_ip, "...").

It's important to note that user_unnamed_ip is a protected variable. Once a protected variable is used and saved in a filter, non-admin users will no longer be able to view or edit that filter, nor view detailed logs related to it. Also, once a filter is flagged as protected, this cannot be undone (to prevent IP-related conditions from being accessed via filter history). Therefore, user_unnamed_ip should only be used if the filter is already protected or if the condition is critical to the core logic. Using this variable for secondary or non-essential conditions will unnecessarily flag the filter as protected.
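To illustrate why the check stops matching, here is a rough Python stand-in for ip_in_range. It is not the AbuseFilter implementation; it only mirrors the behaviour described above, where a value that is not an IP address yields false. The temporary account name and the IP range are hypothetical examples.

```python
import ipaddress


def ip_in_range(value: str, cidr: str) -> bool:
    """Simplified model: False whenever value is not a valid IP address."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return False
    return address in ipaddress.ip_network(cidr)


# Hypothetical values: what user_name holds after the rollout vs. the actual IP.
temp_account_name = "~2025-12345-6"
editor_ip = "192.0.2.17"

print(ip_in_range(temp_account_name, "192.0.2.0/24"))  # user_name after rollout
print(ip_in_range(editor_ip, "192.0.2.0/24"))          # user_unnamed_ip
```

The first call returns False because the account name cannot be parsed as an IP; the second returns True because the address falls inside the range.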


user_age

If the editor is:

  • IP address: 0
  • Temporary account: integer (≥ 0)
  • Registered user: integer (≥ 0)

Unlike IP addresses, temporary accounts have an incrementing user_age value, just like registered users. This means that it will no longer be possible to distinguish anonymous users from registered users using user_age === 0 or user_age > 0. Instead, we must use user_type.
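As a small illustration, the Python model below compares the old user_age test with a user_type test. The Editor class and the sample ages are assumptions made for this sketch; only the variable names and the values 'ip', 'temp' and 'named' come from the discussion here.

```python
from dataclasses import dataclass


@dataclass
class Editor:
    user_type: str  # 'ip', 'temp' or 'named'
    user_age: int   # 0 for IP editors, growing for temporary and registered accounts


def old_is_registered(editor: Editor) -> bool:
    """Old approach: treat any non-zero account age as a registered user."""
    return editor.user_age > 0


def new_is_registered(editor: Editor) -> bool:
    """Replacement: ask for the account type directly."""
    return editor.user_type == "named"


# Illustrative editors with made-up ages.
editors = [
    Editor("ip", 0),
    Editor("temp", 3600),
    Editor("named", 86400),
]

for editor in editors:
    print(editor.user_type, old_is_registered(editor), new_is_registered(editor))
```

The two functions agree for the IP and the registered editor, but the old test counts the temporary account as registered, which is the case the filters need to handle.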


user_editcount, global_user_editcount

If the editor is:

  • IP address: null
  • Temporary account: integer
  • Registered user: integer

Since int(null) === 0, a condition like user_editcount < 50 previously matched "IP users or registered users with fewer than 50 edits". However, because temporary account edit counts increment like those of registered users, this logic will no longer be valid. We must instead use a condition like (user_type !== 'named' | user_editcount < 50), meaning "users who are not registered, or (registered users) who have made fewer than 50 edits". The same applies to global_user_editcount.
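The difference can be seen in a small truth table. The Python functions below are a simplified model of the two conditions, not AbuseFilter code; None stands in for null, and the sample edit counts are invented.

```python
from typing import Optional


def as_int(value: Optional[int]) -> int:
    """Model of the null-to-integer cast: a missing edit count counts as 0."""
    return 0 if value is None else value


def old_condition(user_type: str, user_editcount: Optional[int]) -> bool:
    """Models: user_editcount < 50"""
    return as_int(user_editcount) < 50


def new_condition(user_type: str, user_editcount: Optional[int]) -> bool:
    """Models: (user_type !== 'named' | user_editcount < 50)"""
    return user_type != "named" or as_int(user_editcount) < 50


# Made-up editors: (user_type, user_editcount)
cases = [
    ("ip", None),
    ("temp", 120),
    ("named", 10),
    ("named", 120),
]

for user_type, count in cases:
    print(user_type, count, old_condition(user_type, count), new_condition(user_type, count))
```

Only the temporary account with 120 edits changes: the old condition lets it through, while the new one still treats it as an unregistered user.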


user_groups

If the editor is:

  • IP address: [ 0 => '*' ]
  • Temporary account: [ 0 => '*', 1 => 'temp' ]
  • Registered user: [ 0 => '*', 1 => 'user', ... ]

This variable can still be used as before. However, in code like (user_editcount < 50 | !'autoconfirmed' in user_groups), it is technically more correct to write (!'autoconfirmed' in user_groups | user_editcount < 50).

  • (user_editcount < 50 | !'autoconfirmed' in user_groups)
    • Before temp accounts: "IP users, or registered users with fewer than 50 edits, or registered users not yet autoconfirmed"
    • After temp accounts: "Temporary or registered users with fewer than 50 edits (excludes temp users with ≥ 50 edits), or users not autoconfirmed"

→ For anonymous users, the first operand returns true before the temp account rollout, and the second operand returns true after the rollout.

  • (!'autoconfirmed' in user_groups | user_editcount < 50)

→ "Users who are not autoconfirmed (includes all temp users and some registered users), or registered users with fewer than 50 edits"


Just to clarify again, this is simply a translation of the note I shared with the community.

Thanks for posting this information!

Unfortunately, for the Spanish Wikipedia, there are five private filters which might have required the change before June 18, but unlike sysops, eswiki's non-admin edit filter managers do not have abusefilter-access-protected-vars.

eswiki is not part of this week's deployment per T340001: [Epic] Deployment plan for Temporary Accounts / T396465 so they still have time to update their filters (and add the necessary permissions to their abuse filter group like enwiki T380332).

  • idwiki
  • arwiki
  • viwiki

Ahead of the deployment tomorrow, these were the only wikis not ticked off. @sgrabarczuk confirmed that we should update the idwiki and arwiki filters, but not the viwiki filters, and still deploy to those three wikis tomorrow.

We updated the filters as follows (thanks @STran for helping):

idwiki

Out of the 4 filters that were flagged, 2 were updated and now use protected variables, 1 was updated without protected variables, and 1 was unchanged.

arwiki

Out of the 19 filters that were flagged, 1 was updated and now uses protected variables, 7 were updated without protected variables, and 11 were unchanged.

In many cases, the filters that did not need changing used !( "user" in user_groups ) which will already find temporary accounts as well as anonymous IP users, since temporary accounts do not have the user group.
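A quick Python model of that check, using the group lists from the jawiki note earlier in this thread. Plain list membership is used here as a simplification of how AbuseFilter evaluates in, and the dictionary name is made up for the sketch.

```python
# Group memberships per editor type, as listed in the jawiki note above.
USER_GROUPS_BY_TYPE = {
    "ip": ["*"],
    "temp": ["*", "temp"],
    "named": ["*", "user"],
}


def is_unregistered(user_groups: list) -> bool:
    """Models: !( "user" in user_groups )"""
    return "user" not in user_groups


for user_type, groups in USER_GROUPS_BY_TYPE.items():
    print(user_type, is_unregistered(groups))
```

Both the IP editor and the temporary account are reported as unregistered, so a filter written this way keeps its meaning after the rollout.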

viwiki is done. Thanks for your patience.

I just left a message on the Spanish Wikipedia explaining the upcoming changes needed in their abuse filters to support temporary accounts, such as changing user_age > 0 to user_type == "named". You can read more at https://es.wikipedia.org/wiki/Wikipedia:Filtro_de_ediciones/Implementación#c-Codename_Noreste-20250809060400-Implementación_de_soporte_para_cuentas_temporales_y_actualización_de_user_age.

I've come across a filter created by me in 2024 where I forgot user_type === 'ip'. Given that the variable has been available for over a year, it may be worth scanning all filters for this and similar expressions as well.

Would it be better to use user_type != "named"? This simple check should be able to detect all unregistered users.

Yes, that is an appropriate replacement in practice. user_type also allows 'external' and 'unknown', but these are usually found in retrospect.

I am noting here for the record that the English Wikipedia has done those changes, per the EFN.