Edit check/Tone Check/Model evaluation
This page provides coordination details for the community evaluation of the model used for Tone Check (formerly known as Peacock Check).
Each evaluator will review a minimum of 30 diffs per language using Annotool. Each evaluator has access to 100 diffs per language, and evaluators are encouraged to review as many diffs as possible.
October 2025 evaluation
The second round of evaluations includes the languages listed below. This test will start on October 3 and last for one week. You will be asked to review the diffs before October 10, end of day.
If needed, one of the languages listed below may be moved to the next round of evaluation.
Please add your name to the list to be contacted for a test. We are looking for a minimum of 5 users for each language; the more evaluators we have, the better.
Please do not add other languages.
Edits presented for evaluation may show other types of issues (spam, bad formatting, bad linking...). If the issue identified is not a tone issue, mark the edit as "Leave as it is". Please focus only on the tone of the edit.
Arabic
Evaluation done
Czech
Evaluation done
German
Hebrew
Evaluation done
Indonesian
Evaluation done
Italian
Evaluation done
Dutch
- Effeietsanders (talk) 11:47, 2 October 2025 (UTC) (only when the dataset becomes more relevant - see talkpage.)
Polish
Russian
Evaluation done
Turkish
Evaluation done
Chinese
Evaluation done
Farsi
Evaluation done
Norwegian
Evaluation done
Romanian
Evaluation done
Latvian
May 2025 evaluation
The first round of evaluations includes the following languages: English, Spanish, Japanese, Portuguese, and French. This test will start on May 23 and last for one week. You will be asked to review the diffs before May 30, end of day.
If needed, one of the languages listed above may be moved to the next round of evaluation.
Please add your name to the list to be contacted for a test. We are looking for a minimum of 5 users for each language; the more evaluators we have, the better.
Please do not add other languages.
The results of the May 2025 evaluation have been published.
English
Evaluation done.
- PMG (talk) 20:10, 15 May 2025 (UTC)
- Chipmunkdavis (talk) 03:53, 16 May 2025 (UTC)
- Sdkb talk 17:25, 16 May 2025 (UTC)
- NightWolf1223 (talk) 23:27, 19 May 2025 (UTC)
- Parksfan1955 (talk) 03:02, 20 May 2025 (UTC)
- The Grid (talk) 03:47, 20 May 2025 (UTC)
- ~/Bunnypranav:<ping> 04:56, 20 May 2025 (UTC)
- Sophisticatedevening🍷(talk) 16:09, 20 May 2025 (UTC)
- --Xandru4 (talk) 16:08, 22 May 2025 (UTC)
- Meritkosy (talk) 05:24, 23 May 2025 (UTC)
- Fuzheado (talk) 11:49, 23 May 2025 (UTC)
- Dotruonggiahy12 (talk) 18:50, 26 May 2025 (UTC)
Spanish
Evaluation done.
- Soylacarli (talk) 16:02, 19 May 2025 (UTC)
- Elwinlhq (talk) 16:13, 19 May 2025 (UTC)
- Omar sansi (talk) 05:30, 22 May 2025 (UTC)
- Pintakuda (talk) 15:16, 23 May 2025 (UTC)
- Hard --Hard (talk) 13:53, 22 May 2025 (UTC)
- Felino Volador (talk) 15:28, 22 May 2025 (UTC)
- DidiCoronel (talk) 17:40, 22 May 2025 (UTC)
- Silva Selva (talk) 03:14, 23 May 2025 (UTC)
Japanese
Evaluation done.
- Afaz (talk) 01:06, 20 May 2025 (UTC)
- Hexirp (talk) 15:27, 21 May 2025 (UTC)
- VZP10224 (talk) 06:33, 24 May 2025 (UTC)
- さえぼー (talk) 06:40, 24 May 2025 (UTC)
- Wadakuramon (talk) 07:05, 24 May 2025 (UTC)
- ...
Portuguese
Evaluate the model in Portuguese
- Arthur Timm (talk) 07:51, 17 May 2025 (UTC)
- Parzeus (talk) 16:05, 19 May 2025 (UTC)
- Vazafirst (talk) 19:05, 22 May 2025 (UTC)
- CorraleH (talk) 17:59, 23 June 2025 (UTC)
- ...
French
Contrary to what was announced on May 19, the French language model is ready for the first round.
Evaluation done.
- Goombiis (talk) 10:05, 20 May 2025 (UTC)
- --Pa2chant.bis (talk) 16:50, 20 May 2025 (UTC)
- Frenouille (talk) 07:10, 21 May 2025 (UTC)
- PCWanonyme (talk) 09:40, 21 May 2025 (UTC)
- LD (talk) 13:07, 22 May 2025 (UTC)
- --Omnilaika02 (talk) 13:58, 22 May 2025 (UTC)
- …
Arabic
Contrary to what was announced on May 19, the Arabic language model is not yet ready for the first round. We are sorry about this last-minute change. The Arabic language is part of the October 2025 evaluation.
Volunteer for future evaluations
If you would like to participate in a future round of model evaluation for a language that hasn't been listed above, please add your username below along with the language(s) you can provide support for.
- az: Nemoralis
- hy: Mari_Avetisyan_WMAM
- mk: Ehrlich91
- tl: Sky Harbor
- uz: Panpanchik
- kaa: Janabaevazizbek
- ar: Mr. Ibrahem