Edit check/Tone Check - MediaWiki
From mediawiki.org
Wikimedia Foundation projects
Tone Check
Prompt people to write using a neutral tone.
Group: Editing, Machine Learning
Start: 2025-01-29
Team members: Aiko Chou, Benoît Evellin, Diego Saez-Trumper, Georgios Kyziridis, Megan Neisler, David Chan, David Lynch, Esther Akinloose, Marielle Volz, Rummana Yasmeen, Nicolas Ayoub, Ed Sanders, Zoe
Backlog: #EditCheck
Lead: Sucheta Salgaonkar, Peter Pelberg
Management: Valerie Puffet-Michel (engineering), Ilias Sarantopoulos
Tracked in Phabricator: Task T365301
This page holds the work the Editing Team is doing in collaboration with the Machine Learning Team to develop Tone Check (formerly Peacock Check).
Tone Check is an Edit Check that uses a language model to prompt people adding promotional, derogatory, or otherwise subjective language to consider "neutralizing" the tone of what they are writing.
A notable aspect of this project: Tone Check is the first Edit Check that uses machine learning. In this case, a BERT language model, initially selected and fine-tuned by the Research team, identifies biased language within the new text people are attempting to publish to Wikipedia.
It is possible to test Tone Check at various wikis. The Edit check help page documents how Tone Check works and how it can be used.
To participate in and follow this project's development, we recommend adding this page to your watchlist.
Status
Last update: 31 March 2026
Currently being worked on
Exploring how Tone Check could help patrollers/reviewers make subtle forms of vandalism easier to detect.
Deploying Tone Check at the French, Japanese, and Portuguese Wikipedias, following the positive results of the controlled experiment.
Evaluating the effectiveness of the editcheck-tone-shown tag.
Feedback opportunities
Patrollers/reviewers: how might Tone Check help patrollers/reviewers make subtle forms of vandalism easier to detect? See conversation.
Configurability: what aspects of Tone Check will be configurable on-wiki? See T393820.
Planning
Expanding the Tone Check model's language coverage.
Please visit Edit check#Status to gain a more granular understanding of where the development stands.
Objectives
Tone Check is intended to simultaneously:
Cause newer volunteers acting in good faith to add new information to Wikipedia's main namespace that is written in a neutral tone.
Reduce the effort and attention experienced volunteers need to allocate towards ensuring text in the main namespace is written in a neutral tone.
Background
Writing in a neutral tone is an important part of Wikipedia's neutral point of view policy. It is also a practice many new volunteers find unintuitive.
An October 2024 analysis of the new content edits newer volunteers published to English Wikipedia found:
56% of the new content edits newer volunteers published contained peacock words.
29% of the new content edits newer volunteers published that contained peacock words were reverted.
New content edits containing peacock words were 46.7% more likely to be reverted than new content edits without peacock words.
With the above in mind, Tone Check is meant to address two core issues:
Newcomers publishing edits to Wikipedia that contain promotional, derogatory, or otherwise subjective language because they lack the awareness that this kind of editing is not aligned with Wikipedia policies.
Experienced volunteers being burdened by the effort and attention they need to allocate towards patrolling and reviewing preventable damage made in good faith. This can come at the expense of identifying and addressing more subtle and complex forms of vandalism.
Tone Check is designed to address these two issues by:
Offering new(er) volunteers feedback while they are editing so they can avoid unintentionally publishing edits that are likely to violate policies.
Offering patrollers/reviewers deeper insight into the edits they are reviewing (and the intentions of the people publishing them) by logging the moderation feedback new(er) editors are being presented with, and the actions they do/do not take in response.
Design
This section is currently a draft. Material may be incomplete, and parts of the content may change rapidly. More information may be available on the talk page.
Screenshot showing the proposed Tone Check card on mobile.
User experience
Tone Check is a contextual intervention designed to equip new(er) volunteers editing in good faith with the awareness and know-how needed to ensure the tone of the text they are adding is aligned with Wikipedia policies.
The Check is intentionally minimal, appears only when necessary, and aims to support the policies wikis have individually defined without blocking contributions.
When Tone Check is shown
Tone Check becomes activated when a contributor (who meets the configuration criteria communities will be able to set) adds new text that the underlying machine learning model – trained on Wikipedia edits – identifies as potentially biased or promotional.
Specifically:
The check activates after the user finishes editing a paragraph and clicks or taps outside of it.
If the system detects promotional, derogatory, or otherwise subjective language, the Tone Check card appears.
The Edit Check card is displayed in the side container on both desktop and mobile devices.
This lightweight, non-blocking format allows contributors to stay in flow while being made aware that something they have done warrants additional attention.
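The activation conditions above can be sketched as a simple gate. This is a hypothetical illustration, not the extension's actual API: the function and parameter names are made up, while the ≤100-edit criterion and the 0.8 probability threshold come from the experiment configuration described later on this page.

```python
# Hypothetical sketch of Tone Check's activation gating. Names are
# illustrative; only the edit-count criterion and the 0.8 threshold
# reflect values documented on this page.
def should_show_tone_check(user_edit_count, added_text, model_score,
                           max_edit_count=100, threshold=0.8):
    if user_edit_count > max_edit_count:   # community-set configuration criteria
        return False
    if not added_text.strip():             # the check only fires on new text
        return False
    return model_score >= threshold        # model confidence gate
```

In this sketch the check is shown only when all three conditions hold; declining or revising afterwards is handled by the card's UI, not by this gate.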
Placement and interaction
Tone Check appears at two key points in the editing workflow:
Mid-edit: if detected while the contributor is actively editing, Tone Check appears immediately in the side panel (mobile and desktop).
Pre-save: if not acted upon earlier, Tone Check appears again after the contributor clicks or taps "Publish changes", during the proofreading step.
Edit Check card
When Tone Check is activated, contributors see an Edit Check card with:
A short explanation that the flagged language is often revised by other editors for a more balanced tone.
A “Learn more” link to access additional context about Wikipedia’s tone policies and guidelines.
Two actions:
Revise: return to editing and update the highlighted text.
Decline: proceed as-is, after selecting a reason for not revising.
A disclaimer noting that a small language model was used to detect tone-related issues in the text.
Design intent and principles
Tone Check is grounded in the following design principles:
No firm rules:
The tool suggests, but does not force, changes. Contributors can always choose to decline or proceed without editing.
Keep users in flow:
The experience is embedded within the natural flow of editing and publishing, with lightweight prompts that avoid blocking or clutter.
Meet users where they are:
Feedback is specific to the paragraph being edited and is framed using language easy to understand and grounded in Wikipedia norms.
Transparent:
A clear disclaimer and consistent design patterns ensure transparency about how suggestions are generated.
Language selection
This section will include the languages we're prioritizing for the initial experiment, the languages we're planning to scale to next, and why we came to select these languages. See phab:T388471.
Model
Tone Check leverages a Small Language Model (SLM) to detect the presence of promotional, derogatory, or otherwise subjective language. The SLM we are using is a BERT model, which is open source and whose weights are openly available.
The model works by being fine-tuned on examples of Wikipedia revisions. It learns from instances where experienced editors have applied a specific template ("peacock" and equivalent templates) to flag tone violations, as well as instances where that template was removed. This process teaches the BERT model to identify patterns associated with appropriate and inappropriate tones based on Wikipedia's editorial standards.
Under the hood, SLMs work by transforming text into high-dimensional vectors, which are then compared with the label, allowing the model to find a hyperplane that separates positive from negative cases.
The model was trained using 20,000 data points from 10 languages consisting of:
Positive examples: revisions on Wikipedia that were marked with the "peacock" template, indicating a tone policy violation.
Negative examples: revisions where the "peacock" template had been removed (signifying no policy violation).
Small Language Models (SLMs) – like the one being used for Tone Check – differ from Large Language Models (LLMs) in that the former are trained to adapt to particular use cases by learning from a focused dataset. In the case of Tone Check, this means the SLM learns directly from the expertise of experienced Wikipedia volunteers. Hence, SLMs offer more explainability and flexibility compared to LLMs, and they require significantly fewer computational resources than their larger counterparts.
LLMs, on the other hand, are designed for general-purpose use, with limited context and through a chat or prompting interface. They require a huge amount of computational resources, and their behavior is difficult to explain due to the high number of parameters involved.
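The "vectors plus separating hyperplane" idea can be illustrated with a toy example. This is emphatically not the production BERT model: it is a bag-of-words logistic regression trained on a few made-up sentences, but it shows the same mechanism, in which texts become vectors and training finds a linear boundary between "peacock" and neutral cases.

```python
# Toy illustration (NOT the production model): texts are turned into
# bag-of-words vectors, and logistic regression learns a separating
# hyperplane between promotional (1) and neutral (0) examples.
import math

TRAIN = [
    ("a legendary and world famous pioneer", 1),   # promotional tone
    ("the greatest and most celebrated artist", 1),
    ("born in 1950 in lyon", 0),                   # neutral tone
    ("the company was founded in 1990", 0),
]

vocab = sorted({w for text, _ in TRAIN for w in text.split()})

def vectorize(text):
    words = text.split()
    return [float(words.count(w)) for w in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Fit weights w and bias b with plain stochastic gradient descent.
w = [0.0] * len(vocab)
b = 0.0
lr = 0.5
for _ in range(200):
    for text, label in TRAIN:
        x = vectorize(text)
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        err = p - label
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        b -= lr * err

def score(text):
    """Probability that the text has a tone issue, per the toy model."""
    x = vectorize(text)
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

The real model replaces the bag-of-words step with contextual BERT embeddings, which is what lets it handle wording it has never seen verbatim.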
Evaluating the model
Two evaluations of the model
Before measuring the impact of the overall Tone Check experience through a controlled experiment in production, the team conducted two evaluations comparing the model's predictions to human-provided labels. Outlined below is information about the purpose of each evaluation and what we found.
Internal evaluation
Goals
The first evaluation we conducted was internal, involving just the WMF product teams who were working on this feature. This review was meant to:
Evaluate whether the model aligned with human decisions often enough that we could consider its predictions reliable enough to move forward with a community-involved evaluation process.
Figure out a prediction probability score threshold above which we could consider the model's predictions fairly accurate.
Expose any edge cases or specific types of edits in which the model consistently does not perform well.
Process
To assess the above, the team:
Created a list of 300 sample edits from English Wikipedia.
Assigned about 30 edits to each of the participants from our teams.
Asked each participant to go through the sample edits and indicate whether or not they contained promotional, derogatory, or otherwise subjective language that should be flagged by the Tone Check.
Compared the model's predictions to the human-provided labels.
Analyzed the cases where the model's predictions differed from the human-provided labels.
Findings
In English, false negatives (cases where the model predicts there isn't a tone check issue, but a human says there is) are very easily filtered out if we only return predictions with a probability score over 0.55.
In English, most false positives (cases where the model predicts that there is a tone check issue, but a human says there isn't) can be filtered out if we only return predictions with a probability score over 0.8.
There are some types of edits that the model has a hard time with, like edits that include a quote where the quoted language is non-neutral in tone. In these cases, the model's predictions had a lower probability score.
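The threshold analysis described above can be sketched as a sweep over labeled predictions: raise the cutoff, then count where the surfaced predictions disagree with the human labels. The record values below are made up for illustration.

```python
# Sketch of the threshold analysis: only "return" predictions whose
# probability clears a cutoff, then count disagreements with humans.
# Each record is (model_probability, human_says_tone_issue).
def evaluate(records, threshold):
    shown = fp = fn = 0
    for prob, human_says_issue in records:
        model_flags = prob >= threshold
        if model_flags:
            shown += 1
            if not human_says_issue:
                fp += 1          # model flags, human disagrees
        elif human_says_issue:
            fn += 1              # model stays quiet, human sees an issue
    return {"shown": shown, "false_positives": fp, "false_negatives": fn}
```

Running this at several cutoffs (e.g. 0.55 vs. 0.8) is how one would pick the operating threshold the findings above refer to.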
Volunteer evaluation
An example of a diff volunteers used to review the predictions the Tone Check model makes.
The results of the internal evaluation gave us confidence to move forward with an external review involving experienced volunteers. We had enough positive examples (as defined above) to continue evaluating the model in English, French, Japanese, Portuguese, and Spanish.
Goals
This second review was meant to:
Help us confirm that experienced volunteers agree with what the model identifies as promotional, derogatory, or otherwise subjective language.
Evaluate whether the model's predictions about edits in French, Japanese, Portuguese, and Spanish are as reliable as they are about edits in English.
Process
To assess the above, the team:
Created a list of 100 sample edits from each of the aforementioned Wikipedias.
Invited participants from each Wikipedia community to sign up and participate.
Provided the participants with a tool they could use to review and label each of the sample edits in the language(s) they were helping with.
Asked each participant to review and label at least 30 sample edits.
Compared the model's predictions to the human-provided labels.
Analyzed the cases where the model's predictions differed from the human-provided labels.
Findings
At the probability threshold the model would need to reach for a Tone Check to be shown during an edit session (0.80), volunteers across the 5 languages who participated in the initial model review agreed with the model's detection of a tone issue in 95% of cases.
More details about the results from each of the 5 languages included in the initial volunteer review follow.
English
Revisions reviewed: 391. Unique participants: 13.
High-level findings: 5% of reviews were false positives; no false positives with a probability above 0.67.
Recommendation: Continue conversations with volunteers about the potential risks of the feature and ideas for how we might mitigate and manage them.
How we made this recommendation: It was rare for the model to flag an edit for a tone issue when volunteer reviewers said there wasn't one - this only happened about 5% of the time. When it did happen, the model wasn't very confident in its prediction: its probability score was below the threshold (0.8) that we'd use in a real-world setting.
Spanish
Revisions reviewed: 285.
High-level findings: 3% of reviews were false positives with a probability of 0.8 or above, and there were many "false positive" cases where the added text did actually contain biased language. 2 samples (with 3 total reviews) were positive for unclear reasons that needed to be investigated (probabilities 0.87 and 0.82). Both were samples where only a phrase was added to a paragraph, and the added phrase did not contain non-neutral language.
Recommendation: Proceed with the 0.8 probability score threshold and recommend es.wiki evaluates the feature through an A/B test.
How we made this recommendation: Most of the time, when the model confidently (with a probability score of ≥0.8) flagged an edit for a tone issue, the volunteers who reviewed that edit agreed - it was indeed a problem. In the small number of cases (3%) where the model confidently flagged an issue but at least one volunteer disagreed, there wasn't a clear consensus among the volunteer reviewers. Even then, most volunteer reviewers still sided with the model. These cases often involved subjective or opinionated phrases, like "uno de los doctores mas importantes" ("one of the most important doctors") and "desarrollo un paupérrimo torneo" ("he had a very poor tournament").
Japanese
Revisions reviewed: 228.
High-level findings: Fewer high-probability predictions overall, compared to other languages; only 3% of samples saw a probability over 0.8. Higher proportion of false positives (27%), but none had a probability of 0.8 or above. Two false positives had a probability score of 0.7 or above, and in both cases, ≥1 other volunteer reviewers agreed with the model.
Recommendation: Proceed with a 0.7 probability score threshold* and propose ja.wiki evaluates the feature through an A/B test. (*This recommendation assumes ja.wiki is generally open to Tone Check; if it is more conservative, we recommend a higher threshold to minimize false positives.)
How we made this recommendation: The model didn't make very many high-confidence predictions for Japanese - only 3% of predictions had a probability score above 0.8, which was the threshold we had planned to use in the production experiment. Because so few predictions reached that level of confidence, we recommend lowering the threshold to 0.7 for Japanese. Importantly, none of the predictions with a score above 0.8 flagged a tone issue in edits that volunteers thought were fine. There were two cases where the model predicted a tone issue at probability scores of 0.7 and 0.75, and at least one human reviewer disagreed. But in both cases, there was no clear agreement among the reviewers - at least one reviewer agreed with the model's assessment.
Portuguese
Revisions reviewed: 22.
High-level findings: More reviews required for results to be conclusive. No false positives (out of 22 reviews). All model predictions were above 0.8 probability or below 0.69 probability.
Recommendation: Proceed with the 0.8 probability score threshold and propose pt.wiki evaluates the feature through an A/B test. In parallel, recruit more volunteers to review the model.
How we made this recommendation: We only received 22 reviews, which wasn't enough for a thorough evaluation. Of the edits reviewed, 50% had a model probability score above 0.8, so we're not too worried about recall. Additionally, in the cases where the model predicted a tone issue, human reviewers always agreed, meaning there were no false positives.
French
Revisions reviewed: 369.
High-level findings: 8% of reviews were false positives; 4% of reviews were false positives with a model probability score of 0.8 and above. Among the false positives with probability above 0.75, there were no examples where volunteers unanimously disagreed with the model.
Recommendation: Proceed with the 0.8 probability score threshold and propose fr.wiki evaluates the feature through an A/B test.
How we made this recommendation: Most of the time, when the model confidently flagged an edit for a tone issue, volunteers agreed - it was indeed a problem. In the small number of cases (4%) where the model confidently flagged an issue but at least one volunteer disagreed, there wasn't a clear consensus among the human reviewers. Even then, many reviewers still agreed with the model. These cases often involved subjective or opinionated phrases, like "sa démarche picturale novatrice, paradoxale et indépendante" ("his innovative, paradoxical and independent approach to painting").
User experience
The viability of Tone Check, like the broader Edit Check project, depends on the feature being able to simultaneously:
Reduce the moderation workload experienced volunteers carry.
Increase the rate at which new(er) volunteers contribute constructively.
To evaluate the extent to which Tone Check is effective at the above, the team will be conducting qualitative and quantitative experiments. Below you will find:
Impacts the features introduced as part of Edit Check are intended to cause and avert.
Data we will use to help determine the extent to which a feature has/has not caused a particular impact.
Evaluation methods we will use to gather the data necessary to determine the impact of a given feature.
Desired Outcomes
1. Key performance indicator: The quality of new content edits newcomers and Junior Contributors make in the main namespace will increase because a greater percentage of these edits will not contain peacock language.
Data: Proportion of all new content edits published without biased language; proportion of new content edits that are not reverted.
Evaluation method(s): A/B test, qualitative feedback (e.g. talk page discussions, false positive reporting).
2. Key performance indicator: Newcomers and Junior Contributors will experience Peacock Check as encouraging because it will offer them more clarity about what is expected of the new information they add to Wikipedia.
Data: Proportion of new content edits started (defined as reaching the point that Peacock Check was or would be shown) that are successfully published (not reverted).
Evaluation method(s): A/B test, qualitative feedback (e.g. usability tests, interviews, etc.).
3. New account holders will be more likely to publish an unreverted edit to the main namespace within 24 hours of creating an account because they will be made aware that the new text they're attempting to publish needs to be written in a neutral tone, when they don't first think/know to write in this way themselves.
Data: Proportion of newcomers who publish ≥1 constructive edit in the Wikipedia main namespace on a mobile device within 24 hours of creating an account (constructive activation).
Evaluation method(s): A/B test.
4. Newcomers and Junior Contributors will be more aware of the need to write in a neutral tone when contributing new text because the visual editor will prompt them to do so in cases where they have written text that contains peacock language.
Data: The proportion of newcomers and Junior Contributors that publish at least one new content edit that does not contain peacock language.
Evaluation method(s): A/B test.
5. Newcomers and Junior Contributors will be more likely to return to publish a new content edit in the future that does not include peacock language because Peacock Check will have caused them to realize when they are at risk of this not being true.
Data: Proportion of newcomers and Junior Contributors that publish an edit Peacock Check was activated within and successfully return to make an unreverted edit to the main namespace during the identified retention period; proportion that return to make a new content edit without non-neutral language to a page in the main namespace during the identified retention period.
Evaluation method(s): A/B test.
Undesirable Outcomes
1. Edit quality decreases.
Data: Proportion of published edits that add new content and are still reverted within 48 hours. Note: will include a breakdown of the revert rate of published new content edits with and without non-neutral language.
Evaluation method(s): A/B test and leading indicators analysis.
2. Edit completion rate drastically decreases.
Data: Proportion of new content edits started (defined as reaching the point that Peacock Check was or would be shown) that are published. Note: will include a breakdown by the number of checks shown to identify if a lower completion rate corresponds with a higher number of checks shown.
Evaluation method(s): A/B test and leading indicators analysis.
3. Edit abandonment rate drastically increases.
Data: Proportion of edits that are started (event.action = init) that are successfully published (event.action = saveSuccess).
Evaluation method(s): A/B test and leading indicators analysis.
5. People shown Tone Check are blocked at higher rates.
Data: Proportion of contributors blocked after publishing an edit where Tone Check was shown compared to contributors not shown the Tone Check.
Evaluation method(s): A/B test and leading indicators analysis.
6. High false positive rates.
Data: Proportion of contributors that decline revising the text they've drafted and indicate that it was irrelevant.
Evaluation method(s): A/B test, leading indicators analysis, and qualitative feedback.
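The abandonment metric above can be computed directly from editing funnel events. The event shape below is a simplified illustration built around the init and saveSuccess actions mentioned above, not the actual logging schema.

```python
# Sketch of the edit abandonment/completion metric: the share of
# editing sessions that reach saveSuccess out of those that fired init.
# Event dicts here are a simplified stand-in for the real schema.
def completion_rate(events):
    inits = sum(1 for e in events if e["action"] == "init")
    saves = sum(1 for e in events if e["action"] == "saveSuccess")
    return saves / inits if inits else 0.0
```

Abandonment is then simply 1 minus this rate for the cohort of interest (e.g. sessions where Tone Check was shown).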
Findings
A/B Experiment
See full report.
Conclusions
A 2026 analysis of the Tone Check A/B experiment run on the French, Japanese, and Portuguese Wikipedias showed:
Tone Check successfully decreases the frequency of non-neutral language in published content.
Tone Check successfully decreases the likelihood that new content edits are reverted. Note: Tone Check has an even stronger effect when people engage with the Check's prompt to revise the tone of what they've written.
Tone Check successfully increases the constructive edit rate.
Tone Check causes people to be more likely to return and publish a constructive edit within 2 weeks of making their first.
Tone Check does not appear to be causing any significant disruption to most people's editing experience. The feature did not cause any meaningful regressions in the guardrail metrics we were monitoring.
Overall (across platforms and experience levels), the A/B experiment demonstrated that Tone Check is effective at A) increasing the quality of new content edits newcomers and Junior Contributors publish and B) increasing the likelihood that they will return to publish a constructive edit within 2 weeks. Both of these effects were proven without negatively affecting the overall health of the edit funnel.
All of the above is causing the team to move forward with scaling Tone Check to all Wikipedias as the model's language support expands. See scenario 3 in T387918.
Findings
Overall, constructive edit rates increased by +6.2% [+4.4 percentage points] for people shown Tone Check in the test group.
Constructive edit rate
Overall: Tone Check improved the rate of constructive edits by +6.2% [+4.4 percentage points]. We observed improvements in overall edit quality at each of the three partner Wikipedias.
Platform: on desktop, the constructive edit rate increased by +6.4%, while we observed no statistically significant change in mobile web constructive edits.
Experience level: Tone Check appears especially effective at increasing the constructive edit rate of registered Junior Contributors, where we observed a +14.8% increase [+10.2 pp] in constructive edit rates.
New content edits published without biased language
Overall: Tone Check successfully decreases the frequency of non-neutral language in published content. Users with access to Tone Check were 15.6% less likely to publish edits containing non-neutral language (falling from 9.6% to 8.1%; a -1.5 pp decrease) compared to the control group.
Note: we have 99.8% confidence that this improvement is directly attributable to the tool.
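A confidence figure like this typically comes from a significance test comparing the two groups' proportions. A minimal sketch using a two-proportion z-test follows; the counts below are illustrative, not the experiment's actual sample sizes.

```python
# Two-proportion z-test sketch: is a drop like 9.6% -> 8.1% likely to
# be chance? Counts are illustrative, not the experiment's real data.
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se   # large |z| => unlikely under "no effect"
```

With 10,000 edits per arm, a 9.6% vs. 8.1% split gives z ≈ 3.7, comfortably past conventional significance cutoffs.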
Platform: trends differ. Results confirm a highly significant impact on desktop, where we observed the highest reduction in revert rate. In contrast, there was no detectable effect yet on mobile web.
Across desktop and mobile, there was a -15% decrease [-4.4 pp] in the revert rate of edits shown Tone Check in the test group compared to eligible edits not shown Tone Check in the control group.
New content revert rate
Overall: edits made by users shown Tone Check are also 15% less likely to be reverted than eligible control edits (29.5% → 25.1%; a -4.4 pp decrease).
Experience level: we see the strongest effect among Junior Contributors, where we observed a -33% relative decrease [-10.2 pp] in the rate at which the new content edits they publish are reverted. This finding was statistically significant. Among newcomers and unregistered users, we did not confirm any statistically significant changes in the rate at which the new content edits they publish are reverted.
When Tone Check successfully prompts a user to remove non-neutral language, the likelihood of that edit being reverted drops by 44.1%.
Impact of removing non-neutral language
Overall: when someone removes non-neutral language in response to a Tone Check, the likelihood of that edit being reverted decreased by 44.1%.
Platform: while we observed statistically significant decreases on both platforms, the impact was higher on desktop.
Desktop: we observed a significant -47% decrease [-13.4 pp] in revert rate for people who revised their text in response to Tone Check.
Mobile: we observed a significant -14.8% decrease [-4.8 pp] in revert rate for edits where non-neutral language was removed.
Edit completion rate
Overall: edit completion rates for people shown Tone Check decreased only slightly, by -3.2% (-1.6 percentage points).
Platform: the relatively small decrease in edit completion rate was concentrated on desktop (-2.6%), with no significant change on mobile web.
Note: the decrease in completion rate does not exceed 10% until more than 10 tone checks are presented in a single editing session. For these edits, the edit completion rate decreased to 44.3% (a -12% decrease from the control). These edits represent only 3% of edits, and potentially low-quality edits that we'd want to deter.
Retention rate
Overall: people who encountered Tone Check are 24% more likely to return again to make a constructive edit in their second week. Retention rates increased from 5.8% to 7.2% when Tone Check was shown (+1.4 percentage points).
Guardrail metrics
Edit completion rate: no significant decreases in edit completion rate.
Revert rate: no significant increases in revert rate.
Check dismissal rates: editors declined a tone check and selected "the tone is appropriate" on 16.4% of all published edits where Tone Check was shown. This excludes edits that were reverted within 48 hours. For comparison, this is higher than the rates observed for Reference Check (6.6% of editors indicated that the content they were adding did not require a reference) and lower than Paste Check (30% of editors indicated that they wrote the content).
Block rate: people are not blocked at a higher rate after being shown Tone Check.
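The results above are reported both as relative changes (e.g. -15%) and as percentage-point changes (e.g. -4.4 pp). A small helper clarifies the distinction, here fed the revert-rate figures from the findings (29.5% → 25.1%); the helper itself is just an illustration of the arithmetic.

```python
# Relative change vs. percentage-point (pp) change, as used in the
# findings above. Rates are expressed as fractions (0.295 == 29.5%).
def changes(control_rate, test_rate):
    pp = (test_rate - control_rate) * 100                   # percentage points
    rel = (test_rate - control_rate) / control_rate * 100   # relative %
    return round(pp, 1), round(rel, 1)
```

For 29.5% → 25.1% this yields roughly -4.4 pp and about a -15% relative change, matching how the revert-rate result is quoted.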
Diagram showing Tone Check A/B experiment design.
Experiment scope and design
Wikis: French, Japanese, Portuguese
Timing: 3 September 2025 – 28 January 2026
Participants: unregistered users and registered editors with ≤100 edits
Platform(s): desktop and mobile web
Primary metric: proportion of new content edits that are reverted on the grounds of WP:NPOV (and related policies)
Secondary metrics: constructive edit rate
Guardrail metrics: edit completion rate, edit revert rate, edit abandonment rate, Check dismissal behavior
Leading indicators
On 3 September 2025, an A/B experiment of Tone Check began at the French, Japanese, and Portuguese Wikipedias. What follows is an analysis of test events logged between 8 September 2025 and 22 November 2025. This analysis was meant to enable the team to decide what, if any, UX adjustments/investigations to prioritize in order to be confident moving forward with evaluating the feature's impact.
Note: the findings that follow are not statistically significant. We expect to be able to share statistically significant conclusions in January 2026 via T387918.
Findings
Activation frequency
Tone Check was shown at least once in 9% of all published new-content edits by newer editors: 9.5% of all published new content edits on desktop and 7.6% on mobile.
Edit completion rates
Overall: edits shown Tone Check were completed at a lower rate (66.7%) than eligible edits not shown Tone Check (68.3%), a 2.3% relative decrease.
On mobile web there was a 7.6% relative increase for the treatment group (69.4%) compared to the control (64.5%).
On desktop there was a 5.2% relative decrease for the treatment group (65.8%) compared to the control (69.4%).
Revert rates
Overall: there have been no significant changes in the revert rate of all new content edits, overall or by platform or Wikipedia. However, we've observed decreases in revert rate when limiting to edits where Tone Check was shown or eligible to be shown.
Platform: when we look at the revert rate of edits where Tone Check was shown at least once in an editing session compared to eligible edits in the control group, we see that:
On desktop, we observed a -5.3% decrease in the revert rate.
On mobile, we observed a -19% decrease in the revert rate.
For edits shown Tone Check and where text was revised to address the issue, we're currently seeing almost a 2x decrease in revert rate compared to eligible control edits.
Blocks
Less than 1% of users have been blocked after publishing an edit where at least one tone check was shown.
Model speed
0.6% of all published edits (264 edits) in the A/B test were saved before the model returned an evaluation. The majority of these edits occurred in the control group and on desktop.
Next steps
The Editing Team will proceed with the Tone Check A/B experiment without making adjustments to the intervention's user experience or experiment design. The above is grounded in the fact that:
Tone Check is shown within a sufficient number of new content edits.
Edits shown Tone Check are completed at a lower rate (66.7%) than eligible edits not shown Tone Check (68.3%), a 2.3% relative decrease. This slight decrease is not surprising, as we are introducing an extra step in the workflow; because it is below a 10% relative difference, we do not see signs of concern at this time.
Published new-content edits shown Tone Check (compared to edits eligible for Tone Check to be shown) are reverted less frequently: on desktop, we observed a -5.3% decrease in the revert rate; on mobile, a -19% decrease.
Configurability
edit
Tone Check will be implemented – like all Edit Checks – in a way that enables volunteers to explicitly configure how it behaves and who it is made available to.
Configurability happens on a per project basis so that volunteers can ensure the Tone Check experience is aligned with local policies and conventions.
The particular facets of Tone Check that will be community configurable are still being decided. If there are aspects of Tone Check that you think need to be configurable on-wiki, please share your thinking in T393820 or on the talk page.
Timeline
edit
Year
Month
Activity
Notes
2026
April
Tone Check deployed as default-on feature at French, Japanese, and Portuguese Wikipedias
February
Tone Check A/B experiment concluded and
results published
2025
June
Tone Check Model card published
Published summary of recent en.wiki conversations (on-/off-wiki)
Local (en.wiki) Tone Check project page published
Call held with en.wiki volunteers on Discord
Discussion about Tone Check emerges at en.wiki
May
Tone Check presented during the ESEAP Summit
Invitations published on volunteer talk pages seeking help with model review
MassMessage sent inviting volunteers to review Tone Check Model
Published Mediawiki page
inviting volunteers to sign up to review the Tone Check model.
Tone Check presented during
Afrika Baraza Annual Planning Call
Announcement about volunteer-led
model review published
April
Tone Check community conversation held
Invitation to Tone Check-focused community conversation published
Tone Check (then called "Peacock Check") proof of concept presented during the "CEE Catch up Annual Planning Workshop."
Tone Check (then "Peacock Check") community conversation invitations published
March
Tone Check project page published
Work on Tone Check announced on mediawiki.org
2024
November
Work on Paste Check announced
August
WMF CPTO (Selena Deckelmann) shares Reference Check demo at Wikimania
June
Link Check deployed to all wikis
March
Reference Reliability Check deployed to all wikis
2023
October
Reference Check deployed to first Wikipedias
February
Editing Team publishes summary of
early community conversations
2021
August
Idea of Edit Check presented at Wikimania
History
edit
Tone Check, and the broader Edit Check initiative, is a response to a range of community conversations and initiatives, some of which are listed below.
For more historical context, please see
Edit check
Editing Team Community Conversation (April 2025)
New page patrol/Reviewers (en.wiki)
(April 2025)
ESEAP Strategy Summit 2025
Wikimedia CEE
annual planning conversation (April 2025)
Afrika Baraza meeting
(May 2025)
Supporting moderators at the Wikimedia Foundation
(August 2023)
Editing the Wiki Way software and the future of editing
(August 2021)
Existing maintenance templates
ar.wiki: تحيز, تعارض مصالح, تعظيم, دعاية, رأي منحاز, عبارة محايدة؟, غير متوازن, مصدر منحاز, وجهة نظر معجب, أهمية مبالغ بها, استشهاد منشور ذاتي, تلاعب بالألفاظ, حيادية خريطة, سيرة شخصية ذاتية, سيرة شخصية نشر ذاتي فقط, مبهمة, مساهمة مدفوعة غير مصرح عنها, مصادر متحزبة, نشر ذاتي سطري, نظرية هامشية, وجهات نظر قليلة
cs.wiki:
de.wiki:
en.wiki:
es.wiki:
fa.wiki:
fr.wiki: Non-neutre, Désaccord de neutralité, Section non neutre, Dithyrambe, Curriculum vitae, Catalogue de vente, Promotionnel, section promotionnelle, Name dropping, Passage promotionnel, Passage lyrique, Passage non neutre
id.wiki: Tak netral, Berbunga-bunga, Iklan, Seperti resume, Fanpov, Peacock, Autobiografi, Konflik kepentingan
it.wiki:
ja.wiki:
Template:観点
Template:宣伝
Template:大言壮語
lv.wiki:
https://lv.wikipedia.org/wiki/Veidne:Pov, https://lv.wikipedia.org/wiki/Veidne:Konfl, https://lv.wikipedia.org/wiki/Veidne:Autobiogr%C4%81fija
no.wiki:
pl.wiki:
{{Dopracować{{!}}param_name=...}}
(Template Dopracować is a general template for issues, made specific via parameters; relevant parameters:
pov
neutralność
reklama
spam
polonocentryzm
povpol
zależne
wieszak
źródła promocyjne
źródła zależne
– case insensitive)
ro.wiki:
also the template
with the parameters ton, ton nepotrivit, or PDVN
ru.wiki:
https://ru.wikipedia.org/wiki/Шаблон:Проверить_нейтральность
(inline one)
zh.wiki:
Advert
Fanpov
Newsrelease
Review
Tone
Unencyclopedic
Trivia
Autobiography
COI
BLPdispute
POV
Copy edit
Edit Check
edit
This initiative sits within the larger
Edit Check project
– an effort to meet people
while they are editing
with actionable feedback about Wikipedia policies.
Edit Check is intended to
simultaneously
deliver impact for two key groups of people.
Experienced volunteers
who need:
Relief from repairing preventable damage
Capacity to confront complexity
New(er) volunteers
who need:
Actionable feedback
Compelling opportunities to contribute
Clarity about what is expected of them
FAQ
edit
Why does Tone Check use machine learning?
edit
AI increases Wikipedia projects' ability to detect promotional/non-neutral language
before
people publish it.
Which machine learning model does Tone Check use?
edit
We use an open-source model called
BERT
The model we use is not a
large language model
(LLM).
It is a smaller language model, which the Machine Learning team prefers because it reports how probable each of its predictions is and is easier to adapt to our custom data.
What language(s) does/will Tone Check support?
edit
As of
August 2025
, Tone Check supports the following languages:
English, Spanish, French, Japanese, Portuguese
Please see
T388471#10781906
for details about how these languages were prioritized to start.
The goal remains for Tone Check to support all languages. New languages are regularly added, and
evaluated by community members
I tried the feature and Tone Check did not appear, why?
edit
If tone checking does
not
appear when you think it should, it is probably because the model is not sufficiently confident that the text you added has a tone issue.
Here, "sufficiently confident" refers to the probability threshold Tone Check is set to.
This probability threshold can be
configured on a per wiki basis
What does "a probability threshold of 0.8" mean for the model?
edit
The
model
may make a lot of different predictions, but it is configured to act on the ones we're fairly confident about.
Put another way, showing Tone Check where it should be shown is more important than showing it in all possible places where it could be shown.
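The threshold gating described above can be sketched in a few lines. This is purely illustrative; the function and field names are hypothetical, not the actual Tone Check API:

```python
THRESHOLD = 0.8  # per-wiki configurable probability threshold

def should_show_tone_check(predictions: list[dict]) -> bool:
    """Show the check only if some prediction meets the confidence threshold."""
    return any(p["probability"] >= THRESHOLD for p in predictions)

# The model may flag several spans with varying confidence; only a
# sufficiently confident prediction triggers the check.
preds = [{"span": "world-renowned", "probability": 0.93},
         {"span": "notable", "probability": 0.41}]
print(should_show_tone_check(preds))  # True
```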
What logging mechanisms, if any, will be implemented on the wiki to allow volunteers to see when Tone Check has been shown?
edit
To start, Tone Check will introduce two new
hidden
edit tags
editcheck-tone-shown
– This tag will be added to all edits in which ≥1 Tone Check is shown.
editcheck-tone
– This tag will be added to any edit made using the visual editor by someone who has published ≤100 cumulative edits that the
Tone Check model
is 80% confident contains a tone issue.
This approach follows what was
implemented for Reference Check
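The two tags encode different conditions, which a short sketch can make concrete. The function and parameter names here are illustrative, not the extension's code; only the tag names and thresholds come from the description above:

```python
def edit_tags(used_visual_editor: bool, user_edit_count: int,
              model_confidence: float, checks_shown: int) -> list[str]:
    """Apply the two hidden edit tags per the rules described above (illustrative)."""
    tags = []
    # editcheck-tone-shown: added whenever >= 1 Tone Check was shown
    if checks_shown >= 1:
        tags.append("editcheck-tone-shown")
    # editcheck-tone: visual editor, newer user (<= 100 edits),
    # and the model is >= 80% confident there is a tone issue
    if used_visual_editor and user_edit_count <= 100 and model_confidence >= 0.8:
        tags.append("editcheck-tone")
    return tags

print(edit_tags(used_visual_editor=True, user_edit_count=12,
                model_confidence=0.85, checks_shown=1))
# ['editcheck-tone-shown', 'editcheck-tone']
```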
Is it possible to see what the user changed after Tone Check was displayed?
edit
This would require storing both the original text and the modified text.
We do not plan to work on this at the moment, as it has technical implications (a new database table) as well as legal implications (currently, only the published version is taken into account).
Why not implement Tone Check as an Abuse filter?
edit
A key idea of Tone Check, and the broader
Edit check
system, is that newcomers are most likely to act on feedback when it is
Surfaced while newcomers are editing
Shown in relation to the specific content the feedback is about
As currently implemented, AbuseFilter is not able to surface feedback in ways that align with these two design principles.
What will we do to ensure Tone Check does not cause people to publish more subtle forms of promotional, derogatory, or otherwise subjective language that is more difficult for the model and people to detect?
edit
Logging and Oversight:
any time Tone Check is shown within an edit session that results in a published edit, a
hidden
edit tag (
editcheck-tone-shown
) will be added. Patrollers/reviewers can in turn use this edit tag to identify edits that might warrant deeper scrutiny and subsequently, decide what – if any – moderation they ought to take in response. Further, by being able to isolate edits in which Tone Check was shown from edits when it was not, together – staff and volunteers – can assess the holistic impact of the feature. This includes the potential for it to proliferate forms of promotional, derogatory, or otherwise subjective language that is more difficult for the model and people to detect.
Time-bound controlled experiment:
Tone Check is being developed with the understanding that it is not yet clear whether the intervention will be effective at delivering the
impact it is intended to cause
. To evaluate this, the team will be
running a controlled experiment
, with a definitive start and end date. The data this experiment gathers will enable staff and volunteers to evaluate what to do next.
Together, we think these two points will equip staff and volunteers with the qualitative and quantitative data needed to assess the holistic impact of the feature.
Further, we think the time-bound and tightly-scoped nature of the experiment will enable us to collectively learn in a way that minimizes irreparable harm to the wikis.
What control will volunteers have over how Tone Check behaves and who it is available to?
edit
Volunteers, on a per-project basis, can configure the following aspects of Tone Check:
The account states (logged in, logged out) that have the potential to see Tone Check while editing
The
number of edits
someone needs to have published for Tone Check to be shown. By default, Tone Check will only be shown to people who have published ≤100 cumulative edits.
The
names of sections
that Tone Check will not be activated within.
See
Edit check/Configuration
for more details.
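The three configurable aspects above could be pictured as a per-wiki configuration object. The sketch below is purely hypothetical (the key names and structure are invented for illustration; see Edit check/Configuration for the actual schema):

```python
# Hypothetical per-wiki configuration; key names are invented for illustration.
tone_check_config = {
    "accountStates": ["logged-in", "logged-out"],  # who can see the check
    "maxEditCount": 100,  # shown only to people with <= 100 cumulative edits
    "excludedSections": ["References", "External links"],  # example values
}

def is_eligible(logged_in: bool, edit_count: int, section: str) -> bool:
    """Decide whether Tone Check may activate, per the sketch config above."""
    state = "logged-in" if logged_in else "logged-out"
    return (state in tone_check_config["accountStates"]
            and edit_count <= tone_check_config["maxEditCount"]
            and section not in tone_check_config["excludedSections"])

print(is_eligible(logged_in=True, edit_count=5, section="Career"))  # True
```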
Please start a discussion on
Talk: Edit check/Configuration
if you think there are additional aspects of Tone Check that could benefit from being configured on-wiki.
What control do volunteers have over how the model behaves?
edit
In addition to being able to
configure how Tone Check behaves
, the model underlying Tone Check will be retrained on an ongoing basis.
This way, Tone Check remains in sync with the ways volunteers evolve Wikipedia policies to adapt to shifts in how people edit.
References
edit
peacock check
prompts people who are adding text to a Wikipedia article that other people are likely to perceive as non-neutral/promotional/etc.
newer volunteers
refers to people who have published ≤100 cumulative edits.
Emphasis on "help", since all decisions will depend on a variety of data, all of which need to be weighed and considered to make informed decisions.
en:User talk: Chipmunkdavid
en:User talk:NightWolf1223
en:User talk: Parksfan1955
en:User talk: The Grid
en:User talk: Bunnypranav
en:User talk: Xandru4
en:User talk: Meritkosy
en:User talk: Fuzheado
ja:User talk: Wadakuramon
ja:User talk: Saebo
ja:User talk: VZP10224
ja:User talk: Hexirp
ja:User talk: Afaz
Tech/News/2025/17
en:Wikipedia talk:Manual of Style/Words to watch
Discussion Projet:Aide et accueil
Wikipedia talk:Growth Team features
Wikipedia talk:WikiProject Editor Retention