Wikipedia:Wikipedia Signpost/2024-09-04/

Wikipedia:Wikipedia Signpost
2024-09-04
WikiCup enters final round, MCDC wraps up activities, 17-year-old hoax article unmasked: JCW compilation now tracks free DOIs, Wiki Loves Monuments getting started, WMF's status as UN observer stymied by China for fourth time.
4 September 2024
File:Giza Pyramids during "Forever is Now" exhibition.jpg (Mona Hassan Abo-Abda, CC BY-SA 4.0)
News and notes
WikiCup enters final round, MCDC wraps up activities, 17-year-old hoax article unmasked
By Bri, Ciell, Headbomb, Andreas Kolbe, Oltrepier, and HaeB
The WikiCup gears up for its final round
The 2024 WikiCup, hosted by users Cwmhiraeth, Epicgenius and Frostly, is entering its final phase, after Round 4 ended on 29 August. A total of 135 users, including the late Vami IV, joined the contest at the start of this year; however, just eight of them have made it to the ultimate showdown. Here are the finalists, ranked from first to last as per their scores in the latest round:
Generalissima (1150 points);
Arconning (791 points);
AirshipJungleman29 (718 points);
BennyOnTheLoose (714 points);
Hey man im josh (601 points);
BeanieFan11 (579 points);
AryKun (499 points);
Sammi Brie (472 points).
Since its creation back in 2007, the WikiCup has strived to "encourage content creation and improvement and make editing on Wikipedia more fun", and this year's edition is no exception: according to the official data, competitors have so far contributed to 44 featured articles, 72 featured lists, 385 good articles, 94 In the News credits, and over 300 Did You Know credits; thanks to their efforts, 38 articles were also added to featured topics and good topics.
On behalf of The Signpost, we would like to thank the judges and every participant in the 2024 WikiCup, and wish good luck to the eight finalists.
Journals cited by Wikipedia compilation now tracks free DOIs
Tired of running into paywalls as you try to find new information? Look for the green free-access lock next to DOIs and other identifiers in citations!
As of 18 August, the Journals cited by Wikipedia (JCW) compilation (see previous Signpost coverage) now tracks the number of distinct DOIs present on Wikipedia, and how many are flagged with |doi-access=free. Several of these are automatically tracked and tagged as free to read by templates and bots (see previous Signpost coverage). As of the 1 August dump, the compilation kept track of 3.70M citations, of which 2.41M had DOIs. Of the citations that had DOIs, 661,103 were identified as free to read, or about 27.44%.
The 17–18 August 2024 update of the CS1/CS2 modules further identified the Leibniz International Proceedings in Informatics (doi prefix 10.4230) and the Living Reviews journal series (doi prefix 10.12942) as free-to-read registrants, as well as 11 individual journals that can be identified by the starting pattern of DOIs (like 10.1046/j.1365-8711..., 10.1093/mnras.., and 10.1111/j.1365-2966... for the Monthly Notices of the Royal Astronomical Society). Citation bot will automatically flag those with |doi-access=free when it runs on the article (see our guide on how to use Citation bot yourself).
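Identifying journals "by the starting pattern of DOIs" amounts to a prefix test. The following is a minimal Python sketch of that idea, not the actual CS1/CS2 module or Citation bot logic. The name FREE_DOI_PREFIXES and the function looks_free_to_read are hypothetical, the list holds only the prefixes quoted in this story (cut off where the story cuts them off), and the sample DOIs are placeholders rather than real articles.

```python
# Hypothetical list for illustration: only the prefixes quoted in the story.
# The real CS1/CS2 modules maintain their own, longer list.
FREE_DOI_PREFIXES = (
    "10.4230/",             # Leibniz International Proceedings in Informatics (registrant)
    "10.12942/",            # Living Reviews journal series (registrant)
    "10.1046/j.1365-8711",  # MNRAS starting patterns, as quoted in the story
    "10.1093/mnras",
    "10.1111/j.1365-2966",
)


def looks_free_to_read(doi: str) -> bool:
    """Return True if the DOI starts with one of the listed patterns."""
    normalized = doi.strip().lower()  # DOIs are case-insensitive
    # str.startswith accepts a tuple and checks each prefix in turn
    return normalized.startswith(FREE_DOI_PREFIXES)


# Placeholder DOIs, made up for this example
print(looks_free_to_read("10.4230/placeholder"))
print(looks_free_to_read("10.1000/placeholder"))
```

The trailing slash on the two registrant prefixes keeps a different registrant such as 10.42301 from matching by accident.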
If you notice a DOI link that takes you to a free-to-read article that wasn't flagged by the bot, you can flag the citation manually with |doi-access=free. You can also try to use WP:OABOT (see our guide on how to use OAbot yourself). If you are aware of fully free-to-read journals/publishers that aren't already kept track of by the CS1/CS2 templates (see CS1/2 FAQ), leave a note at Help talk:CS1 and User talk:Citation bot.
Following the 20 August dump, the compilation kept track of 3.72M citations, of which 2.42M had DOIs. Of the citations that had DOIs, 663,976 were identified as free to read, or about 27.46% (up from 27.44%). It took a few days for the server cache to clear and tracking categories to be populated. I estimate that the 'true' count should have been about 666K, mostly due to MNRAS and MNRAS Letters being identified as free to read.
Related to the JCW update, all CS1/2 templates (like {{cite journal}} and {{citation}}), and the standalone templates {{doi}} and {{doi-inline}}, now support the flagging of free-to-read DOIs with |doi-access=free. The standalone versions, however, are not currently supported by any bot, nor do they have tracking categories.
Thanks to Trappist the monk for their efforts on templates and the identification of free-to-read publishers/journals (I was also involved), as well as the maintainers of Citation bot, JL-Bot, and OAbot (particularly AManWithNoPlan, JLaTondre and Nemo bis) for facilitating the mass-tagging of free-to-read articles.
Update: Following the 1 September dump, most of the caching issues were resolved, and we have a count of 3.73M citations, of which 2.42M had DOIs (an increase of 15,261 since 1 August). Of the citations that had DOIs, 668,036 were identified as free to read, or about 27.56%. This is an increase of 6,933 free DOIs (both new and newly identified) since 1 August, representing 0.11% of all DOI citations.
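For readers who want to check the arithmetic, here is a small Python sketch using the counts quoted above. It is only an approximation: the story gives the DOI totals rounded to 2.41M and 2.42M, so the shares computed here can differ in the second decimal from the reported 27.44% and 27.56%. The helper name share_percent is made up for this example.

```python
def share_percent(part: int, total: float) -> float:
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


# Free-to-read DOI counts quoted in the story
free_1_aug = 661_103
free_1_sep = 668_036

# Rounded DOI totals from the story (assumption: exact totals are not given)
dois_1_aug = 2.41e6
dois_1_sep = 2.42e6

increase = free_1_sep - free_1_aug
approx_share_aug = share_percent(free_1_aug, dois_1_aug)
approx_share_sep = share_percent(free_1_sep, dois_1_sep)

print(f"Increase in free DOIs: {increase:,}")
print(f"Approximate share, 1 August: {approx_share_aug:.1f}%")
print(f"Approximate share, 1 September: {approx_share_sep:.1f}%")
```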
AI policy positions of the Wikimedia Foundation
In a blog post, the Wikimedia Foundation provides an overview of several statements it has submitted since last year in response to
[...] governments and international organizations [...] seeking stakeholder feedback about how [AI] policies should be formulated in order to best serve the public interest. [...] The Foundation’s comments have fallen into two categories. Some are directly relevant to the work being done by volunteer Wikipedia editors around the world, such as on copyright and openness of foundational AI models. Others applied our values and the valuable lessons we have learned from our AI/ML work to benefit public interest projects focused on free knowledge and the online information ecosystem—i.e., decentralized community-led decision-making, privacy, stakeholder inclusion, and internet commons.
"AI for the people: How machines can help humans improve Wikipedia" (Wikimedia Foundation)
For example, in a response to the US Copyright Office's Request for Comments on AI and Copyright, the Foundation states that it "generally supports uses of Wikipedia content for purposes including AI model development", but (as summarized in the blog post) argues that
At a minimum, AI developers who include Wikipedia in the training data used to create large language models (LLMs) should publicly acknowledge that use and give credit to Wikipedia and the volunteer editors who made this rich source of raw materials for LLMs.
At the same time, the Foundation's statement indicates that this attribution might not always be legally required, depending on whether courts decide that the unauthorized use of copyrighted content in training of such AI models is covered by fair use (in which case the attribution requirements of Wikipedia's CC BY-SA 4.0 license would be moot). The Foundation refrains from taking a categorical position on this legal question: "Based on our analysis, we do not believe that training AI models should either be categorically fair use or categorically not fair use. Rather, the particulars of the training process and the way courts view the purposes of a use should inform whether a particular training process is fair or not." The analysis does however offer some detailed if speculative observations on how courts might evaluate the four fair use factors in this context. For example, it is argued that because "the vastness of the datasets used in training mean that any single copy [of a copyrighted work] is barely a drop in the ocean of the whole", judges may want to focus on "the extent to which a work is weighted in the development of a model": "Hypothetically, if a copyright protected work was manually weighted to have an outsized impact in model development, then one could argue that although the uses of other full works may be fair, the amplification of one particular work in the training set is not." (Various LLMs are known to have weighted Wikipedia more highly than other parts of their training dataset, for example GPT-3.)
On the other hand, the Wikimedia Foundation's statement also urged the Copyright Office to take not only the perspective of copyright owners into account, but also that of the users of copyrighted works and of AI-based tools – noting that "The Foundation is somewhat uniquely positioned as both the host of a primary source of training material for generative AI and a user of many AI and ML tools that aid human editors with the creation of free knowledge." In particular, it cautions that the public interest should be kept in mind in possible future changes to copyright laws and AI regulations, e.g.
On the use of data, specifically, we encourage regulators and legislators to align their approaches with existing models, such as the European Union’s inclusion of an exemption for text and data mining in the Directive on Copyright in the Digital Single Market, that enable public interest research and other beneficial uses of protected works.
[...] we encourage the Office to consider the potential impacts that changes to copyright law could have on competition among AI developers. If copyright law changes are enacted such that the acquisition and use of training materials becomes more expensive or difficult, there is a risk that dominant firms with greater resources will become further entrenched while smaller companies, including nonprofit organizations, struggle to keep up with mounting development costs.
The Farewell of the MCDC
Chosen by communities, selected by affiliates, and appointed by the WMF, the Movement Charter Drafting Committee (MCDC), a committee of 15 Wikimedians, first took on the job of drafting a Charter for the Wikimedia movement in November 2021.
There were multiple feedback rounds, a lot of conversations, more discussions and a final ratification vote where the community and affiliate support was overwhelming (albeit with a low turnout in both cases due to the voter eligibility criteria), but the WMF's Board of Trustees decided the draft was not good enough (not safe to try). As reported in the previous issue of The Signpost, the Foundation published three pilot projects to take the work forward.
In August 2024, the committee (which still included 11 people) shared their process and ratification reflections pre-Wikimania. Before dissolving on 30 August, they also published their recommendations for next steps, including a response to the three pilots proposed by the WMF.
Ciell, former MCDC member
Brief notes
WLM 2023 winner from Egypt, Giza Pyramids during "Forever is Now" exhibition by Mona Hassan Abo-Abda.
China blocks Wikimedia Foundation's WIPO accreditation for the fourth time: In what has become an annual tradition, an application by the Foundation to be accredited as a permanent observer at the World Intellectual Property Organization (WIPO) failed because of opposition by the Chinese government, despite support from several other countries. As explained by the Foundation's Global Advocacy team, the WIPO is "the specialized United Nations (UN) agency that determines global policies on copyright, patents, and trademarks. Observer status would enable the Foundation to participate in and contribute to WIPO committees where intellectual property norms are set." The Foundation first applied in 2020, when Beijing's delegate objected to "a large amount of content and disinformation in violation of [the] ‘One China’ principle" (see Signpost coverage: "Beijing blocks WMF from World Intellectual Property Organization, citing Wikimedia Taiwan"). China has since also blocked several Wikimedia chapters from gaining permanent observer status.
Wikimedia ESEAP Conference: Wikimedia Community User Group Malaysia has uploaded a video with highlights of the May 2024 Wikimedia ESEAP Conference to YouTube.
Articles for Improvement: This week's Article for Improvement is Cancel culture. Please be bold in helping improve this article! Next week's Article for Improvement (beginning 9 September 2024) is Caroline Islands.
New record low count for active administrators: A new low point of 427 active administrators was reached on 26 August, marking a new decline after the slight increase in July. Meanwhile, the first RfA since June was initiated just a few days before we went to press with this issue.
The curtain rises on a new edition of WLM: The fifteenth edition of Wiki Loves Monuments kicked off on 1 September. For this year's photo contest, which will focus on built heritage, 53 countries have enrolled. The respective national campaigns will be hosted until 31 October, and then each nation will send their Top 10 to the international jury for the Grand Finale in December.
Article on non-existent species removed: "Pratylenchus dulscus," a ten-word article without references, was deleted on 22 August, after existing in the encyclopedia for over 17 years. Though not strictly speaking a hoax – it appears to have been created in good faith, based on a species name in a Wikipedia list article that had been altered by an IP – it is currently listed among the longest-lived hoaxes on Wikipedia.
The Campaigns teams would like to learn about how you collaborate online: Editors can take this Google Form survey or share examples of successful collaborations on Meta Wiki.
Discuss this story
"Long-lived hoax article removed: "Pratylenchus dulscus,"". Sigh. I've been saying this for years, but before we call something a hoax, we need to do due diligence. Where is a discussion confirming this is a hoax, i.e. intentional misinformation, and not just some typo or good-faithed mistake? I've been (slowly) providing some analysis for the 'false statements in articles' section of the
Wikipedia:List of hoaxes on Wikipedia page; I haven't gotten to 'hoax articles' yet, but some are not hoaxes, just errors. Please read the definition of what a hoax is, folks, and don't assume that an error is a hoax. That "Pratylenchus dulscus" may be a hoax, or it may be some sort of a typo. We can't assume bad faith (intentional fabrication) per WP:AGF. -- Piotr Konieczny aka Prokonsul Piotrus | reply here 14:11, 4 September 2024 (UTC)
I wasn't involved in the write-up of this story, but [1] contains further information, to wit:
Editor Somanypeople created List of almond diseases in early 2007. On March 16 2007, the list of almond diseases was vandalized by an IP, replacing P. vulnus with P. dulscus (P. vulnus is supported by sources in the list). In the ensuing months, Somanypeople went on to create articles for species listed in the list of almond diseases, including Pratylenchus dulscus (apparently not being aware of the vandal's edit). An article was created for Pratylenchus in August 2007, which included P. dulscus in the list of species, presumably because Wikipedia had an article for it at that point. I'm going to remove the link to this article from the genus article and will restore a link to P. vulnus in the almond disease list.
So it appears indeed that the article was not created as a hoax. Thanks for mentioning it. Andreas JN466 15:14, 4 September 2024 (UTC)
I wrote the comment that is being quoted. I wouldn't call Pratylenchus dulscus a hoax myself. It was a sloppy good-faith creation that was ultimately rooted in vandalism. Meloidogyne gajuscus and Meloidogyne fruglia were two other 17 year "hoaxes" I found a few days later that were created in identical circumstances; Somanypeople created a list of plant diseases, it was vandalized, and Somanypeople then created articles for non-existent species based on the vandalism. Plantdrew (talk) 16:10, 4 September 2024 (UTC)
Seems like an odd take to call it a hoax when it's merely a byproduct of vandalism. OhanaUnited (Talk page) 21:44, 4 September 2024 (UTC)
OhanaUnited, Plantdrew, Jayen466: A common type of error (i.e. it is sadly common to call a regular error or vandalism a hoax, even if there is no proven intent to mislead). Is there any chance of correcting it in The Signpost, at least? I'll also @TenPoundHammer who added it to the list of hoaxes. We could really use more folks reviewing entries there and separating confirmed hoaxes from plausible or unlikely. Here, there is no reason to assume anon was vandalizing - it could be a typo, accidental page save, or some good faithed if wrong error fixing ("I think it sounds better in Latin this way", whatever). Piotr Konieczny aka Prokonsul Piotrus | reply here 06:53, 5 September 2024 (UTC)
I've rewritten the entry per the discussion above. Look okay? Andreas JN466 07:39, 5 September 2024 (UTC)
Jayen466: Maybe adjust the article's title as well. It still says hoax at the start of News and Notes. Soni (talk) 07:58, 5 September 2024 (UTC)
Above my pay grade as changing the title would affect other pages as well. JPxG: Could you take a look? Andreas JN466 08:17, 5 September 2024 (UTC)
Even if the hoax would be correct (and we pretty much agree here it wasn't), I found it weird from the beginning that this made it into the heading despite being just a small note at the bottom. A bit too clickbaitish for regular TS style anyway... WP:TROUT should be applied somewhere, perhaps :) Piotr Konieczny aka Prokonsul Piotrus | reply here 09:19, 5 September 2024 (UTC)
It doesn't even need vandalism to get something false into WP. A while ago I sent an article called Snake Bight, Florida, which was described as a ghost town, to AfD. "Snake Bight" is a bight on the coast of Florida Bay in Everglades National Park. A prehistoric canal called the "Snake Bight Canal" runs inland from Snake Bight. The National Park Service maintains a hiking trail along the canal called the "Snake Bight Canal Trail". The ruins of a former fish factory can be seen from the trail. That all led to the ruins being labeled the "ghost town of Snake Bight" on a web site listing ghost towns, even though there was never a populated place called "Snake Bight". A source existed, so of course there had to be a WP article. Donald Albury 13:58, 6 September 2024 (UTC)