Wikidata:Bot requests - Wikidata
Bot requests
If you have a bot request, add a new section using the button and describe exactly what you want. To reduce the process time, first discuss the legitimacy of your request with the community in the Project chat or on a WikiProject's talk page. Please refer to previous discussions justifying the task in your request.
For bot flag requests, see Wikidata:Requests for permissions.
Tools available to all users which can be used to accomplish the work without the need for a bot:
- PetScan: for creating items from Wikimedia pages and/or adding the same statements to items (note: PetScan edits are made through QuickStatements)
- QuickStatements: for creating items and/or adding different statements to items
- Harvest Templates: for importing statements from Wikimedia projects
- OpenRefine: to import any type of data from tabular sources
- WikibaseJS-cli: to write shell scripts to create and edit items in batch
- Programming libraries: to write scripts or bots that create and edit items in batch
You can also find unsolved requests in the archives of this page. Please restore them if you implement them.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2026/04. SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 3 days.
Request to add subscriber counts to subreddits (2025-03-11)
Request date: 12 March 2025, by Prototyperspective
Link to discussions justifying the request
Task description
Many items have one or multiple subreddits about the item's subject set; these are often the largest online community, discussion place, or aggregated content location relating to the subject.
I think subreddit subscriber counts are useful for many purposes, such as roughly estimating how popular things are and enabling sorting, since few other popularity-related counts are integrated into Wikidata. Imagine a list of music genres, for example: a sortable column roughly showing how popular each genre is (among people online today) would be useful. One doesn't have to sort by it, it doesn't have to be perfect, and there could be more columns like it. The counts can also be used to analyze the rise or slowdown/decline of subreddits (or to see such trends at a glance on the Wikidata item), etc.
However, many items do not have the subscriber count set, or only have a very old one. This is different for X/Twitter, where most items have the count set and it seems to get updated frequently by some bot. Here is a list of items with subreddits sorted by the set subscriber count: Wikidata:List of largest subreddits. It shows that even among the largest subreddits, only few have a subscriber count set.
Please set the subscriber counts for all uses of subreddit (P3984), and add a new value (with preferred rank) for the items that already have an old one set. As qualifiers it needs point in time (P585) and subreddit (P3984). It would be best to run this bot task regularly, for example twice per year. -- Prototyperspective (talk) 23:38, 12 March 2025 (UTC)
Licence of data to import (if relevant)
Discussion
Forgot to say it would be set on social media followers (P8687) of the item, with a qualifier specifying the subreddit and a qualifier for the point in time. -- Prototyperspective (talk) 14:43, 30 March 2025 (UTC)
Could this be used for this? @Maxlath: Prototyperspective (talk) 13:27, 30 May 2025 (UTC)
Note that this gets more useful the more subreddits are added – the subreddit catalog can be used for that (1/3 done but needs an update). Prototyperspective (talk) 22:17, 15 November 2025 (UTC)
More and more subreddits from the catalog are getting integrated into WD. For some purposes/applications, I guess one could query some external site for current and historic subscriber numbers for a given topic's subreddits (or the subreddits of topics of a certain field, etc.). However, such external data can't be used in Wikidata SPARQL queries, as far as I know.
I asked an LLM the following: "how to get the number of reddit subscribers and add them to wikidata items via a bot, can you code the wikidata bot" and it returned the following:
"Use Reddit's API to fetch subscriber counts and pywikibot to add a statement on Wikidata. Recommended property: social media followers (P8687) qualified with subreddit (P3984) and point in time (P585). (Alternative: use number of subscribers (P3744) as a main property for organizations.)
Below is a minimal, runnable Python example. You must:
1. Create a Reddit app and get client_id, client_secret, user_agent.
2. Set up a Wikidata bot account and pywikibot user-config (login with the bot account).
3. Respect Reddit and Wikidata rate limits and provide explainable sources for changes."
Code (Python 3):
# requirements: praw, pywikibot, python-dateutil
# pip install praw pywikibot python-dateutil
import praw
import pywikibot
from pywikibot import pagegenerators
from datetime import datetime
from dateutil import tz

# CONFIG - fill these in
REDDIT_CLIENT_ID = 'your_reddit_client_id'
REDDIT_CLIENT_SECRET = 'your_reddit_client_secret'
REDDIT_USER_AGENT = 'wikidata-subscriber-bot/0.1 (by u/yourusername)'

# Mapping of Wikidata Q-items to subreddit names (no leading r/)
MAPPING = {
    'QXXXX': 'python',  # example: Wikidata item QXXXX -> subreddit r/python
    'QYYYY': 'learnprogramming',
}

# Initialize Reddit
reddit = praw.Reddit(
    client_id=REDDIT_CLIENT_ID,
    client_secret=REDDIT_CLIENT_SECRET,
    user_agent=REDDIT_USER_AGENT,
)

# Initialize Wikidata site
site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()


def make_quantity(value):
    return pywikibot.WbQuantity(amount=str(value), unit=None, site=site)


def add_subscriber_statement(wikidata_qid, subreddit, subscribers):
    item = pywikibot.ItemPage(repo, wikidata_qid)
    item.get()

    # Property IDs
    P_SOCIAL_FOLLOWERS = 'P8687'  # social media followers
    P_SUBREDDIT = 'P3984'         # subreddit (external id)
    P_POINT_IN_TIME = 'P585'      # point in time

    # Create main claim (quantity) and save it to the item first;
    # qualifiers and references can only be attached to a saved claim
    claim = pywikibot.Claim(repo, P_SOCIAL_FOLLOWERS)
    claim.setTarget(make_quantity(subscribers))
    item.addClaim(claim, summary='Add %s followers for r/%s (bot)' % (subscribers, subreddit))

    # Qualifier: subreddit (string external identifier)
    qual_sub = pywikibot.Claim(repo, P_SUBREDDIT)
    qual_sub.setTarget(subreddit)
    claim.addQualifier(qual_sub)

    # Qualifier: point in time (timestamp)
    now = datetime.now(tz.tzutc())
    qual_time = pywikibot.Claim(repo, P_POINT_IN_TIME)
    qual_time.setTarget(pywikibot.WbTime(year=now.year, month=now.month,
                                         day=now.day, precision=11, timezone=0))
    claim.addQualifier(qual_time)

    # Optional: add a reference pointing to the subreddit URL
    ref = pywikibot.Claim(repo, 'P854')  # reference URL
    ref.setTarget('https://www.reddit.com/r/' + subreddit + '/')
    claim.addSource(ref)


def get_subscriber_count(subreddit):
    try:
        sr = reddit.subreddit(subreddit)
        return sr.subscribers
    except Exception as e:
        print('Error fetching r/%s: %s' % (subreddit, e))
        return None


def main():
    for qid, subreddit in MAPPING.items():
        subs = get_subscriber_count(subreddit)
        if subs is None:
            continue
        print('%s -> r/%s: %s' % (qid, subreddit, subs))
        add_subscriber_statement(qid, subreddit, subs)


if __name__ == '__main__':
    main()
-- Prototyperspective (talk) 00:11, 3 April 2026 (UTC)
I wonder if this could cause issues with Reddit blocking the IP or something because of too many requests.
I let the AI improve the script above with the following prompt: "how to add the subscriber counts to wikidata with the below python script but fetch the items that have the subreddit property (P3984) set (and their subreddit title/link) from wikidata first (and use that)? Here is the existing python script: [newline and the python code above]"
# requirements: praw, pywikibot, python-dateutil
# pip install praw pywikibot python-dateutil
import praw
import pywikibot
from pywikibot import pagegenerators
from datetime import datetime
from dateutil import tz

# CONFIG - fill these in
REDDIT_CLIENT_ID = 'your_reddit_client_id'
REDDIT_CLIENT_SECRET = 'your_reddit_client_secret'
REDDIT_USER_AGENT = 'wikidata-subscriber-bot/0.1 (by u/yourusername)'

# Initialize Reddit
reddit = praw.Reddit(
    client_id=REDDIT_CLIENT_ID,
    client_secret=REDDIT_CLIENT_SECRET,
    user_agent=REDDIT_USER_AGENT,
)

# Initialize Wikidata site
site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()


def make_quantity(value):
    return pywikibot.WbQuantity(amount=str(value), unit=None, site=site)


def add_subscriber_statement(wikidata_qid, subreddit, subscribers):
    item = pywikibot.ItemPage(repo, wikidata_qid)
    try:
        item.get()
    except Exception as e:
        print('Failed to load %s: %s' % (wikidata_qid, e))
        return

    # Property IDs
    P_SOCIAL_FOLLOWERS = 'P8687'  # social media followers
    P_SUBREDDIT = 'P3984'         # subreddit (external id)
    P_POINT_IN_TIME = 'P585'      # point in time

    try:
        # Create main claim (quantity) and save it to the item first;
        # qualifiers and references can only be attached to a saved claim
        claim = pywikibot.Claim(repo, P_SOCIAL_FOLLOWERS)
        claim.setTarget(make_quantity(subscribers))
        item.addClaim(claim, summary='Add %s followers for r/%s (bot)' % (subscribers, subreddit))

        # Qualifier: subreddit (string external identifier)
        qual_sub = pywikibot.Claim(repo, P_SUBREDDIT)
        qual_sub.setTarget(subreddit)
        claim.addQualifier(qual_sub)

        # Qualifier: point in time (timestamp)
        now = datetime.now(tz.tzutc())
        qual_time = pywikibot.Claim(repo, P_POINT_IN_TIME)
        qual_time.setTarget(pywikibot.WbTime(year=now.year, month=now.month,
                                             day=now.day, precision=11, timezone=0))
        claim.addQualifier(qual_time)

        # Reference: link to subreddit
        ref = pywikibot.Claim(repo, 'P854')  # reference URL
        ref.setTarget('https://www.reddit.com/r/' + subreddit + '/')
        claim.addSource(ref)

        print('Added %s followers for %s (r/%s)' % (subscribers, wikidata_qid, subreddit))
    except Exception as e:
        print('Failed to add claim to %s: %s' % (wikidata_qid, e))


def get_subscriber_count(subreddit):
    try:
        sr = reddit.subreddit(subreddit)
        return sr.subscribers
    except Exception as e:
        print('Error fetching r/%s: %s' % (subreddit, e))
        return None


def fetch_items_with_subreddit_property(limit=None):
    """
    Returns list of tuples (qid, subreddit_string).
    Uses SPARQL to find items with P3984.
    """
    query = """
    SELECT ?item ?itemLabel ?subreddit WHERE {
      ?item wdt:P3984 ?subreddit .
    }
    """
    generator = pagegenerators.WikidataSPARQLPageGenerator(query, site=site)
    results = []
    count = 0
    for page in generator:
        qid = page.title()  # e.g., "Q12345"
        # read the P3984 value from the item to get the exact string
        # (handles multiple values)
        try:
            page.get()
            claims = page.claims
            if 'P3984' in claims:
                for claim in claims['P3984']:
                    try:
                        val = claim.getTarget()
                        if isinstance(val, str):
                            subreddit = val
                        else:
                            # sometimes stored as ExternalID or dict
                            subreddit = str(val)
                        results.append((qid, subreddit))
                        count += 1
                        if limit and count >= limit:
                            return results
                    except Exception:
                        continue
        except Exception as e:
            print('Failed to fetch claims for %s: %s' % (qid, e))
            continue
    return results


def main():
    # Optionally set a limit to avoid hitting rate limits during testing
    ITEMS_LIMIT = None  # or an int, e.g., 50
    items = fetch_items_with_subreddit_property(limit=ITEMS_LIMIT)
    if not items:
        print('No items with P3984 found.')
        return
    for qid, subreddit in items:
        # normalize subreddit: remove leading "r/" if present
        subreddit = subreddit.strip()
        if subreddit.lower().startswith('r/'):
            subreddit = subreddit[2:]
        subs = get_subscriber_count(subreddit)
        if subs is None:
            continue
        print('%s -> r/%s: %s' % (qid, subreddit, subs))
        add_subscriber_statement(qid, subreddit, subs)


if __name__ == '__main__':
    main()
"You must have a valid Reddit app (client id/secret) and a Wikidata login configured for pywikibot (usually via user-config.py)." The counts would, for example, enable queries for charts that involve them. -- Prototyperspective (talk) 13:55, 20 April 2026 (UTC)
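Regarding the rate-limit concern raised above: a bot run over many items should pace its requests and back off on errors. A minimal sketch, assuming the function and variable names from the script above; the pacing values are my assumptions, not Reddit's documented limits.

```python
import time

# Assumed pacing target; pick something well below the API allowance.
REQUESTS_PER_MINUTE = 60


def backoff_delay(attempt, base=2.0, cap=300.0):
    """Exponential backoff delay (seconds) for retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def throttled(iterable, per_minute=REQUESTS_PER_MINUTE):
    """Yield items while sleeping enough to stay under `per_minute` iterations."""
    interval = 60.0 / per_minute
    for item in iterable:
        start = time.monotonic()
        yield item
        elapsed = time.monotonic() - start
        if elapsed < interval:
            time.sleep(interval - elapsed)


# Usage sketch (names from the script above):
# for qid, subreddit in throttled(items):
#     subs = get_subscriber_count(subreddit)
#     ...
```

PRAW also handles Reddit's own rate-limit headers internally, so this mainly guards the overall request volume and the Wikidata edit rate.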
Request process
Request to add missing icons via logo property (2025-04-29)
Request date: 29 April 2025, by Prototyperspective
Link to discussions justifying the request
Wikidata:Project chat/Archive/2025/03#Missing icons (logo property)
Task description
Many items that have an icon in logo image (P154) do not have an image set in icon (P2910). That's an issue because sometimes logos are icons (like app icons on a phone) and sometimes they are wide, banner-like logos, as for example with Q19718090#P154 and Q50938515#P154. If one were to query "icon, and if no icon set: logo", that would result in mixed data of both these small, more or less rectangular icons and other types of logos. When using that in a table column, for example, it would make the column much wider and the images in the column inconsistent.
So I think it would be best if an icon was consistently in icon (P2910), without having to query logo image (P154). To understand what I mean, take a look at Wikidata:List of free software accounts on Bluesky, which has a nice-looking icon for nearly all entries, and compare it with Wikidata:List of free software accounts on Mastodon, where the icon is missing for most items.
Licence of data to import (if relevant)
Discussion
Is there a straightforward way to find all items missing an icon where an icon is set in logo? Would it be better to copy it to the icon property or to move it there (if unclear, I'd say just copy it)? Lastly, there is also the property small logo or icon (P8972); but if that property is to be used, shouldn't SVG files in icon always be copied to it in addition, assuming again that this property is useful and should be set? That is because SVG files (that are set in the icon and/or logo property) can always also be used as a small icon, or not? -- Prototyperspective (talk) 18:48, 29 April 2025 (UTC)
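On the question of finding all items with a logo but no icon: this can be expressed directly in SPARQL. A sketch, with the query as the substantive part (P154/P2910 are the properties from the request; the `LIMIT` and the pywikibot wiring are illustrative assumptions).

```python
# Items that have a logo image (P154) but no icon (P2910).
LOGO_NO_ICON_QUERY = """
SELECT ?item ?logo WHERE {
  ?item wdt:P154 ?logo .                          # has a logo image
  FILTER NOT EXISTS { ?item wdt:P2910 ?icon . }   # but no icon set
}
LIMIT 1000
"""


def items_missing_icon():
    """Yield ItemPages with P154 but no P2910 (requires a pywikibot setup)."""
    import pywikibot
    from pywikibot import pagegenerators
    site = pywikibot.Site("wikidata", "wikidata")
    yield from pagegenerators.WikidataSPARQLPageGenerator(
        LOGO_NO_ICON_QUERY, site=site)
```

The same query can be run directly on the Wikidata Query Service; a bot could then inspect each logo file (e.g. aspect ratio close to 1:1) before copying it to P2910.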
Note that by now many more items have been added to that free software accounts on Bluesky list, many of which do not have an icon. A simple explanation of the difference between a logo and an icon is that logos often include text and are in horizontal format, while icons are roughly square, usually have no text, and sometimes contain just one word or a letter. In far more than 50% of cases the logo can simply be copied to the icon property. Checking whether the image is roughly square probably already increases that to a percentage where it's reasonable to do this via mass-editing. Prototyperspective (talk) 11:35, 26 September 2025 (UTC)
Request process
Request to make the legal citations of Statutory Instruments an alias. (2025-09-02)
Request date: 7 September 2025, by ToxicPea
Task description
I would like to request that a bot read the value of legal citation of this text (P1031) for every item whose value of instance of (P31) is UK Statutory Instrument (Q7604686), Welsh Statutory Instrument (Q100754500), Scottish statutory instrument (Q7437991), or statutory rules of Northern Ireland (Q7604693), and make the value of legal citation of this text (P1031) an alias of that item.
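A minimal pywikibot sketch of what such a bot could do per item, assuming the list of matching items is fetched separately (e.g. via SPARQL over the four P31 values). `merge_aliases` is the pure deduplication logic; the choice of language field (`en` vs `mul`) is an open question, so it is a parameter here.

```python
# QIDs from the request: the four classes of statutory instruments.
TARGET_CLASSES = {'Q7604686', 'Q100754500', 'Q7437991', 'Q7604693'}


def merge_aliases(existing, citations):
    """Return the alias list extended with new citation strings, no duplicates."""
    merged = list(existing)
    for c in citations:
        if c and c not in merged:
            merged.append(c)
    return merged


def add_citation_aliases(item, lang='en'):
    """Add P1031 values as aliases in `lang`; assumes a pywikibot ItemPage
    that has already been loaded with item.get(). Untested sketch."""
    citations = [c.getTarget() for c in item.claims.get('P1031', [])]
    aliases = item.aliases.get(lang, [])
    new = merge_aliases(aliases, citations)
    if new != aliases:
        item.editAliases({lang: new},
                         summary='Add legal citation (P1031) as alias')
```

Since legal citations like "S.I. 2025/123" are language-independent strings, `mul` may be the better target; see the discussion.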
Discussion
EthanRobertLee
Iwan.Aucamp
Wallacegromit1
talk
08:30, 18 July 2020 (UTC)
, focus on historical and international law/legislation
reply
Belteshassar
Popperipopp
Ainali
Lore.Mazza34
Yupik
El Dubs
c960657
Cavernia
Copystar
Geertivp
Notified
participants of WikiProject Law
No objection to this? It represents more than 132,000 items. Louperivois (talk) 02:27, 20 December 2025 (UTC)
Have you decided which language field you're going to put this in? Is it universal enough to go into mul, or should it be English? Belteshassar (talk) 06:36, 20 December 2025 (UTC)
Comment @ToxicPea: It would be easier to assess the request if an example value of each of these P31s were added to the request. Ainali (talk) 07:27, 20 December 2025 (UTC)
Here is an example for UK instruments, The Occupational Pensions (Revaluation) Order 2025 (Q137038072), and for Welsh instruments, The Welsh Elections Financial Assistance Scheme (Disabled Candidates) Regulations 2025 (Q136214123). The others have the same format as the UK instruments. ToxicPea (talk) 19:57, 20 December 2025 (UTC)
Request process
Request to change genre for film adaptations to 'has characteristic' (2025-09-28)
Request date: 29 September 2025, by Gabbe
Link to discussions justifying the request
Property_talk:P136#genre_(P136)_=_film_adaptation_(Q1257444)_or_based_on_(P144)_=_item_?
Task description
For items with instance of (P31) set to film (Q11424) (or one of its subclasses) and genre (P136) set to film adaptation (Q1257444), film based on literature (Q52162262), film based on book (Q52207310), film based on a novel (Q52207399), or film based on actual events (Q28146524), the property for said statement should be changed to has characteristic (P1552). If the statements have sources or qualifiers, these should accompany them.
Similarly, for items with instance of (P31) set to television series (Q5398426) (or one of its subclasses) and genre (P136) set to television adaptation (Q101716172), television series based on a novel (Q98526239), or television series based on a video game (Q131610623), the statement should likewise be moved to has characteristic (P1552).
The reason is that "based on a book" (and so on) is not a "genre".
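An untested sketch of how such a migration could look with pywikibot: `should_move` encodes the item/genre pairs listed above (without subclass resolution, which would need an extra SPARQL or ontology step), and `move_claim` recreates the statement under P1552 before removing the P136 one. Whether `Claim.copy()` is available depends on the pywikibot version.

```python
# QIDs from the task description above.
FILM_GENRES_TO_MOVE = {'Q1257444', 'Q52162262', 'Q52207310',
                       'Q52207399', 'Q28146524'}
TV_GENRES_TO_MOVE = {'Q101716172', 'Q98526239', 'Q131610623'}


def should_move(instance_qid, genre_qid):
    """Pure check: does this (P31 value, P136 value) pair fall under the request?
    Subclasses of Q11424/Q5398426 are not resolved here."""
    if instance_qid == 'Q11424' and genre_qid in FILM_GENRES_TO_MOVE:
        return True
    if instance_qid == 'Q5398426' and genre_qid in TV_GENRES_TO_MOVE:
        return True
    return False


def move_claim(item, claim):
    """Recreate the statement under P1552 (with qualifiers/sources), then
    remove the P136 one. Untested sketch assuming a pywikibot ItemPage."""
    import pywikibot
    repo = item.site.data_repository()
    new = pywikibot.Claim(repo, 'P1552')
    new.setTarget(claim.getTarget())
    item.addClaim(new, summary='genre (P136) -> has characteristic (P1552)')
    for quals in claim.qualifiers.values():
        for q in quals:
            new.addQualifier(q.copy())
    for source in claim.sources:
        new.addSources([s.copy() for snaks in source.values() for s in snaks])
    item.removeClaims([claim], summary='Moved to has characteristic (P1552)')
```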
Discussion
Request process
Request to import links/IDs of full films available for free in YouTube (2025-09-29)
Request date: 29 September 2025, by Prototyperspective
Link to discussions justifying the request
Wikidata:Project chat/Archive/2025/02#Adding links/IDs of full movies in YouTube
Task description
This isn't a simple task, but it would be very useful: please import the links (IDs) to full movies available for free on YouTube via YouTube video ID (P1651) to the items about the films.
If this is functioning well, as a second step please expand it so that new items are also created for films that are on YouTube and IMDb but not yet in Wikidata. Maybe that should be a separate subsequent request.
For importing films, there would need to be some way of finding and adding source YT channels to import from, i.e. channels hosting such films, which the script then scans for films (e.g. film-length duration + IMDb item with matching name).
Example: Popcornflix (it has many, and some are already integrated: example). There's also a Wikipedia article about it: Popcornflix.
Example 2: PopNet
Example 3: ZDFDoku (German public broadcast channel)
One could also find lots of individual films and channels containing them by searching/scanning YouTube for e.g. "full films" (with long duration).
Complications
- Videos that are offline should get their link removed from items. This may need a separate bot request. I noticed some of the added ones are offline (and quite a few are geoblocked or just trailers – see below).
- There are many channels containing full films. Maybe there already is a list of such channels somewhere, or one creates a wiki page where people can add legitimate channels containing full films.
- I think the film should not be linked if it was uploaded less than e.g. 4 months ago, to make sure it's not some illegitimate upload.
- The language qualifier should be set. Generally, the language matches that of the video title.
- It should be specified which type of video it is: the object of statement has role (P3831) qualifier should be set to full video/film available on YouTube for free (Q133105529). This distinguishes it from film trailers and could also be used to e.g. specify when a video is one full episode of a series, set on the series item. Currently, nearly none of the items have this value set, and many trailers do not have it specified that they're trailers. This could be fixed in large part using the duration (P2047) qualifier, since long videos are usually the full film and short ones the trailer.
- If films are geoblocked in some or many regions, that should be specified (including the info where). This may require some new qualifier/property/item(s). Please comment if you have something to add for this. I think for now, or for early imports, it would be good to simply not import geoblocked videos. It may be less of an issue for non-English videos that are not geoblocked in all regions where many people watch videos in that language.
- I don't know if there is a qualifier that could be used to specify whether the film at the URL is available for free or only for purchase, but such a qualifier should also be set, to be able to distinguish these from YT videos only available for purchase.
Background: Adding this data may be very useful in the future to potentially improve WikiFlix by a lot; it currently only shows films with the full video on Commons. That is good as far as it goes, but for films from 1930 or newer, YouTube has many more free films, and this could be a UI to browse well-organized free films, including short films: a UI dedicated to films, on the Web, without having to install anything, using data in Wikidata, e.g. based on the genre. It could become one of the most useful and/or most popular real-world uses of Wikidata.
Next to each film there would be metadata from Wikidata and more, like the film description and IMDb rating; even Captain Fact (Q109017865) fact-check info could be fetched via the IDs set in the item. If no URL for the film cover is specified, it would just load the cached thumbnail as the film cover. Somewhere at the top of WikiFlix there could be a toggle button for whether to also show full free films on YouTube etc., or only the (mostly very old) public domain films, as is currently the case. Theoretically, there could also be separate sites like it, not on Toolforge, and possibly not even querying Wikidata directly. Lastly, until YT ID imports are done at a substantial scale, people could use Listeria tables like these two I recently created, which people can also use to improve the data (like adding main subjects or removing offline links) and to keep track of new additions:
Wikidata:List of documentaries available on YouTube
Wikidata:List of short films available on YouTube
There may already be tools out there for this that only need to be adjusted, such as yt-dlp, and import scripts/bots for other IDs that are being imported.
Note that later on one could use the same approach for full videos in public broadcast media centers (the properties are mostly not there yet), like the ZDF Mediathek. Also, one could import data from sites like doku-streams and fernsehserien. It would integrate full films scattered across many sites and YT channels, extend them with wiki features, and improve the usefulness of Wikidata by having items for more films.
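As a hedged illustration of the yt-dlp idea mentioned above: its Python API can fetch a video's metadata without downloading it, which covers two of the complications at once (detecting offline videos and distinguishing full films from trailers by duration). The duration thresholds are my assumptions, not values from the request.

```python
# Assumed thresholds; tune against real data.
FULL_FILM_MIN_SECONDS = 3600   # ~1 h and up: plausibly a full film
TRAILER_MAX_SECONDS = 300      # up to ~5 min: plausibly a trailer


def classify_duration(seconds):
    """Rough duration-based classification, per the P2047 idea above."""
    if seconds is None:
        return 'unknown'
    if seconds >= FULL_FILM_MIN_SECONDS:
        return 'full film'
    if seconds <= TRAILER_MAX_SECONDS:
        return 'trailer'
    return 'unclear'


def probe_video(video_id):
    """Fetch title and duration without downloading; raises if the video is
    offline or unavailable (useful for link cleanup)."""
    import yt_dlp  # assumed installed: pip install yt-dlp
    opts = {'quiet': True, 'skip_download': True}
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(
            'https://www.youtube.com/watch?v=' + video_id, download=False)
    return info.get('title'), info.get('duration')
```

Geoblocking is harder to detect this way, since availability depends on the requesting IP's region.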
Previous thread
Licence of data to import (if relevant)
Irrelevant since it's just links, but nevertheless see the discussion and point 3 under Complications.
Discussion
See the bottom paragraph for one additional type of data to import: public broadcast series & films in online media centers. This data would give Wikidata an edge over IMDb and make it uniquely useful, as IMDb only has data on a small fraction of the documentaries made by public broadcasters, and lots of these are available for free on YouTube or in their media center, in good quality. (It's harder to find and browse them on YouTube.) Maybe at some point some of them even get dubbed into other languages, so they're for example also available in English. -- Prototyperspective (talk) 15:13, 15 December 2025 (UTC)
Request process
Request to import IMDb ratings (2025-10-07)
Request date: 8 October 2025, by Prototyperspective
Link to discussions justifying the request
Property talk:P444#Most film items miss the IMDb rating
Task description
This is one of the most-used structured data numbers that many people use in their daily lives, and it is needed for any application using Wikidata for movies. One such application is WikiFlix, which could become a Netflix-alternative UI for browsing and watching freely available full films.
For example, this query of Wikidata can't work because films don't have their IMDb rating set. Not even the popular films named in Wikidata:Property proposal/IMDb rating have their IMDb rating set.
Could somebody import this data for all the films that have IMDb ID (P345) set?
Again, it would be very useful regardless of whether WikiFlix gets used a lot, and I think WikiFlix could become the main way people learn about and first use Wikidata outside of Wikipedia, where these ratings would be important data to have. Note that it also needs the qualifiers for the date and the number of user ratings.
Licence of data to import (if relevant)
Discussion
I don't think the IMDb ratings can be imported into Wikidata; IMDb's license is proprietary and restrictive, so it is incompatible with Wikidata's CC0 license. Difool (talk) 15:27, 19 November 2025 (UTC)
Not true and quite absurd. It's irrelevant what license they claim. This is just a number that you can't copyright, just like you can't copyright the factual age in years of a human. Prototyperspective (talk) 18:44, 19 November 2025 (UTC)
I understand that facts like names and phone numbers can't be copyrighted, but my doubt was whether ratings really count as facts. But what about IMDb's Conditions of Use, which state: "Robots and Screen Scraping: You may not use data mining, robots, screen scraping, or similar data gathering and extraction tools on this site, except with our express written consent"? Difool (talk) 01:21, 20 November 2025 (UTC)
Indeed, Wikimedia has to buy a license for user ratings [1]; it is proprietary data. Matthias M. (talk) 08:17, 20 November 2025 (UTC)
No, it doesn't. I think by "user ratings" they're referring to the text of user reviews. Again, it doesn't matter what IMDb claims: one can't copyright mere factual numbers like 3.2. It's like a person licensing the number of their age, or a sports organization licensing sports results, or book publishers licensing the number of pages of a book. Not possible.
If people here are so overly cautious, then maybe it needs Wikimedia legal or some users to look into this and clarify. Wikidata will get nowhere in terms of genuine usefulness beyond Wikipedia, or public adoption/use, with this super cautious approach to data. Lots of apps and tools that didn't buy a license show IMDb ratings, including DuckDuckGo and Google. Instead of immediately assuming absurd copyright claims are genuine, please first investigate whether this is actually the case (see e.g. the link above).
"You may not use data mining, robots, screen scraping, or similar data gathering and extraction tools on this site, except with our express written consent"
Good point. Is it possible to prohibit people from doing this in such a broad way? If it is, then it still seems only a risk to the user doing the import. If necessary, maybe somebody could contact IMDb to ask whether they'd be fine with Wikidata importing the scores data. Do either of you two, or others here, know of a place to ask about this? Prototyperspective (talk) 12:50, 20 November 2025 (UTC)
Maybe it's an idea to retrieve the ratings from DuckDuckGo/Google/Bing, while double-checking them against the values from IMDb? Difool (talk) 07:10, 25 November 2025 (UTC)
That's a great idea! It would work, but I don't know how it could be done technically. However, double-checking would again be scraping or similar data gathering, so it would have the same problem that one could avoid by scraping from DDG/Google/etc. One idea would be extending a gadget/user-script to show a "Refresh" button next to IMDb scores which, if clicked, would let a user manually get and add the latest score. Maybe that could be implemented in a way that is something other than "screen scraping, or similar data gathering and extraction", but I'm not sure, since the score would still somehow be extracted from the page. Or does that sentence only refer to screen scraping and similar, but not to data gathering via its API? Prototyperspective (talk) 13:35, 25 November 2025 (UTC)
A technical limitation of browser user-scripts is that they can't directly fetch pages from other websites due to CORS restrictions. Pulling data from an API would be possible, but most web search APIs either cost money or have been discontinued (such as Bing or DuckDuckGo). A JavaScript tool where the user manually enters a rating and the script then automatically adds the statements is certainly possible; I'll look into that. I've collected ratings for the IMDb Top 250 movies using Bing, Google, and DuckDuckGo, so bulk imports are also possible.
In the search results I saw IMDb, Rotten Tomatoes, Metacritic, Letterboxd, and some other ratings. Which of these sources should be included, and how should the statements look? For IMDb I saw this statement: Q107215963#P444; for Rotten Tomatoes this one: Q22905787#P444. Should references be required, and if so, what should they look like? Is it necessary to include the number of reviews/ratings as well? I was thinking about not adding ratings if there's one already present and it's not older than, say, one year. Difool (talk) 15:47, 27 November 2025 (UTC)
"A JavaScript where the user manually enters a rating and the script then automatically adds the statements" – I don't see how a script would be useful if the user already needs to look up and enter the rating manually. If it did both, it would be useful.
"Pulling data from an API would be possible, but most web search APIs either cost money or have been discontinued" – No, the data would be collected by some user/bot to compile a small database of scores for all films that IMDb has a page on. Then the data would be imported from it. This could be done without an API by letting the bot do e.g. a Google search for all the original film titles of the films in Wikidata and, if the Google website displays a score, extracting it from there.
"I've collected ratings for the IMDb Top 250" – It would be great if you added them, but that's just a tiny fraction and doesn't solve the issue. Other users could maybe collect several orders of magnitude more scores for items. I don't know how IMDb ratings are gathered in Kodi, by the way, but they display for every item if you configure Kodi accordingly.
"Which of these sources should be included" – All 4 of these would be useful.
"Should references be required" – Would be good imo. Just the URL; the time is set in the score qualifier.
"number of reviews/ratings" – Would be good, and it may be good to require it.
"not adding reviews if there's one already present and its not older than say one year" – Good question. I think updating at most annually sounds reasonable, except during the first 6 months or so after the release. Prototyperspective (talk) 15:51, 1 December 2025 (UTC)
I did create a JavaScript to manually add scores; see User:Difool/AddReviewScores.js. Maybe you could try it out; see for example [2]. Some things I encountered that I didn't think of beforehand:
- Critic review scores (such as the Tomatometer) don't change after a certain period following a film's release, so they don't need to be updated.
- Property number of reviews/ratings (P7887) is stored as a string, but the formatting is ambiguous. For example, IMDb displays values like "465K"; should this be written as "465000" instead?
- If you have multiple scores from the same provider, the most recent one should be set to preferred rank. And if so, then scores from other providers need to be set to preferred rank too.
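If the "465K"-style display values are to be normalized to plain integers for P7887, a small parser suffices. A sketch; the suffix table covers the abbreviations IMDb currently displays, which is an assumption about its formatting.

```python
# Multipliers for abbreviated count suffixes as displayed by IMDb.
_SUFFIXES = {'K': 1_000, 'M': 1_000_000}


def parse_abbreviated_count(text):
    """Normalize an abbreviated count string to an int:
    '465K' -> 465000, '2.1M' -> 2100000, '1,234' -> 1234."""
    s = text.strip().replace(',', '')
    if s and s[-1].upper() in _SUFFIXES:
        return int(float(s[:-1]) * _SUFFIXES[s[-1].upper()])
    return int(s)
```

Note that "465K" only promises three significant digits, so (as discussed above) it may be worth recording the reduced precision in a qualifier.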
"I don't know how IMDb ratings are gathered in Kodi" – I checked the code and found that they scrape IMDb's website directly. Other data is retrieved from The Movie Database API using a fixed key. Difool (talk) 01:42, 19 December 2025 (UTC)
That's great, thanks a lot!
Nevertheless, I don't think a user-script is a sufficient or the best approach to this. It's more like a workaround until something is built that imports the data at scale. I still think this
needs to be done via some large-scale script import, e.g. by scraping Google results or scraping another website that shows the IMDb ratings
I tried the script but it doesn't work: tried just adding an IMDb rating but it doesn't edit the item. This error is in the Firefox console:
Uncaught ReferenceError: showError is not defined
I thought the script would pull the rating and count from IMDb if you click the button. Don't know if that already falls under
data gathering and extraction tools
in IMDb's use policy; maybe somebody here knows. That way would be much better.
number of reviews/ratings
(P7887)
should, I guess, then be changed or not? Yes, I think 465K should be written as a number (the K there mostly indicates precision, which may be worth specifying as a qualifier). Setting the preferred rank and removing it from earlier values is something the script could/should also do if it's to be used widely. And is there really no property for Rotten Tomatoes'
Popcornmeter score
(Q131100566)
yet (do you know if it has been proposed)?
I checked the code and found that…
Thanks! In my opinion it would be best if we did the same or at least investigated thoroughly if/how the former could be done and then do that. See for example
[3]
[4]
[5]
, and
OMDb API
. The latter seems to work quite well for getting scores of films based on IMDb ID.
Prototyperspective
talk
19:24, 19 December 2025 (UTC)
reply
It's absolutely possible to build a tool that mass-scrapes IMDb pages to retrieve scores, but the problem is that Wikidata publishes all its contents under CC0 and doesn't just show them to a user like Kodi does. IMDb explicitly prohibits scraping, though I'm not sure how that holds up legally (I have seen "Web scraping is legal if you scrape data that is publicly available on the internet", but maybe the republishing is a problem). Before writing a scraping tool, I'd want to be certain that it's legally permissible. Maybe you can think of a way to make sure of this, for example by consulting Wikimedia Legal.
Pulling ratings directly from IMDb (inside the browser, originating from Wikidata!) isn't technically possible (at least I don't know how to do it) because of CORS restrictions.
Given these limitations, entering the scores manually with a helper tool is a safe starting path. I want to make sure we get the scores and references right so they're consistent. At the moment it's rather laborious to do this manually, and as far as I can see it isn't documented (for example
on the Movies project page
) how to do it consistently.
I fixed the error you mentioned: the script expects you to fill in all input fields; if you don't want that, you need to remove the unused rows.
On
number of reviews/ratings
(P7887)
: yes, writing the full number out seems the most reasonable approach (465000 instead of 465K).
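A conversion helper for such abbreviated counts could look like this (a sketch; the K/M suffix convention is assumed from IMDb's display format):

```python
def expand_abbreviated_count(s):
    """Convert an abbreviated count like '465K' or '1.2M' to an integer.

    Assumes IMDb-style suffixes; plain digit strings pass through unchanged.
    """
    s = s.strip()
    suffixes = {"K": 10**3, "M": 10**6, "B": 10**9}
    if s and s[-1].upper() in suffixes:
        return int(float(s[:-1]) * suffixes[s[-1].upper()])
    return int(s)
```

Since the suffix rounds the real count, the lost precision could optionally be recorded in a qualifier, as discussed above.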
For
Popcornmeter score
(Q131100566)
, you can look at
Unleashed
(Q27959497)
, where the score was added by the now‑defunct
RottenBot
, as an example.
Difool
talk
03:19, 20 December 2025 (UTC)
reply
Good points, thanks. I don't know of a way, but maybe you could mail these questions. The script now works after the latest changes and is a good starting point, but it doesn't really scale. Why is RottenBot defunct – could it be revived?
I've now heard that one can get a one-month free trial of API access to IMDb's metadata. That should be enough to get all the ratings in if the importer is coded well. So that may be the best route, and another user could use another trial at a later point in time to get ratings for new film items added since the last import.
Prototyperspective
talk
19:14, 10 February 2026 (UTC)
reply
I agree this would be very nice to have, and I suggest contacting IMDb and asking nicely for a dump file with id, rating and number of user ratings.
Since we link to them, it will generate traffic for them and help them make money. That should be reason enough for them to willingly publish this as CC0 data for anyone to incorporate.
It's basically a win-win situation and their PR department could spin it as helping the open data ecosystem and making reviews of our fantastic movie art a part of the UN global digital heritage, and so forth...
So9q
talk
17:35, 28 November 2025 (UTC)
reply
Request process
Request to specify language of images, videos & audios (2025-11-07)
edit
Request date: 8 November 2025, by
Prototyperspective
Link to discussions justifying the request
Task description
Media files in items can be in any language but often the language is not specified. The language qualifier is used for example by the Wikidata infobox on Commons where it displays the file in the user's language if available.
Please set
language of work or name
(P407)
for videos and audios as well as images like diagrams with text in them via the metadata in
Commons
categories.
There are multiple properties that can get media files set such as
video
(P10)
and
schematic
(P5555)
See
c:Category:Audio files by language
c:Category:Information graphics by language
c:Category:Images by language
and
c:Category:Videos by language
I already did this for
c:Category:Spoken English Wikipedia
files and for
c:Category:Videos in German
and a few other languages.
I described step-by-step how I did it
here
on the talk page of wish
Suggest media set in Wikidata items for their Wikipedia articles
which is another example of how this data could be useful/used (and there are many more for why specifying the language of files is important).
That was 1) a largely slow and manual process, 2) only done for a few languages, and 3) not done periodically as a bot could do it. One can't possibly check 300 language categories for 3 media types every second month or so. A challenge could be
miscategorizations
– however, these are very rare, especially for files not underneath the large category 'Videos in English' – and the bot would set multiple languages on the files, so one could scan all files that have multiple languages set and fix them (unless the file is indeed multilingual).
Here
is another procedure including SPARQL (thanks to
Zache
) that uses Commons categories to set media file qualifiers in Wikidata, specifically the recording date of audio versions of Wikipedia articles (important metadata e.g. as many are outdated by over a decade). Maybe some of this is useful here too.
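As a sketch of the category-to-qualifier step, something like the following could drive the bot (the language-item table here is a tiny illustrative sample, not a complete mapping, and the actual qualifier edit via the API is left out):

```python
import re

# Small illustrative sample of language name -> Wikidata language item;
# a real bot would need the full table for all ~300 language categories.
LANGUAGE_ITEMS = {"German": "Q188", "English": "Q1860", "Spanish": "Q1321"}

def language_qid_for_category(category):
    """Derive the P407 qualifier value from a Commons '<media> in <language>'
    category name, or return None if the category doesn't match the pattern
    or the language is not in the lookup table."""
    m = re.match(r"(?:Videos|Images|Audio files) in (.+)$", category)
    if m:
        return LANGUAGE_ITEMS.get(m.group(1))
    return None
```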
Licence of data to import (if relevant)
(not relevant)
Discussion
Oppose
Data about a file belongs in the file's own structured data on Commons. -
Nikki
talk
05:49, 26 January 2026 (UTC)
reply
Your comment comes from a misunderstanding of Wikidata. Images, videos and audios have a language qualifier set on the Wikidata property. Many files have it set, but not all. Structured data on Commons is extremely incomplete, especially when it comes to language metadata, where nearly no files have it specified – not even close to Commons categories such as
c:Category:Videos by language
and, more importantly, SD could be used to set the qualifier – if it's set, that doesn't magically display in the qualifier on Commons. Before commenting, please first learn about what is possible and what the current state of things is, thanks.
Prototyperspective
talk
15:15, 26 January 2026 (UTC)
reply
Request process
Request to import subject named as & tags for Stack Exchange sites (2025-11-10)
edit
Request date: 10 November 2025, by
Prototyperspective
Link to discussions justifying the request
Task description
Could somebody please import the
subject named as
(P1810)
qualifiers for
Stack Exchange site URL
(P6541)
This could then be displayed in a new column at
Wikidata:List of StackExchange sites
and then one could sort the table by it and compare it to
to add all the missing Stack Exchange sites
to items.
It would also be good if
Stack Exchange tag
(P1482)
were imported as well as they are mostly just set for StackOverflow but not other Stack Exchange sites. For an example, I added two other tag URLs to
Q1033951#P1482
I think these things could be done with a script. The sites page linked above has all this info on one page, and maybe there is another, more structured format of it, or a list of these sites that includes URL and name. One could also have the script open the URLs and then import the page title as subject named as. For the tags, one could check for a tag named exactly like the item or, if present, like the linked StackOverflow tag.
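For the import step, a sketch: assuming the sites have already been matched to items (e.g. via Stack Exchange's API `/sites` endpoint, which returns site names and URLs, though the exact field names are not verified here), the QuickStatements rows could be generated like this:

```python
def quickstatements_rows(matches):
    """Build QuickStatements v1 rows that add a 'subject named as' (P1810)
    qualifier to the 'Stack Exchange site URL' (P6541) statement.

    matches: list of (QID, site_url, site_name) tuples, already matched
    to items by hand or via the site URL.
    """
    rows = []
    for qid, url, name in matches:
        # QID <tab> P6541 <tab> "url" <tab> P1810 <tab> "name"
        rows.append(f'{qid}\tP6541\t"{url}"\tP1810\t"{name}"')
    return rows
```

The same loop could emit P1482 rows when a matching tag is found.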
Licence of data to import (if relevant)
(not relevant)
Discussion
Once this is completed, I or another user could go through the table with the
in a window on the side to note down all the stackexchange sites not yet in Wikidata. This would enable completion of the StackExchange site data / interlinking on Wikidata which is useful for all kinds of things. --
Prototyperspective
talk
17:37, 1 February 2026 (UTC)
reply
Request process
Request to import data on Linux distributions (2025-11-19)
edit
Request date: 19 November 2025, by
Prototyperspective
Could the remaining items please be imported from DistroWatch in some way?
Link to discussions justifying the request
the reddit post and
User talk:Matthias M.#DistroWatch Mix'n'Match incomplete
Task description
After posting about
Wikidata:List of Linux distributions
(764
items)
on reddit
, a user told me that there are
1,110
distributions in DistroWatch's database
Import could be done, for example, by scraping the site, then formatting the results, and then running QuickStatements to create the items with the data, but I don't know how this would best be done. One could also add some missing data to the existing items.
For each distribution, DistroWatch has for example the
official website URL
(P856)
, the distro it is
based on
(P144)
, the supported desktop environments (
GUI toolkit or framework
(P1414)
), etc
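A possible shape for the QuickStatements generation step (a sketch only; Q131669 is assumed here as the 'Linux distribution' class, and matching against existing items would have to happen first to avoid duplicates):

```python
def distro_to_quickstatements(name, website=None, based_on_qid=None):
    """Turn scraped DistroWatch fields into QuickStatements v1 commands
    creating a new item. The scraping itself is not shown; this only
    formats already-extracted fields."""
    lines = [
        "CREATE",
        f'LAST\tLen\t"{name}"',      # English label
        "LAST\tP31\tQ131669",        # instance of: Linux distribution (assumed QID)
    ]
    if website:
        lines.append(f'LAST\tP856\t"{website}"')   # official website
    if based_on_qid:
        lines.append(f"LAST\tP144\t{based_on_qid}")  # based on
    return lines
```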
Licence of data to import (if relevant)
Discussion
This has to wait until we have matched all the existing Linux distributions; otherwise, this risks creating duplicates. Also, the amount is so small and the data so unstructured (HTML tables) that it is better done manually with oversight anyway.
Matthias M.
talk
14:05, 20 November 2025 (UTC)
reply
Here is the Mix'n'Match catalog for Linux distributions
(created by you, thanks). I suppose this is what you're referring to? Those all seem to be matched, so what do you mean? It's a few hundred items, so I don't think it's a small number. Maybe somebody already has a tool or script that can be adapted to do this with little change required.
the data so unstructured (HTML tables)
does DistroWatch have some API or export functionality? If not, maybe one could scrape it and then have some tool, possibly an AI-based one, convert the data into an importable format.
Prototyperspective
talk
14:29, 20 November 2025 (UTC)
reply
Request process
Request to Move abused P31 properties to P1552 during property proposals (2025-12-02)
edit
Request date: 2 December 2025, by
Immanuelle
Link to discussions justifying the request
Wikidata:Property_proposal/Engishiki_Funding_Category
Wikidata_talk:WikiProject_Shinto#Relevant_property_proposals
Task description
Go through all items that have
instance of
(P31)
of any of these values, and then move it to
has characteristic
(P1552)
Preserve all qualifiers
List:
Kokuhei-sha
(Q135160342)
Kanpei-sha
(Q135160338)
Shrines receiving Hoe and Quiver
(Q135009152)
Shrines receiving Hoe offering
(Q135009205)
Shrines receiving Quiver offering
(Q135009221)
Shrines receiving Tsukinami-sai and Niiname-sai offerings
(Q135009132)
Shrines receiving Tsukinami-sai and Niiname-sai and Ainame-sai offerings
(Q135009157)
Shikinai Shōsha
(Q134917287)
Shikinai Taisha
(Q134917288)
Myōjin Taisha
(Q9610964)
Junior First Rank
(Q11071121)
Junior Third Rank
(Q11071123)
Junior Fifth Rank
(Q11071125)
Junior Fourth Rank
(Q11071127)
Senior First Rank
(Q11123258)
Senior Third Rank
(Q11123261)
Senior Second Rank
(Q11123277)
Senior Fifth Rank
(Q11123280)
Senior Fourth Rank
(Q11123338)
Third Rank
(Q11354375)
Second Rank
(Q11371333)
Fourth Rank
(Q11419606)
Greater Initial Rank
(Q11433041)
Lesser Initial Rank
(Q11464527)
Junior Seventh Rank
(Q11488718)
Junior Ninth Rank
(Q11488719)
Junior Eighth Rank
(Q11488720)
Junior Second Rank
(Q11488721)
Unranked
(Q11504610)
Senior Seventh Rank
(Q11545345)
Senior Ninth Rank
(Q11545350)
Senior Eighth Rank
(Q11545368)
Senior Sixth Rank
(Q11545372)
Junior Sixth Rank
(Q14624983)
Licence of data to import (if relevant)
Discussion
I believe that doing this will make the items substantially more usable in the short term while the following property proposals are being processed
Wikidata:Property proposal/Engishiki Funding Category
Wikidata:Property proposal/Engishiki Rank
Wikidata:Property proposal/Divine Rank
P31 was abused for this purpose in an old bot import and is not the right property for them, even provisionally.
Immanuelle
talk
08:56, 2 December 2025 (UTC)
reply
Request process
I'm inclined to wait for the closure of the property proposals. These are definitely not suitable P31 values, but they have been there for several months, so they can stay for a couple of days/weeks more, after which we will do things the right way by moving the declarations to their final destination.
Louperivois
talk
23:52, 3 December 2025 (UTC)
reply
Louperivois
the request on
Wikidata:Property proposal/Divine Rank
was approved although not created.
Immanuelle
talk
23:40, 16 December 2025 (UTC)
reply
Yeah it’s finished now.
Immanuelle
talk
22:06, 18 December 2025 (UTC)
reply
Hello, can you point me to which of the aforementioned ranks are going to
Japanese court rank
(P14005)
and which to "Engishiki Rank", which has not been created yet.
Louperivois
talk
22:25, 18 December 2025 (UTC)
reply
Louperivois
these are the ones that are going to
Japanese court rank
(P14005)
Unranked
(Q11504610)
Lesser Initial Rank
(Q11464527)
Greater Initial Rank
(Q11433041)
Junior Ninth Rank
(Q11488719)
Senior Ninth Rank
(Q11545350)
Junior Eighth Rank
(Q11488720)
Senior Eighth Rank
(Q11545368)
Junior Seventh Rank
(Q11488718)
Senior Seventh Rank
(Q11545345)
Junior Sixth Rank
(Q14624983)
Senior Sixth Rank
(Q11545372)
Junior Fifth Rank
(Q11071125)
Senior Fifth Rank
(Q11123280)
Fourth Rank
(Q11419606)
Junior Fourth Rank
(Q11071127)
Senior Fourth Rank
(Q11123338)
Third Rank
(Q11354375)
Junior Third Rank
(Q11071123)
Senior Third Rank
(Q11123261)
Second Rank
(Q11371333)
Junior Second Rank
(Q11488721)
Senior Second Rank
(Q11123277)
Junior First Rank
(Q11071121)
Senior First Rank
(Q11123258)
The ones going to Engishiki rank (uncreated) are
Shikinai Shōsha
(Q134917287)
Shikinai Taisha
(Q134917288)
Myōjin Taisha
(Q9610964)
And the ones going to Ritsuryo funding category (uncreated) are
Kokuhei-sha
(Q135160342)
Kanpei-sha
(Q135160338)
Shrines receiving Hoe and Quiver
(Q135009152)
Shrines receiving Hoe offering
(Q135009205)
Shrines receiving Quiver offering
(Q135009221)
Shrines receiving Tsukinami-sai and Niiname-sai offerings
(Q135009132)
Shrines receiving Tsukinami-sai and Niiname-sai and Ainame-sai offerings
(Q135009157)
Immanuelle
talk
00:47, 19 December 2025 (UTC)
reply
Immanuelle
Done for P14005.
Louperivois
talk
13:52, 19 December 2025 (UTC)
reply
Additional property proposal in question
Wikidata:Property proposal/Gifu Prefectural Shrine Association ranking
and
Kinpei-sha
(Q119929592)
Ginpei-sha
(Q137886068)
Hakuhei-sha
(Q137886071)
Immanuelle
talk
19:45, 26 January 2026 (UTC)
reply
This one is indicated with
has characteristic
(P1552)
, not P31, it is for the future property proposal.
Immanuelle
talk
04:30, 26 February 2026 (UTC)
reply
Request to remove underspecified types of Ritsuryō funding (2025-12-22)
edit
Request date: 22 December 2025, by
Immanuelle
Link to discussions justifying the request
Wikidata:Property_proposal/Engishiki_Funding_Category
Task description
for all instances of
Shrines receiving Hoe and Quiver
(Q135009152)
Shrines receiving Hoe offering
(Q135009205)
Shrines receiving Quiver offering
(Q135009221)
Shrines receiving Tsukinami-sai and Niiname-sai offerings
(Q135009132)
Shrines receiving Tsukinami-sai and Niiname-sai and Ainame-sai offerings
(Q135009157)
please remove the
instance of
(P31)
Kanpei-sha
(Q135160338)
. Please do this even if the removed statement has a source or qualifiers on it.
Licence of data to import (if relevant)
Discussion
Doing this will satisfy the single-value constraint in the property proposal
Wikidata:Property_proposal/Engishiki_Funding_Category
Immanuelle
talk
23:03, 22 December 2025 (UTC)
reply
I would like to wait until the property proposal is accepted. --
Ameisenigel
talk
10:06, 10 January 2026 (UTC)
reply
Request process
Request to connect unconnected disambiguation pages to wikidata items by a bot (2025-12-26)
edit
Request date: 26 December 2025, by
M2k~dewiki
Link to discussions justifying the request
Hello, in the past,
User:PLbot
resp.
User:DeltaBot
created new disambiguation wikidata items if not yet existing for unconnected disambiguation pages
connected disambiguation pages to existing disambiguation wikidata items if existing
in various project languages
Since the scripts had problems with disambiguation pages whose page titles include brackets, for example:
page title + "(disambiguation)"
page title + "(Begriffsklärung)"
page title + "(desambiguación)"
page title + "(flertydig)"
...
and created new Wikidata objects for every different (...) page title instead of connecting them to already existing items, this task of the bot was stopped in September 2025:
User_talk:DeltaBot#Creating_items_that_have_"(disambiguation)"_in_their_titles
User_talk:DeltaBot#Duplicate_item_creations
Wikidata:Administrators'_noticeboard/Archive/2025/09#User_talk:DeltaBot#Duplicate_item_creations
Task description
My request would be to modify/adapt/adopt this task, so it does not create duplicates any more.
Existing Code can be found at:
The bot should remove the part in brackets (...) before checking for the existence of Wikidata items.
duplicate disambiguation items created in the past should be merged
NEW: the disambiguation items could link to the
family name and/or first name items
if existing, using
Property:P1889
, for example:
d:Q252067
<-->
d:Q4925932
<-->
d:Q2696104
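The bracket-stripping step before the item lookup could be as simple as this sketch (the lookup against Wikidata itself is left out):

```python
import re

def base_title(title):
    """Strip a trailing parenthetical like ' (disambiguation)' or
    ' (Begriffsklärung)' from a page title, so the lookup for an
    existing disambiguation item uses the base title instead of
    creating a duplicate."""
    return re.sub(r"\s*\([^()]*\)\s*$", "", title)
```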
Examples of unconnected disambiguation pages:
English Wikipedia
Spanish Wikipedia
MisterSynergy
Mike Peel
: for information.
Thanks a lot! --
M2k~dewiki
talk
14:19, 26 December 2025 (UTC)
reply
Discussion
M2k~dewiki
User:Pi bot
already does this as part of
Wikidata:Requests for permissions/Bot/Pi bot 19
and
the Wikidata game
? This probably explains why there's recently been quite a backlog of disambig items in particular in the Wikidata game... Thanks.
Mike Peel
talk
17:05, 26 December 2025 (UTC)
reply
Hello, currently there is a backlog of about 360 unconnected disambiguation pages for the english language wikipedia:
There might also be a backlog for disambiguation items in the about 300 other project languages, e.g. for the Spanish language Wikipedia:
Currently there is a backlog of 5000 unconnected pages in the english language wikipedia:
There are about 450 unconnected biographical articles in the english language wikipedia:
For connecting items using QuickStatements the following scripts can be used:
M2k~dewiki
talk
17:33, 26 December 2025 (UTC)
reply
M2k~dewiki
: There are 8308 potential matches between English Wikipedia and Wikidata currently waiting for human review in
the game
, so the numbers you're citing here aren't too worrying from my point of view. As people work through matches in the game, they should go down. Thanks.
Mike Peel
talk
17:40, 26 December 2025 (UTC)
reply
Since 2023 the number of unconnected pages has been increasing:
M2k~dewiki
talk
17:45, 26 December 2025 (UTC)
reply
M2k~dewiki
: The backlog in the game is now cleared, and the numbers in Duplicity for enwiki have dropped significantly. Remember that with this setup it will never get to zero, as Pi bot doesn't create new items until 14 days after the article's been created, or 7 days since it was last edited, to give people time to match them up with existing items that the bot hasn't found. There shouldn't be any obvious matches left, though. Thanks.
Mike Peel
talk
17:15, 16 January 2026 (UTC)
reply
Request process
Request to remove copyvio links (2026-01-13)
edit
Request date: 13 January 2026, by
LaundryPizza03
Link to discussions justifying the request
m:Talk:Spam_blacklist#worldradiohistory.com
Task description
Remove links to "worldradiohistory.com" and "chakoteya.net" across all Wikidata pages. LinkSearch counts
83 uses across Wikidata for worldradiohistory.com
, and
107 for chakoteya.net
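A sketch of the filtering step, assuming the bot walks over URL-valued claims (gathering them via LinkSearch or a dump is left out):

```python
from urllib.parse import urlparse

def links_to_remove(claims, blocked_domains):
    """Given (property, url) claim pairs, return those pointing at a
    blacklisted domain, matching the domain itself and any subdomain."""
    bad = []
    for prop, url in claims:
        host = urlparse(url).hostname or ""
        if any(host == d or host.endswith("." + d) for d in blocked_domains):
            bad.append((prop, url))
    return bad
```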
Licence of data to import (if relevant)
Discussion
Request process
Request to add カミノヤシロ to Old Japanese katakanizations of shrine names (2026-02-26)
edit
Request date: 26 February 2026, by
Immanuelle
Link to discussions justifying the request
Wikidata_talk:WikiProject_Shinto#Cleaning_up_Engishiki_kana
Task description
Use
this SPARQL query
to find such data. In the
name in kana
(P1814)
qualifier please edit it to append "カミノヤシロ".
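The append step could be sketched like this (a hypothetical helper; the actual edit would go through the P1814 qualifier found by the SPARQL query, e.g. via QuickStatements or pywikibot):

```python
def append_kami_no_yashiro(kana):
    """Append 'カミノヤシロ' to a name-in-kana value, skipping values
    that already end with it so the edit is idempotent."""
    suffix = "カミノヤシロ"
    return kana if kana.endswith(suffix) else kana + suffix
```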
Licence of data to import (if relevant)
Discussion
As per discussion changed the preferred reading
Immanuelle
talk
04:18, 8 March 2026 (UTC)
reply
A bit of a warning. As per
this discussion
there is a bit of inconsistency on the data to be edited. I am going to try to propose some additional steps that will likely be best done before this for optimal coverage.
Basically an earlier mass edit missed some pages, so many
instance of
(P31)
Disputed Shikinaisha or Shikigeisha
(Q135038714)
did not get their name in kana moved to official name like this.
Immanuelle
talk
20:22, 10 March 2026 (UTC)
reply
For this, the same katakana change is to be made, but it is not in a qualifier. However, the name in kana should be moved to be a qualifier of the official name
Immanuelle
talk
17:31, 14 March 2026 (UTC)
reply
Request process
Request to add
KCI article ID
(P14184)
from
work available at URL
(P953)
(2026-03-14)
edit
Request date: 14 March 2026, by
Toolipo
Link to discussions justifying the request
There are a lot of items with KCI article url in
work available at URL
(P953)
but without
KCI article ID
(P14184)
Task description
If you search for "www.kci.go.kr/kciportal/landing/article.kci?arti_id=" -P14184, you can find such items. Extract the KCI article ID from P953 and add it to P14184.
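The extraction step could look like this (a sketch; only the arti_id query parameter of KCI landing-page URLs is read):

```python
from urllib.parse import urlparse, parse_qs

def kci_article_id(url):
    """Extract the KCI article ID (the arti_id parameter) from a P953 URL,
    or return None if the URL is not a KCI article landing page."""
    parsed = urlparse(url)
    if parsed.hostname == "www.kci.go.kr" and parsed.path.endswith("article.kci"):
        return parse_qs(parsed.query).get("arti_id", [None])[0]
    return None
```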
Licence of data to import (if relevant)
Discussion
Request process
BRFA
filed
Alex NB IT
talk
19:47, 12 April 2026 (UTC)
reply
Request to .. (2026-04-20)
edit
Request date: 20 April 2026, by
Paucabot
Link to discussions justifying the request
We recently created the new property
perspective
(P14345) (
Wikidata:Property proposal/anatomical view
).
Task description
With
Rebot
, I plan to move:
About
570
statements from
depicts
(P180)
to
perspective
(P14345)
as a property
About
1400
statements from
depicts
(P180)
to
perspective
(P14345)
as a qualifier of
image
(P18)
Thanks,
Paucabot
talk
15:29, 20 April 2026 (UTC)
reply
Licence of data to import (if relevant)
Discussion
Request process
Request to revert destructive bot edits (2026-04-21)
edit
Request date: 21 April 2026, by
E L Yekutiel
Link to discussions justifying the request
Wikidata:Administrators' noticeboard#Malfunctioning bot
Task description
Yesterday, the bot
User:Tegebot
performed the following procedure hundreds (thousands? I didn't count) of times, sometimes together with
User:Nurbot
Create a new item.
Add a sitelink to ttwiki.
Merge an existing item into the newly created item
, leaving a redirect.
The correct procedure would have been to add the sitelink to the existing item; in any case, if a newly created item is found to be the same as an existing one, the newer one should be merged into the existing one, not the other way around. Merging existing items into new ones is a good way to break workflows.
This has all happened in a period of about 3 hours yesterday (April 20th), and following my report in WD:AN, the bot was blocked (apparently it was not an approved bot). These are exactly all of the edits from that date, if it helps (some of the new item creation & ttwiki sitelinking were performed by
User:Nurbot
, at the same date).
I was told in WD:AN that these edits should be reverted (by a bot due to the quantity), so I'm writing this request here.
The previous edits of both bots were the addition of thousands of ttwiki sitelinks to existing items, which sounds legitimate to me (Nurbot went back to doing that today); I don't think that those edits should be reverted.
(Of course, I assume that the April 20th edits were not performed as intentional vandalism, but as some careless mistake; nevertheless, that mistake should still be fixed).
Thanks to
User:Mbkv717
for noticing and reporting on our local wiki, and to
Yona
for pointing me here.
Licence of data to import (if relevant)
Discussion
Huge mess. Does anyone know who the maintainer of User:Tegebot is? —
MisterSynergy
talk
17:01, 21 April 2026 (UTC)
reply
maybe
User:Il Nur
Neriah
talk
18:05, 22 April 2026 (UTC)
reply
Request process
Request to periodically mass-delete surely non-notable items (2026-04-21)
edit
Request date: 21 April 2026, by
Epìdosis
Link to discussions justifying the request
WD:N
(see also
Wikidata:Requests for comment/Notability policy reform
Task description
Considering the current Notability policy, and the current discussion about its reform, I think it would probably be useful to have an admin-bot periodically deleting surely non-notable items; in order to find them, and to exclude any kind of possible false positive, I consider it 100% safe to delete items meeting
all
the following conditions:
no sitelinks -
?item wikibase:sitelinks 0
(so not notable per N1)
no sources (so not notable per N2), specifically:
no references in any statements -
MINUS { ?item ?p ?st . ?st prov:wasDerivedFrom ?ref }
no external identifiers -
?item wikibase:identifiers 0
no
described by source
(P1343)
MINUS { ?item wdt:P1343 ?ds }
no
described at URL
(P973)
MINUS { ?item wdt:P973 ?du }
no incoming links from other items -
MINUS { ?item2 ?r ?item . ?property wikibase:directClaim ?r }
(so not notable per N3)
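Assembled into one query, the criteria above look roughly like this sketch (a Python helper building the SPARQL string; the extra P31 restriction is the workaround for the timeout, and the fragments are the same ones listed above):

```python
def non_notable_query(p31_qid):
    """Build the SPARQL query combining all the deletion criteria,
    restricted to one instance-of class to keep WDQS from timing out.
    Note: {{ and }} escape literal braces inside the f-string."""
    return f"""SELECT ?item WHERE {{
  ?item wdt:P31 wd:{p31_qid} ;
        wikibase:sitelinks 0 ;
        wikibase:identifiers 0 .
  MINUS {{ ?item ?p ?st . ?st prov:wasDerivedFrom ?ref }}
  MINUS {{ ?item wdt:P1343 ?ds }}
  MINUS {{ ?item wdt:P973 ?du }}
  MINUS {{ ?item2 ?r ?item . ?property wikibase:directClaim ?r }}
}}"""
```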
All these criteria are encoded in
which unfortunately times out; to avoid this, adding one restrictive statement is necessary, like e.g. in
. I think the total results might be at least tens of thousands (note: a much smaller list of surely deletable items is
User:Pasleim/notability
, updated by @
DeltaBot
: managed by @
Pasleim
MisterSynergy
:).
I think the bot should run daily (or weekly) and:
delete the aforementioned items, filtering only those created more than 48 hours ago, setting as motivation "Does not meet the
notability policy
" + one of the labels of the item
ideally, the bot should check whether the item was created as part of an
edit group
keep a log page with some statistics (e.g. number of deleted items per day and month; number of deleted items per creator; edit groups with more than 100 deleted items)
send a standard message (I suggest codifying it in a template) to the creator of the item, specifying the list of deleted items, each with QID and label, and how to request further information or the undeletion; if some of the items were created as part of an edit group, this should be mentioned in the message; in order not to flood user talk pages, each message should be about all the items deleted in the same run of the bot; if the list is very long (e.g. above 100 items), it should be substituted by the total number of deleted items and a sample of e.g. 10 items
note: for the first run of the bot, which might take more than one day, I suggest sending the message only at the end of it
Discussion
This process is describing "surely incomplete items", not "surely non-notable items". There is quite a big difference.--
Pharos
talk
18:01, 21 April 2026 (UTC)
reply
Kinda risky and aggressive. We do continuously look after quite a bunch of non-notable items and delete them without notice. I don’t see a need for this kind of automated mass-deletion right now. —
MisterSynergy
talk
18:13, 21 April 2026 (UTC)
reply
Not sure this would be good. The current main problems of WD are a lack of data, not the opposite, and technical issues with scalability as they relate to the former. If these items are deleted, I think one should at least add some warning note to the item at least 30 days before deletion. --
Prototyperspective
talk
14:40, 22 April 2026 (UTC)
reply
Request process