⚓ T414338 FY25-26 WE5.4.12: Identify the provenance of image requests
Open, Needs Triage
Public
Assigned To
matmarex
Authored By
JTweed-WMF
Jan 12 2026, 1:21 PM
2026-01-12 13:21:05 (UTC+0)
Tags
MediaWiki-Platform-Team (Q3 Kanban Board)
(Epic in Progress)
OKR-Work
(Backlog)
Epic
MediaWiki-File-management
(Backlog)
Commons
(Incoming)
MW-1.46-notes (1.46.0-wmf.19; 2026-03-10)
Patch-For-Review
Referenced Files
None
Subscribers
Aklapper
BCornwall
CDanis
Joe
JTweed-WMF
Krinkle
Mr._Starfleet_Command
Tgr
Description
Problem
As part of the work under
WE5.4
to protect our infrastructure from abusive scraping, we want to be able to understand the provenance of image requests. This means being able to distinguish when and where a URL to an image was generated.
This will allow us to use this information as a signal in request filtering at the CDN, by helping to determine if a request is coming from a browser session visiting the website, an API query, from dumps or if they are the result of hotlinking.
Approach
Generate signed URLs for image requests, by adding query parameters that contain the provenance information and a signature that can be trivially validated at the CDN. The signature should be an HMAC that includes the URL, source (web, api, dumps), timestamp and a secret.
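A minimal sketch of the signing step described above, in Python. The parameter names (src, ts, sig), the message layout, the example secret, and the choice of HMAC-SHA256 are all assumptions made for illustration; the task leaves the signature contents and algorithm to be agreed with SRE.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit

SECRET = b"example-secret"  # hypothetical; would be shared between MediaWiki and the CDN


def _signature(path: str, source: str, timestamp: int, secret: bytes) -> str:
    # Assumed message layout: URL path, source and timestamp joined with "|".
    message = f"{path}|{source}|{timestamp}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(url: str, source: str, timestamp: int, secret: bytes = SECRET) -> str:
    """Append provenance parameters and an HMAC signature to a URL."""
    parts = urlsplit(url)
    sig = _signature(parts.path, source, timestamp, secret)
    query = urlencode({"src": source, "ts": timestamp, "sig": sig})
    separator = "&" if parts.query else "?"
    return f"{url}{separator}{query}"


def verify_url(url: str, secret: bytes = SECRET) -> bool:
    """Recompute the signature the way an edge check might, and compare."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query))
    try:
        expected = _signature(parts.path, params["src"], int(params["ts"]), secret)
        provided = params["sig"]
    except (KeyError, ValueError):
        return False
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

Because the timestamp is part of the signed message, a verifier could also compare it against the current time to judge freshness; that policy is not shown here.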
Acceptance criteria
Generated image URLs include provenance query parameters
Generated image URLs include an HMAC signature
Signature contents and HMAC algorithm agreed with SRE
SRE can configure the CDN based on the source that generated an image URL
SRE can configure the CDN based on the freshness of an image URL
Status updates
4 Feb 2026, T414338#11584348
13 Mar 2026, T414338#11804201
9 Apr 2026, T414338#11804224
Details
Related Changes in Gerrit:
Subject | Repo | Branch | Lines +/-
mmv.bootstrap: Avoid double download when thumb is unscaled original | mediawiki/extensions/MultimediaViewer | master | +12 -8
Enable wgTrackMediaRequestProvenance on Commons | operations/mediawiki-config | master | +0 -1
Enable wgTrackMediaRequestProvenance on remaining Wikipedias | operations/mediawiki-config | master | +0 -28
Enable wgTrackMediaRequestProvenance on wikidata.org | operations/mediawiki-config | master | +0 -1
Enable wgTrackMediaRequestProvenance on most group1 wikis | operations/mediawiki-config | master | +24 -2
Enable $wgTrackMediaRequestProvenance on group0 wikis | operations/mediawiki-config | master | +1 -0
Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster | operations/mediawiki-config | master | +12 -0
Media: Add provenance parameters to thumbnail and media file URLs | mediawiki/core | master | +114 -5
FileRepo: Rename cache-busting param to '_' on file description pages | mediawiki/core | master | +1 -1
Media: Generalize cache-busting query params on file description pages | mediawiki/core | master | +27 -52
Related Objects
Task Graph
Mentions
Status | Subtype | Assigned | Task
Open | - | matmarex | T414338 FY25-26 WE5.4.12: Identify the provenance of image requests
Resolved | BUG REPORT | matmarex | T419458 Media dialog in VisualEditor shows odd UTM param strings where file type should be
Open | BUG REPORT | Krinkle | T422586 MediaViewer downloads high-res image twice if original is a medium-size JPEG
Open | - | Krinkle | T424082 MediaViewer preview sometimes lacks provenance parameters
Mentioned In
T424082: MediaViewer preview sometimes lacks provenance parameters
T422586: MediaViewer downloads high-res image twice if original is a medium-size JPEG
T418957: Add client-side logging for non-MediaWiki action API errors (HTTP 429)
T419921: TypeError: MediaWiki\Extension\OAuth\ResourceServer::getUser(): Return value must be of type MediaWiki\User\User, false returned
T417278: Choosing client credentials grant for OAuth 2 results in an access token (JWT) with the 'sub' field empty
T419135: Gadget-Stockphoto.js on Commons uses non-common thumbnail sizes, leading to a HTTP 429
T419458: Media dialog in VisualEditor shows odd UTM param strings where file type should be
T246054: Consider dropping the '1.5x' size logos from srcsets
T414805: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only
T417309: mw.util.parseImageUrl() returns invalid thumb URLs for images where original size is under requested width
T414337: Identify requests for media files from logged-in users
Mentioned Here
T419135: Gadget-Stockphoto.js on Commons uses non-common thumbnail sizes, leading to a HTTP 429
T422586: MediaViewer downloads high-res image twice if original is a medium-size JPEG
T419458: Media dialog in VisualEditor shows odd UTM param strings where file type should be
T417278: Choosing client credentials grant for OAuth 2 results in an access token (JWT) with the 'sub' field empty
T418957: Add client-side logging for non-MediaWiki action API errors (HTTP 429)
T419921: TypeError: MediaWiki\Extension\OAuth\ResourceServer::getUser(): Return value must be of type MediaWiki\User\User, false returned
T402792: Consider rate limiting non-standard thumbnail sizes
T414337: Identify requests for media files from logged-in users
Event Timeline
JTweed-WMF
created this task.
Jan 12 2026, 1:21 PM
2026-01-12 13:21:05 (UTC+0)
Restricted Application
added a subscriber:
Aklapper
Jan 12 2026, 1:21 PM
2026-01-12 13:21:06 (UTC+0)
JTweed-WMF moved this task from Essential Work to Next on the MediaWiki-Platform-Team (Q3 Kanban Board) board.
Jan 12 2026, 2:18 PM
2026-01-12 14:18:25 (UTC+0)
Tgr
subscribed.
Jan 16 2026, 10:48 AM
2026-01-16 10:48:33 (UTC+0)
Needs an exact definition of the provenance parameters. (UTC day + web / api; @Joe said somewhere else that we should just avoid doing this for dumps so they get grouped with other unknown-provenance requests. If we do need provenance info for dumps, not sure we are the right team for that.) Note that there are implications to client-side caching and thus site performance.
I can think of three strictness levels for the signature:
Include the exact URL (without query parameters, presumably). Any change to the URL (e.g. resizing, language change for SVG) would require a new API request. Seems to me like a pointless hoop to make clients jump through.
Include the filename. Clients need to obtain a legitimate URL but then can manipulate details like size (e.g. currently MediaViewer does this to improve performance). If you wanted to fetch a large batch of files, you'd need to make a bunch of API requests, so this would still generate a bunch of additional API traffic (OTOH it would presumably make image traffic easier to police since we have more controls for API traffic). There are some nuances to how to implement this as the canonical filename and the filename in the URL (which is what the edge has easy access to) differ in some obscure ways.
Do not include anything other than the provenance parameters (i.e. once you have the signature you can reuse the provenance info for other images). Makes things simpler for clients and reduces API traffic but easier to circumvent. Still enough to identify scrapers unless they are actively malicious and have Commons-specific logic to avoid detection.
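The three strictness levels differ only in what goes into the signed message. A rough sketch, assuming HMAC-SHA256, a "|" separator, a hypothetical secret, and a naive way of extracting the filename from the URL path (as noted above, the canonical filename and the one in the URL can differ in obscure ways that this does not handle):

```python
import hashlib
import hmac
from urllib.parse import urlsplit

SECRET = b"example-secret"  # hypothetical


def filename_from_path(path: str) -> str:
    # Naive heuristic: in thumbnail paths the file name is the second-to-last
    # segment (".../thumb/a/ab/Blah.jpg/250px-Blah.jpg"); otherwise the last.
    segments = path.split("/")
    if "thumb" in segments[:-1]:
        return segments[-2]
    return segments[-1]


def signed_message(level: str, url: str, provenance: str) -> str:
    """Return the string that would be signed at each strictness level."""
    path = urlsplit(url).path
    if level == "url":  # exact URL, without query parameters
        return f"{path}|{provenance}"
    if level == "filename":  # file name only; size etc. can vary
        return f"{filename_from_path(path)}|{provenance}"
    if level == "provenance":  # provenance parameters only
        return provenance
    raise ValueError(f"unknown level: {level}")


def sign(level: str, url: str, provenance: str, secret: bytes = SECRET) -> str:
    message = signed_message(level, url, provenance).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()
```

At the "filename" level a signature obtained for an original also validates for a thumbnail of the same file; at the "provenance" level it validates for any file.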
We'd need to decide whether to generate the signed URL for images inside page content during parsing or on the fly. Again, affects client-side caching.
There are two ways to include provenance in the URL: upload.wikimedia.org/.../Blah.jpg?prov=... and upload.wikimedia.org/.../prov=.../Blah.jpg (or similar; i.e. query part or path part). Including it in the query is a bit more likely to break clients / callers which expect a URL with no query part (and do something like $query . '?foo=bar'). Including it in the path will probably need adjustments to MediaWiki's own URL parsing, and probably some Apache / other traffic stuff. The latter seems worse.
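To illustrate the breakage risk with query-part provenance: a caller that blindly appends '?foo=bar' produces a URL with two question marks once the base URL already carries a query. A small sketch of a safer append using Python's standard library (the function name is made up for this example):

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url: str, extra: dict) -> str:
    """Append parameters to a URL whether or not it already has a query part."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    params.extend(extra.items())
    return urlunsplit(parts._replace(query=urlencode(params)))


base = "https://upload.wikimedia.org/x/Blah.jpg?prov=web"
naive = base + "?foo=bar"  # two "?" characters: foo ends up inside the value of prov
safe = add_query_params(base, {"foo": "bar"})
```

Callers written like the naive line are the ones an audit would need to find.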
Probably can be implemented in File::getUrl(), File::getThumbUrl(), File::getArchiveUrl(), File::getArchiveThumbUrl(), which pretty much everything passes through? We'd have to audit callers and see whether they break. Also, only hacky ways to differentiate web/API at that point.
Tgr
mentioned this in
T414337: Identify requests for media files from logged-in users
Jan 16 2026, 12:17 PM
2026-01-16 12:17:54 (UTC+0)
OWresch-WMF
added a project:
Epic
Jan 21 2026, 1:33 PM
2026-01-21 13:33:43 (UTC+0)
matmarex
claimed this task.
Jan 29 2026, 4:11 PM
2026-01-29 16:11:38 (UTC+0)
OWresch-WMF moved this task from Next to Epic in Progress on the MediaWiki-Platform-Team (Q3 Kanban Board) board.
Feb 2 2026, 3:29 PM
2026-02-02 15:29:08 (UTC+0)
Krinkle
subscribed.
Edited
Feb 4 2026, 5:23 PM
2026-02-04 17:23:37 (UTC+0)
My notes from the meeting between Jonathan, MW devs, and SRE (Valentin, Chris, Giuseppe):
We believe the highest-volume scrapers/bots that hit upload.wikimedia.org aren't scraping Wikipedia or Commons through wiki pages that contain images. And they (generally) do not manipulate the URL for other widths or file names. Instead, they tend to request a list of images from the Action API (presumably on Commons) and then download the original files directly.
When we're under heavy load in terms of intra-DC bandwidth (i.e. due to cache misses on media URLs), we currently try to protect/prioritize some traffic above others based on various heuristics. The heuristics we have are fairly crude and lack context. We can differentiate real browsers from bots with some level of confidence today, but this is not enough by itself because non-abusive bots/tools/apps also download and re-use images and are not browsers. For our main traffic ("text" pageviews and API calls) we have:
the x-trusted-request framework to exempt logged-in users/bots (regardless of the client being a browser),
a stable Referer header, because first-party API calls and subsequent pageviews are same-origin and thus have a reliable Referer header. This is in contrast to cross-origin requests, where privacy settings in browsers are much more likely to strip it entirely. The default Referrer-Policy of origin-when-cross-origin is fine, but there are enough legitimate users with stricter settings that this is not a strong signal for upload.wikimedia.org.
To the extent we have it, we already use Referer as a signal, but it's not good enough, especially for originals, and is trivial to bypass.
Session cookies. Unlike on wiki domains, we don't store session cookies on upload.wikimedia.org. The request for that is T414337: Identify requests for media files from logged-in users.
Edge uniques to start rate limits low for a fresh/cookieless client (and grow as a client gains reputation, thus making it less effective to use many IPs). Unlike on wiki domains, browsers limit the retention of upload.wikimedia.org cookies (treated as third-party cookies), and fragment them between wikis (not shared), which means when a browser that has high reputation on wikipedia.org visits wikinews.org, it starts fresh, including for upload.wikimedia.org requests from there.
From Turnilo, text breaks down as 8+ (40%), 0 days (29%), 1-7 (29%); upload breaks down as 8+ (42%), 0 days (40%), 1-7 (17%). A lot more 0-day traffic in the default/baseline scenario.
Signed URLs could let us detect URL tampering, as the hash would contain e.g. filename and width in it, so if you modified those the hash would no longer match. Today, clients can request uncached widths that aren't used on-wiki. However, T402792: Consider rate limiting non-standard thumbnail sizes is in progress and will already let us throttle this to a very low rate (i.e. we can satisfy legacy hotlinks, which should be static, easy to cache, and different in behavior from abusive patterns like enumerating all file names through a botnet, or requesting arbitrary widths). Alternatively, we could even redirect to a standard width, after we address first-party edge cases with CSS tracked at T402792.
Signed URLs could let us distinguish API calls from first-party embeds, if we include a bit for this in the hash. This would be akin to a more reliable Referer header: something like ?ref=wikipedia.org should work regardless of Referrer-Policy and also reliably protect use outside web browsers where Referer headers don't exist, e.g. a logged-in account is using a bot or tool to interact with page HTML.
We have lots of first-party API calls that will make this harder. Such as: Mediaviewer, Images in search suggestions (OpenSearch API), Images in popups (Page Summary API via PCS), Images shown in mobile apps (Action API)
Timo asked Google about the potential impact of signed URLs on Google Images (recurring Google/Wikimedia partnership meeting, public notes), and they shared the same concern as I had originally, which is that this would likely lead to duplication of results and dilute ranking. And while this may be an emerging trend on other sites (e.g. image attachments on GitHub issues, Facebook post images, etc.), they generally do this behind a login wall, or they opt out Googlebot. Even then the URLs would still emerge elsewhere via the API and on other sites, which is exactly what we rely on for ranking. It seems to me it would also affect keywords, since images rely heavily on host pages for relevant keywords. There is no equivalent of <link rel="canonical"> for image files. Google suggested we go with a query string (no path segment) and specifically a UTM parameter, which for better or worse, are de facto stripped for purposes of determining the canonical URL.
MediaWiki by default does not generate arbitrary thumbnails on-demand. Only thumbnails used in articles, at widths used, are generated on disk. Anything else is a 404. To reliably get a thumbnail URL, it must come from a page, or from an API that creates it (and thus has a natural place to detect abuse, with APIs falling under "text" where the above mitigations work already). The 404-handler approach is specifically what we use in WMF production as an optimization. It might be worth long-term reconsidering this. This is effectively the same as what we're trying to do with signed URLs, but without requiring any complexity in URL variants or tracking. The files just wouldn't exist. This isn't trivial, as we'll need to think about async generation, and what we serve in the interim, especially for multi-page files or multilingual files where there isn't an obvious fallback that we serve while a thumbnail is being generated.
As a minimum starting point from which we can learn how well it works, and what more we might need, we will (@Krinkle, @matmarex):
Add a basic provenance parameter to first-party traffic to original files. That is, we'll append something like ?ref=wikipedia.org to non-thumb media URLs served on-site, and leave the rest as-is. No modification, breaking down, or tracking of external traffic. This is where we expect the biggest gain, because originals are meant to be rare, except where an image is so small that the original serves as the thumbnail, which this parameter will let us distinguish.
Scope: skinned pageviews, logged-in API calls, mw-api-int calls.
Open questions that we'll try to answer after this:
What about API calls? We might not need it if organic traffic is naturally cache-hit and then use the lack of provenance on cache-miss. Revisit based on MVP.
What about video files? Treat transcode as original or as thumb? In video player? Does it use the API?
What about download buttons? I.e. the implicit "original" link on the File description page, and the download link in Mediaviewer. They might be rare enough and fine as-is, or we may need to tag them.
Tgr
added a comment.
Feb 4 2026, 6:17 PM
2026-02-04 18:17:06 (UTC+0)
In T414338#11584348, @Krinkle wrote:
There is no equivalent of <link rel="canonical"> for image files.
FWIW there is, the Link header, but I doubt anything understands it.
Mr._Starfleet_Command
subscribed.
Feb 5 2026, 10:57 PM
2026-02-05 22:57:38 (UTC+0)
Krinkle
added a comment.
Feb 6 2026, 3:51 PM
2026-02-06 15:51:42 (UTC+0)
In T414338#11584551, @Tgr wrote:
FWIW there is, the Link header, but I doubt anything understands it.
I think that might work actually.
I was familiar with it, and the header is part of an HTTP standard (RFC 8288), but the individual relationships are not standardised at the HTTP level. For example, rel=canonical is standardised in the WHATWG HTML spec (both the tag and header). There is no standard for rel=canonical on other resources, so I didn't consider it.
Google has support for the Link header, but I don't think we should adopt it for HTML because fewer crawlers will support it and it's less portable (headers are easily lost once the HTML is saved or passed through application layers).
Google Web Search does index formats beyond HTML, such as PDF and DOCX resources. Their support page actually demonstrates the Link header on a DOCX resource, pointing to a PDF (despite being non-standard). This is still within Web Search, though, so it might be limited to Googlebot (vs Googlebot-Image for Google Images Search). I will ask them in the next partnership meeting.
If this works, it might also solve the thumbnail-size problem where Google Images often points to small thumbnails (due to Wikipedia articles embedding those) instead of the medium-size previews from the Commons file description page, creating a poor experience (I can't reproduce this in Google or Bing today, maybe something solved this, e.g. JSON-LD for Commons). The Link-header could nudge catalogs like Google Images to a higher resolution (e.g. 1024px instead of 250px) akin to what we present in Mediaviewer and on file description pages.
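For reference, a Link header carrying rel="canonical" uses the RFC 8288 syntax of a target URL in angle brackets followed by parameters. A small sketch that derives such a header value by dropping the query string from a media URL; treating the query-less URL as the canonical one is an assumption for illustration, and whether image crawlers honour the header is exactly the open question above:

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_link_header(url: str) -> str:
    """Build a Link header value pointing at the URL without query or fragment."""
    parts = urlsplit(url)
    canonical = urlunsplit(parts._replace(query="", fragment=""))
    return f'<{canonical}>; rel="canonical"'
```

A response for a URL with provenance parameters would then carry the header as "Link: " followed by this value.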
Krinkle
mentioned this in
T417309: mw.util.parseImageUrl() returns invalid thumb URLs for images where original size is under requested width
Feb 12 2026, 6:51 PM
2026-02-12 18:51:12 (UTC+0)
Krinkle
mentioned this in
T414805: FY 25/26 WE 5.4.10 Standard Thumbnail Sizes Only
Feb 12 2026, 7:09 PM
2026-02-12 19:09:53 (UTC+0)
gerritbot
added a comment.
Feb 14 2026, 6:42 AM
2026-02-14 06:42:21 (UTC+0)
Change #1239464 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):
[mediawiki/core@master] [WIP] Experiment with provenance tracking parameters for media files
gerritbot
added a project:
Patch-For-Review
Feb 14 2026, 6:42 AM
2026-02-14 06:42:22 (UTC+0)
matmarex
added a comment.
Feb 14 2026, 6:43 AM
2026-02-14 06:43:55 (UTC+0)
I started experimenting with this, see patch above.
I opted to put the extra parameters in a fairly low-level place that should be used by anything that needs URLs to either original or transformed files (thumbnails). I tried putting in parameters to distinguish what requested the thumbnail (by the entry point, index vs api vs cli etc. – more detailed identification would need more work), the requesting wiki, thumbnail vs original, and a long signature parameter (in this case a copy of all the data as a JWT, but we should put more thought into this if we decide we need to sign the params in production).
I tested various parts of MediaWiki and extensions and they all seem to cope with it with no problems, and use the extra parameters for their image loading – including thumbnails in articles (under old parser and Parsoid), galleries in articles (including mode=slideshow), the file description page, the Popups extension (page previews), the MultimediaViewer extension (for views, and the generated embed and download links), search suggestions (on new Vector), search results (with $wgThumbnailNamespaces), Special:Redirect/file. I only needed one small patch for non-srcset thumbnails in articles to use it. mw.util.parseImageUrl() is used by some of this code; it's a bit dodgy in how it parses URLs, but it does work. TimedMediaHandler includes the parameters for originals too (not for video thumbnails – this would need patching elsewhere), PdfHandler includes them for thumbnails.
Additional testing is needed for foreign repos (the setup we have in production with Commons) – I'll need to work on my local setup for that, or test on the beta cluster. (We should also test how an older MediaWiki version using Commons as a foreign repo would handle this – I expect it'll work just fine as well, but I didn't test yet.)
If this approach makes sense, then I'll pare down the experimental code to only do the bare minimum we need (if I understand correctly: only add a param for the requesting wiki, only for originals, and only for non-api requests) and put it behind some config option.
gerritbot
added a comment.
Feb 17 2026, 2:54 AM
2026-02-17 02:54:43 (UTC+0)
Change #1239805 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):
[mediawiki/core@master] Generalize cache-busting query params on file description pages
gerritbot
added a comment.
Feb 17 2026, 2:54 AM
2026-02-17 02:54:56 (UTC+0)
Change #1239807 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):
[mediawiki/core@master] Use consistent name for cache-busting param on file description pages
gerritbot
added a comment.
Feb 17 2026, 8:42 PM
2026-02-17 20:42:40 (UTC+0)
Change #1239805 merged by jenkins-bot:
[mediawiki/core@master] Media: Generalize cache-busting query params on file description pages
ReleaseTaggerBot
added a project:
MW-1.46-notes (1.46.0-wmf.17; 2026-02-24)
Feb 17 2026, 9:00 PM
2026-02-17 21:00:48 (UTC+0)
CDanis
subscribed.
Feb 19 2026, 1:54 PM
2026-02-19 13:54:29 (UTC+0)
Krinkle renamed this task from "Identify the provenance of image requests" to "FY25-26 WE5.4.12: Identify the provenance of image requests".
Feb 19 2026, 7:35 PM
2026-02-19 19:35:00 (UTC+0)
Krinkle
updated the task description.
gerritbot
added a comment.
Feb 19 2026, 8:45 PM
2026-02-19 20:45:49 (UTC+0)
Change #1239807 merged by jenkins-bot:
[mediawiki/core@master] FileRepo: Rename cache-busting param to '_' on file description pages
Krinkle
mentioned this in
T246054: Consider dropping the '1.5x' size logos from srcsets
Feb 25 2026, 1:00 AM
2026-02-25 01:00:47 (UTC+0)
matmarex
added a comment.
Feb 25 2026, 10:50 PM
2026-02-25 22:50:45 (UTC+0)
@Joe @CDanis I heard you're the people to talk to about the desired data and format of these query parameters.
Currently, the proposed patch includes the following data:
Site which is requesting the image, e.g. 'www.mediawiki.org'
Generator (the software component involved), e.g. 'parser' or 'imageinfo'. Entry point is used as fallback if not specified, e.g. 'index', 'api', 'rest'
Format of the requested image, 'original', 'thumbnail' or 'thumbnail_unscaled'
The format is UTM parameters (respectively utm_source, utm_campaign and utm_content, in this order), on the assumption that they'll be stripped by search engines etc.
Example:
Your thoughts on that would be appreciated. I also have two questions:
Do we need to sign these parameters so that they can't be spoofed, or do we start by assuming everyone will play nice? If yes, what format would be convenient? Can we just stick the data in a JWT instead of having separate parameters?
If we're considering switching to a JWT, would it be more convenient to start with a single JSON parameter instead of separate parameters? (I mean something like
matmarex
added a project:
MediaWiki-File-management
Mar 6 2026, 8:46 PM
2026-03-06 20:46:12 (UTC+0)
Maintenance_bot
added a project:
Commons
Mar 6 2026, 9:30 PM
2026-03-06 21:30:38 (UTC+0)
Krinkle
mentioned this in
T419458: Media dialog in VisualEditor shows odd UTM param strings where file type should be
Mar 9 2026, 6:21 PM
2026-03-09 18:21:49 (UTC+0)
Krinkle
added a subtask:
T419458: Media dialog in VisualEditor shows odd UTM param strings where file type should be
gerritbot
added a comment.
Mar 9 2026, 6:56 PM
2026-03-09 18:56:47 (UTC+0)
Change #1239464 merged by jenkins-bot:
[mediawiki/core@master] Media: Add provenance parameters to thumbnail and media file URLs
ReleaseTaggerBot edited projects, added MW-1.46-notes (1.46.0-wmf.19; 2026-03-10); removed MW-1.46-notes (1.46.0-wmf.17; 2026-02-24).
Mar 9 2026, 7:00 PM
2026-03-09 19:00:27 (UTC+0)
Maintenance_bot
removed a project:
Patch-For-Review
Mar 9 2026, 7:32 PM
2026-03-09 19:32:34 (UTC+0)
matmarex closed subtask T419458: Media dialog in VisualEditor shows odd UTM param strings where file type should be as Resolved.
Mar 13 2026, 2:51 PM
2026-03-13 14:51:27 (UTC+0)
Krinkle
mentioned this in
T419135: Gadget-Stockphoto.js on Commons uses non-common thumbnail sizes, leading to a HTTP 429
Mar 14 2026, 10:11 AM
2026-03-14 10:11:09 (UTC+0)
Joe
added a comment.
Mar 16 2026, 8:30 AM
2026-03-16 08:30:31 (UTC+0)
In T414338#11652561, @matmarex wrote:
@Joe @CDanis I heard you're the people to talk to about the desired data and format of these query parameters.
Currently, the proposed patch includes the following data:
Site which is requesting the image, e.g. 'www.mediawiki.org'
Generator (the software component involved), e.g. 'parser' or 'imageinfo'. Entry point is used as fallback if not specified, e.g. 'index', 'api', 'rest'
Format of the requested image, 'original', 'thumbnail' or 'thumbnail_unscaled'
The format is UTM parameters (respectively utm_source, utm_campaign and utm_content, in this order), on the assumption that they'll be stripped by search engines etc.
Example:
Your thoughts on that would be appreciated. I also have two questions:
Do we need to sign these parameters so that they can't be spoofed, or do we start by assuming everyone will play nice? If yes, what format would be convenient? Can we just stick the data in a JWT instead of having separate parameters?
If we're considering switching to a JWT, would it be more convenient to start with a single JSON parameter instead of separate parameters? (I mean something like
Sorry it's been a few weeks of intense work on other stuff. The proposed format is good as far as I'm concerned, as a first step.
I think adding a signature is useful. It would be enough to have a simple signature, like a SHA1 of the other parameters as follows:
$SECRET;site=mediawiki.localhost;generator=parser;format=thumbnail
which we can add in utm_term (again abusing the term). I would go with a simple sha1 instead of using hmac because the risk of compromise is pretty low.
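A sketch of how the proposal above could look in Python: the three provenance values go into utm_source, utm_campaign and utm_content, and a SHA1 over the secret-prefixed string goes into utm_term. The function names and the example secret are hypothetical, and the exact payload string is taken from the example above rather than from any deployed code.

```python
import hashlib
import hmac


def provenance_params(secret: str, site: str, generator: str, fmt: str) -> dict:
    """Build the UTM parameters plus a SHA1 signature of the secret-prefixed payload."""
    payload = f"{secret};site={site};generator={generator};format={fmt}"
    signature = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    return {
        "utm_source": site,         # requesting site
        "utm_campaign": generator,  # software component or entry point
        "utm_content": fmt,         # original / thumbnail / thumbnail_unscaled
        "utm_term": signature,
    }


def is_valid(secret: str, params: dict) -> bool:
    """Recompute the signature from the received parameters and compare."""
    try:
        expected = provenance_params(
            secret, params["utm_source"], params["utm_campaign"], params["utm_content"]
        )["utm_term"]
        provided = params["utm_term"]
    except KeyError:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), provided.encode("utf-8"))
```

Switching to the HMAC named in the task description would only change the signature line, for example to hmac.new(secret.encode(), payload_without_secret.encode(), hashlib.sha1).hexdigest(), where payload_without_secret is the same string minus the secret prefix.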
gerritbot
added a comment.
Mar 16 2026, 7:29 PM
2026-03-16 19:29:59 (UTC+0)
Change #1253625 had a related patch set uploaded (by Bartosz Dziewoński; author: Bartosz Dziewoński):
[operations/mediawiki-config@master] Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster
gerritbot
added a project:
Patch-For-Review
Mar 16 2026, 7:30 PM
2026-03-16 19:30:00 (UTC+0)
gerritbot
added a comment.
Mar 16 2026, 8:54 PM
2026-03-16 20:54:35 (UTC+0)
Change #1253625 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster
Stashbot
added a comment.
Mar 16 2026, 8:57 PM
2026-03-16 20:57:27 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-03-16T20:57:26Z] Started scap sync-world: Backport for [[gerrit:1253623|Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625|Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626|Configure $wgApiClientErrorSampleRate (T418957)]]
Stashbot
mentioned this in
T417278: Choosing client credentials grant for OAuth 2 results in an access token (JWT) with the 'sub' field empty
Mar 16 2026, 8:57 PM
2026-03-16 20:57:30 (UTC+0)
Stashbot
mentioned this in
T419921: TypeError: MediaWiki\Extension\OAuth\ResourceServer::getUser(): Return value must be of type MediaWiki\User\User, false returned
Stashbot
mentioned this in
T418957: Add client-side logging for non-MediaWiki action API errors (HTTP 429)
Mentioned in SAL (#wikimedia-operations)
[2026-03-16T20:59:17Z] matmarex, catrope: Backport for [[gerrit:1253623|Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625|Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626|Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see ). Changes can now be verified there.
Stashbot
added a comment.
Mar 16 2026, 9:05 PM
2026-03-16 21:05:38 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-03-16T21:05:37Z] Finished scap sync-world: Backport for [[gerrit:1253623|Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625|Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626|Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
Maintenance_bot
removed a project:
Patch-For-Review
Mar 16 2026, 9:31 PM
2026-03-16 21:31:21 (UTC+0)
gerritbot
added a comment.
Mar 24 2026, 4:35 PM
2026-03-24 16:35:13 (UTC+0)
Change #1260029 had a related patch set uploaded (by Krinkle; author: Krinkle):
[operations/mediawiki-config@master] Enable $wgTrackMediaRequestProvenance on group0 wikis
gerritbot
added a project:
Patch-For-Review
Mar 24 2026, 4:35 PM
2026-03-24 16:35:14 (UTC+0)
gerritbot
added a comment.
Tue, Mar 31, 11:10 PM
2026-03-31 23:10:19 (UTC+0)
Change #1260029 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable $wgTrackMediaRequestProvenance on group0 wikis
Stashbot
added a comment.
Tue, Mar 31, 11:10 PM
2026-03-31 23:10:47 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-03-31T23:10:45Z] Started scap sync-world: Backport for [[gerrit:1260029|Enable $wgTrackMediaRequestProvenance on group0 wikis (T414338)]]
Stashbot
added a comment.
Tue, Mar 31, 11:12 PM
2026-03-31 23:12:46 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-03-31T23:12:45Z] krinkle: Backport for [[gerrit:1260029|Enable $wgTrackMediaRequestProvenance on group0 wikis (T414338)]] synced to the testservers (see ). Changes can now be verified there.
Maintenance_bot
removed a project:
Patch-For-Review
Tue, Mar 31, 11:31 PM
2026-03-31 23:31:25 (UTC+0)
Stashbot
added a comment.
Tue, Mar 31, 11:51 PM
2026-03-31 23:51:07 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-03-31T23:51:06Z] Finished scap sync-world: Backport for [[gerrit:1260029|Enable $wgTrackMediaRequestProvenance on group0 wikis (T414338)]] (duration: 40m 21s)
gerritbot
added a comment.
Fri, Apr 3, 3:50 AM
2026-04-03 03:50:11 (UTC+0)
Change #1267437 had a related patch set uploaded (by Krinkle; author: Krinkle):
[operations/mediawiki-config@master] Enable wgTrackMediaRequestProvenance on most group1 wikis
gerritbot
added a project:
Patch-For-Review
Fri, Apr 3, 3:50 AM
2026-04-03 03:50:12 (UTC+0)
gerritbot
added a comment.
Wed, Apr 8, 7:36 AM
2026-04-08 07:36:04 (UTC+0)
Change #1267437 merged by jenkins-bot:
[operations/mediawiki-config@master] Enable wgTrackMediaRequestProvenance on most group1 wikis
Stashbot
added a comment.
Wed, Apr 8, 7:36 AM
2026-04-08 07:36:30 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-04-08T07:36:29Z] Started scap sync-world: Backport for [[gerrit:1267437|Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338)]]
Stashbot
added a comment.
Wed, Apr 8, 7:38 AM
2026-04-08 07:38:18 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-04-08T07:38:18Z] krinkle: Backport for [[gerrit:1267437|Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338)]] synced to the testservers (see ). Changes can now be verified there.
Stashbot
added a comment.
Wed, Apr 8, 7:46 AM
2026-04-08 07:46:05 (UTC+0)
Mentioned in SAL (#wikimedia-operations)
[2026-04-08T07:46:04Z] Finished scap sync-world: Backport for [[gerrit:1267437|Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338)]] (duration: 09m 34s)
Krinkle
mentioned this in
T422586: MediaViewer downloads high-res image twice if original is a medium-size JPEG
Wed, Apr 8, 7:57 AM
2026-04-08 07:57:21 (UTC+0)
Maintenance_bot
removed a project:
Patch-For-Review
Wed, Apr 8, 8:32 AM
2026-04-08 08:32:15 (UTC+0)
gerritbot
added a comment.
Thu, Apr 9, 12:43 PM
2026-04-09 12:43:01 (UTC+0)
Change #1269440 had a related patch set uploaded (by Krinkle; author: Krinkle):
[operations/mediawiki-config@master] Enable wgTrackMediaRequestProvenance on wikidata.org
gerritbot
added a project:
Patch-For-Review
Thu, Apr 9, 12:43 PM
2026-04-09 12:43:02 (UTC+0)
Change #1269441 had a related patch set uploaded (by Krinkle; author: Krinkle):
[operations/mediawiki-config@master] Enable wgTrackMediaRequestProvenance on Commons
gerritbot
added a comment.
Thu, Apr 9, 12:43 PM
2026-04-09 12:43:06 (UTC+0)
Change #1269442 had a related patch set uploaded (by Krinkle; author: Krinkle):
[operations/mediawiki-config@master] Enable wgTrackMediaRequestProvenance on remaining Wikipedias
Krinkle
added a comment.
Thu, Apr 9, 12:58 PM
2026-04-09 12:58:55 (UTC+0)
Progress update (2-6 Mar, 9-13 Mar; copied here from Asana for transparency):
Investigate and fix broken thumbnails on officewiki (Timo investigated and found missing thumbnail steps on private wikis, Amir enabled this).
Test and merge trial implementation of media provenance URLs in MediaWiki core behind a feature flag (developed by Bartosz and Timo). T414338
Refactor logic in FileRepo and Media classes in MediaWiki core to reduce duplication and make adding provenance URLs simpler and more reliable. T414338
Find and fix VisualEditor would-be-bug where media type breaks due to accidental reliance on URLs having no query string. T419458
Enable media provenance feature in Beta Cluster and on testwikis in production. T414338
Krinkle
added a comment.
Thu, Apr 9, 1:00 PM
2026-04-09 13:00:13 (UTC+0)
Progress update (9 Apr 2026):
Enable media provenance on 573 additional wikis (including all Wiktionary and Wikivoyage wikis, and 18 Wikipedias). We are now live on 720/1068 wikis. T414338
Found regression in MediaViewer causing double downloads. T422586
Prepare Stockphoto gadget on Commons ahead of rollout to prevent regression. T419135
Next steps
Deploy media provenance feature to Wikidata, Commons, and 346 remaining Wikipedias.
Krinkle
updated the task description.
Thu, Apr 9, 1:00 PM
2026-04-09 13:00:29 (UTC+0)
BCornwall
subscribed.
Mon, Apr 13, 4:17 PM
2026-04-13 16:17:19 (UTC+0)
Krinkle
added a subtask:
T422586: MediaViewer downloads high-res image twice if original is a medium-size JPEG
Mon, Apr 20, 5:55 PM
2026-04-20 17:55:01 (UTC+0)
Krinkle
created subtask
T424082: MediaViewer preview sometimes lacks provenance parameters
Tue, Apr 21, 7:10 PM
2026-04-21 19:10:59 (UTC+0)
Krinkle
mentioned this in
T424082: MediaViewer preview sometimes lacks provenance parameters
gerritbot
added a comment.
Wed, Apr 22, 12:06 AM
2026-04-22 00:06:23 (UTC+0)
Change #1276086 had a related patch set uploaded (by Krinkle; author: Krinkle):
[mediawiki/extensions/MultimediaViewer@master] mmv.bootstrap: Avoid double download when thumb is unscaled original
Content licensed under Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 unless otherwise noted; code licensed under GNU General Public License (GPL) 2.0 or later and other open source licenses. By using this site, you agree to the Terms of Use, Privacy Policy, and Code of Conduct.