Wikimedia Downloads: Analytics
Wikimedia Downloads: Analytics Datasets
Data compiled by community and staff, from projects hosted by the Wikimedia Foundation.
Pageviews: statistics compiled using the current Pageview Definition. Available as:
Pageview complete: Our best effort to provide a comprehensive time series of per-article pageview data for Wikimedia projects. Data spans from December 2007 to the present, with a uniform format and compression.
Pageview/projectview data filtered to what we believe is only human traffic. Available since May 2015.
Pageview/projectview data, highly compressed and corrected for outages. This dataset was historically computed using the best source available at the time:
Dec 2015 - now: compressing and correcting the pageviews dataset
2007 - Dec 2015: compressing and correcting the pagecounts-raw dataset
Also known as "pagecounts-ez", maintained by Erik Zachte. More details at the link.
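As a rough illustration of working with the hourly pageview files, the sketch below parses individual lines in Python. It assumes each line holds four space-separated fields (domain code, page title, view count, response size); that layout is an assumption not stated on this page, and the other Pageviews datasets listed above may use a different format. The names `PageviewRecord` and `parse_pageview_line` are hypothetical.

```python
from typing import NamedTuple, Optional


class PageviewRecord(NamedTuple):
    domain_code: str  # e.g. "en" for English Wikipedia (assumed convention)
    page_title: str
    views: int


def parse_pageview_line(line: str) -> Optional[PageviewRecord]:
    """Parse one line, assuming 'domain_code page_title views response_size'."""
    parts = line.rstrip("\n").split(" ")
    if len(parts) != 4:
        return None  # not the assumed four-field layout
    domain_code, page_title, views, _response_size = parts
    if not views.isdigit():
        return None  # view count must be a non-negative integer
    return PageviewRecord(domain_code, page_title, int(views))


# Made-up sample lines, for illustration only.
sample_lines = ["en Main_Page 42 0\n", "de Berlin 7 0\n", "malformed line\n"]
records = [r for r in map(parse_pageview_line, sample_lines) if r is not None]
total_views = sum(r.views for r in records)
```

Lines that do not match the assumed layout are skipped rather than raising, which keeps a long file from failing on a single bad row.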
Mediacounts: statistics from all projects on media file access. Available as:
Request counts for the upload domain (pictures, movies, audio files)
Unique Devices: statistics on unique devices, computed using the uniques definition, for all individual wiki projects as well as project families (e.g. all Wikipedias). Available as:
Estimate of unique devices based on a privacy-sensitive last access cookie.
Clickstream: (referer, resource) pairs extracted from the request logs of Wikipedia. Please visit the Clickstream research page for detailed information. Available as:
Monthly generated clickstream for Wikipedia in English, Russian, German, Spanish, and Japanese
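To illustrate how (referer, resource) pairs might be used, the following Python sketch totals the referers leading to a given article. It assumes tab-separated rows with four columns (referer, resource, link type, count); this column layout is an assumption, so check the Clickstream research page for the authoritative format. The sample rows are made up and `top_referers` is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; the four-column layout is an assumption.
SAMPLE = (
    "other-search\tLondon\texternal\t1200\n"
    "England\tLondon\tlink\t300\n"
    "other-empty\tLondon\texternal\t500\n"
    "London\tRiver_Thames\tlink\t80\n"
)


def top_referers(tsv_text, resource, limit=3):
    """Return up to `limit` (referer, count) pairs for one resource, largest first."""
    counts = Counter()
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip rows that do not match the assumed layout
        referer, current, _link_type, n = row
        if current == resource:
            counts[referer] += int(n)
    return counts.most_common(limit)


print(top_referers(SAMPLE, "London"))
```

For a real monthly file, the same function could be fed the decompressed text instead of `SAMPLE`; reading line by line from an open file handle would avoid holding the whole file in memory.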
MediaWiki History: Revision history; User history, including: creation, renames, groups and blocks; Page history, including: creation, moves, deletions and restores. All since the beginning of MediaWiki-time. Available as:
Monthly generated compressed TSV files, organized per wiki and time range.
Data by Country: Data aggregated at the country level. Available as:
For a given wiki, a range of the number of editors geolocating to a specific country. Monthly TSV files.
Commons Impact Metrics: Data on how Commons media is edited, used, and accessed across Wikimedia projects. Available as:
Monthly generated TSV files.
Wikidata QRank: A ranking signal for Wikidata entities, periodically computed by aggregating pageviews on all Wikimedia projects in all languages. NOTE: This dataset is maintained by volunteers external to the Wikimedia Foundation.
Wikidata QRank
Deprecated datasets (no longer maintained or updated)
Pagecounts: simple pageview definition. Available from 2007 to 2016. Some of the data does not include counts from the mobile site and no filtering of automata is performed. Available as:
Pagecount data collected by Domas Mituzas, available from 2007 through May 2016
Pagecount/projectcount data including mobile/zero sites, available from October 2014 through May 2016
All Analytics datasets are available under the Creative Commons CC0 dedication.