Portal:Data Services - Wikitech
Please read the Wikimedia Cloud Services introduction and the Getting Started guide.
Data Services includes services that allow for direct access to databases and dumps, as well as web interfaces for querying and programmatic access to data stores.
Data services currently include: Wiki Replicas, Wikimedia Dumps, Shared Storage, CirrusSearch OpenSearch replicas, Wikimedia Enterprise, Quarry, and PAWS.
Data stores
Wiki Replicas
Wiki Replicas are MySQL/MariaDB databases that replicate, in near real time, the production MediaWiki databases of Wikimedia Foundation wikis.
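As a rough illustration of how a tool might prepare a connection to a replica, the sketch below only assembles connection parameters; it does not open a connection. The host pattern (`<wiki>.<cluster>.db.svc.wikimedia.cloud`), the `<wiki>_p` database name, and the `~/replica.my.cnf` credentials file are assumptions not stated on this page, and the function name is hypothetical. Check the Wiki Replicas documentation before relying on any of them.

```python
import configparser
from pathlib import Path


def replica_connection_params(wiki, cluster="analytics", cnf_path="~/replica.my.cnf"):
    """Build keyword arguments for a MySQL/MariaDB client library.

    All naming conventions used here are assumptions, not confirmed by this page.
    """
    params = {
        # Assumed host pattern: <wiki>.<cluster>.db.svc.wikimedia.cloud
        "host": f"{wiki}.{cluster}.db.svc.wikimedia.cloud",
        # Assumed database naming: <wiki>_p
        "database": f"{wiki}_p",
    }

    # Assumed credentials file in MySQL option-file format with a [client] section.
    cnf = Path(cnf_path).expanduser()
    if cnf.is_file():
        # Disable interpolation so a '%' in a password is read literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(cnf)
        if parser.has_section("client"):
            for key in ("user", "password"):
                if parser.has_option("client", key):
                    # Option files sometimes quote values; strip the quotes.
                    params[key] = parser.get("client", key).strip("'\"")
    return params
```

The resulting dictionary could then be passed to a MySQL/MariaDB client library of your choice, for example as `connect(**params)` where the library accepts these keyword names.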
Wikimedia Dumps
Wikimedia Dumps offers a range of data downloads, including full text dumps and other datasets.
Shared Storage
Shared storage is offered via NFS. It includes shared directories available to Cloud VPS and Toolforge users. Wikimedia Dumps are also served through Shared Storage, but are treated as a separate Data Service because of their wide use.
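Because dumps are exposed as ordinary files on the shared filesystem, a tool can browse them with standard file APIs. The sketch below is illustrative only: the mount point `/public/dumps/public` and the `<wiki>/<date>/<file>` layout are assumptions not confirmed by this page, and `list_dump_files` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed mount point of the dumps share; verify it on your own instance.
DUMPS_ROOT = "/public/dumps/public"


def list_dump_files(wiki, root=DUMPS_ROOT):
    """Return a sorted list of dump files for one wiki.

    Assumes a <root>/<wiki>/<date>/<file> layout. Returns an empty list
    when the directory does not exist, for example outside Cloud Services.
    """
    wiki_dir = Path(root) / wiki
    if not wiki_dir.is_dir():
        return []
    # Look exactly one level below the dated directories; avoid a deep
    # recursive walk, which can be slow on NFS.
    return sorted(p for p in wiki_dir.glob("*/*") if p.is_file())
```

For example, `list_dump_files("enwiki")` would return the files found under the assumed mount, or an empty list if the share is not mounted.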
CirrusSearch OpenSearch replicas
The "Cloud Elastic" servers are a replica of the CirrusSearch OpenSearch indices made available to Wikimedia Cloud Services applications (both Cloud VPS and Toolforge).
Wikimedia Enterprise
Wikimedia Enterprise APIs give high-volume, high-query-rate access to Wikimedia project data. Users of Toolforge, Cloud VPS, and PAWS can call any of the endpoints described in the Wikimedia Enterprise documentation without passing an authorization header.
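To make the "no authorization header" point concrete, the sketch below builds an HTTP request object without sending it. The base URL and the endpoint path are hypothetical placeholders, not real Wikimedia Enterprise values; substitute the ones given in the Wikimedia Enterprise documentation.

```python
import urllib.request

# Hypothetical placeholder; replace with the base URL from the
# Wikimedia Enterprise documentation.
BASE_URL = "https://enterprise.example.org/v2"


def build_enterprise_request(endpoint, base_url=BASE_URL):
    """Build a GET request for an Enterprise endpoint.

    No Authorization header is set, matching the access described above
    for Toolforge, Cloud VPS, and PAWS users.
    """
    url = f"{base_url.rstrip('/')}/{endpoint.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json"},
        method="GET",
    )
```

From a host inside Toolforge, Cloud VPS, or PAWS, the request could then be sent with `urllib.request.urlopen(build_enterprise_request("/articles/Example"))`, where `/articles/Example` again stands in for a documented endpoint.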
Web interfaces
Quarry and PAWS require a Wikimedia SUL account to log in.
Quarry
Quarry is a graphical web interface that allows users to query Wiki Replicas and ToolsDB using SQL.
PAWS
PAWS is a Jupyter notebook installation hosted by Wikimedia Cloud Services that provides Python notebooks and a terminal through a web browser. You can access Wiki Replicas, ToolsDB, and Dumps with PAWS.
See also
Data Services administrative documentation