Memcached for MediaWiki - Wikitech
This page is about MediaWiki's use of Memcached at WMF. It is not about other Memcached clusters in production, such as those for Thumbor, Wikimedia Cloud Services, and Swift.
Magic numbers
WANCache (last updated: May 2020.)
Tombstone (aka "hold-off TTL"): 11 seconds.
Interim value: 1 second.
Mcrouter
Gutter TTL: up to 10 minutes (gutter_ttl).
WANObjectCache
WANObjectCache (or WANCache) is the primary interface in MediaWiki for interacting with Memcached and mcrouter. WANCache provides a developer-friendly API that naturally follows our best practices and transparently deals with the complex requirements of operating a platform of our scale. This includes preventing cache stampedes, avoiding cache misses for hot data through probabilistic and asynchronous regeneration prior to logical expiry, avoiding network congestion, supporting multiple versions of the software to run alongside each other (and apply purges to both, whilst storing values separately), and avoiding cache pollution during long-running processes or when databases are experiencing replication lag. The WANCache interface came out of the Multi-DC MediaWiki initiative, which required us to take these constraints more seriously, though they generally are not unique to Multi-DC and also significantly improved resilience and correctness during the 2015-2021 single-DC period.
WANCache builds on top of BagOStuff, which is the lower-level key-value interface to Memcached and other storage backends.
See also:
WANObjectCache high-level documentation
WANObjectCache metrics and internal details
mw:Manual:Object_cache: what kind of data is stored here, how it gets there etc.
High level
Like a replica database. There is generally no proactive setting of values during HTTP write actions. Instead, values are computed based on information from replica DBs, and computed on-demand using the getWithSet(key, ttl, callable) idiom. This means the application generally only expects cache values to be as up to date as a replica DB would be. Historically, it was common for MediaWiki to populate its cache during HTTP write actions instead. This meant that in a single-DC setup it could loosely be expected that the cache was as up-to-date as the primary DB. As part of the multi-dc effort, this was changed starting in 2015, and thus its expectations were loosened to that of a replica DB.
No synchronisation. MediaWiki's WANCache layer does not require synchronisation of cached values across data centers. Instead, it considers each data center's Memcached cluster as independent, with each cluster populating its own values as needed on dc-local app servers from dc-local replica DBs.
Tombstones (broadcasted purge). During HTTP write actions, MediaWiki asks WANCache to purge the cache keys whose source data it has modified. These purges take the form of short-lived Memcached keys known as "tombstones". We do not use the DELETE command because each data center populates its Memcached independently, with no cross-dc connection to the primary database. Values are therefore read from a local replica, and anything ingested into the cache may be as stale as a replica can be. If a purge were implemented as DELETE, the same key could be re-populated immediately, in the same DC or in other DCs, with the same stale value we just deleted. Instead, WANCache formulates its purge as a SET operation that stores a placeholder value known as a "tombstone", which lasts for approx. 10 seconds so that local and remote replica DBs can catch up.
Interim values. Upon seeing such a tombstone, WANCache acts much as it would on a cache miss, except that the newly computed value is not written back over the tombstone (as the computed value may be stale). Instead, to avoid a recompute stampede, these maybe-stale values are stored as an "interim value" in a sister key which is only kept for a few seconds.
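The interplay of tombstones and interim values can be illustrated with a toy model. The following is a minimal sketch in Python, not MediaWiki's implementation: the ToyCache class, the TOMBSTONE marker, the ":interim" sister-key suffix and the get_with_set signature are all assumptions made for illustration. Only the two TTLs are taken from the Magic numbers section.

```python
import time

TOMBSTONE_TTL = 11    # "hold-off TTL", per the Magic numbers section
INTERIM_TTL = 1       # interim value TTL, per the Magic numbers section
TOMBSTONE = object()  # hypothetical marker; the real stored value differs


class ToyCache:
    """In-memory stand-in for one data center's Memcached cluster."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def purge(cache, key):
    """Purge by SETting a short-lived tombstone instead of DELETE."""
    cache.set(key, TOMBSTONE, TOMBSTONE_TTL)


def get_with_set(cache, key, ttl, compute):
    """Return the cached value, computing it on demand when needed."""
    value = cache.get(key)
    if value is TOMBSTONE:
        # Hold-off period: reuse a recent interim value if there is one.
        interim = cache.get(key + ":interim")  # hypothetical sister key
        if interim is not None:
            return interim
        fresh = compute()  # may be stale, so never overwrite the tombstone
        cache.set(key + ":interim", fresh, INTERIM_TTL)
        return fresh
    if value is not None:
        return value  # plain cache hit
    fresh = compute()
    cache.set(key, fresh, ttl)
    return fresh
```

In this sketch a purge never removes the key. Readers keep recomputing at most about once per interim TTL until the tombstone expires, after which the next miss stores a regular value again.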
Memcached commands
Intra-dc:
Read traffic from the getWithSet idiom results in a GETS command (getMulti) that fetches the main key, plus any sister keys that might exist.
Write traffic from the getWithSet idiom results in either ADD if the key was known to be absent, or Memcached->mergeViaCas if a value existed but either required (or was elected for) regeneration.
Cross-dc:
Purge traffic uses the /*/mw-wan/ prefix to tell mcrouter to broadcast this to other pools and clusters as well. The actual command is generally SET as it needs to induce a "hold-off" period using the tombstone (per the above). In rare cases where a hold-off is not needed (e.g. if the purge is not related to a DB write), the broadcasted event will use DELETE.
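To make the commands above concrete, here is a rough sketch of how they look in the memcached text protocol. The helper names, the zero flags field and the b"TOMBSTONE" payload are assumptions for illustration; the payload WANCache actually stores is not described on this page. The cas helper assumes that mergeViaCas ends in a memcached cas command.

```python
BROADCAST_PREFIX = "/*/mw-wan/"  # route prefix named in the text above


def read_command(keys):
    """GETS for the main key plus any sister keys, in one request."""
    return ("gets " + " ".join(keys) + "\r\n").encode()


def add_command(key, payload, ttl):
    """ADD stores the value only if the key is currently absent."""
    return b"add %s 0 %d %d\r\n%s\r\n" % (
        key.encode(), ttl, len(payload), payload)


def cas_command(key, payload, ttl, cas_token):
    """CAS stores the value only if the key is unchanged since GETS."""
    return b"cas %s 0 %d %d %d\r\n%s\r\n" % (
        key.encode(), ttl, len(payload), cas_token, payload)


def purge_command(key, holdoff_ttl=11, tombstone=b"TOMBSTONE"):
    """Broadcast purge: SET a tombstone, or DELETE if no hold-off."""
    routed = (BROADCAST_PREFIX + key).encode()
    if holdoff_ttl > 0:
        return b"set %s 0 %d %d\r\n%s\r\n" % (
            routed, holdoff_ttl, len(tombstone), tombstone)
    return b"delete %s\r\n" % routed
```

Only the purge carries a route prefix here; reads and writes go out with the bare key.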
Getting revision/page from WANCache key
If you're trying to track down the specific revision text given an SqlBlobStore key, the somewhat convoluted procedure is documented at mw:Manual:Caching#Revision_text.
Infrastructure
There are two logical pools of memcached servers for MediaWiki:
Main: The main pool has 18 shards and runs on the mc10XX hosts (in Eqiad) and mc20XX hosts (in Codfw).
Gutter: The gutter pool has 3 shards per DC, and is hosted on mc-gp100x and mc-gp200x hosts (launch task: T244852).
MediaWiki configuration
MediaWiki connects to the memcached cluster through a local mcrouter proxy called "mw-mcrouter". This provides a number of benefits.
MediaWiki at WMF has multiple components that each independently use and share the same local mcrouter proxy and memcached cluster for their data. Specifically:
WANObjectCache ($wgMainCacheType), under the WANCache: prefix.
ParserCache ($wgParserCacheType), under the $wiki:pcache: prefix.
MicroStash ($wgMicroStashType), under the $wiki:$keygroup: and global:$keygroup: prefixes. See also mw:Object cache#MicroStash.
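When inspecting raw keys, the prefixes listed above are enough for a rough guess at which component wrote a key. The helper below is a hypothetical sketch, not part of MediaWiki. Because the MicroStash prefixes are generic ($wiki:$keygroup: and global:$keygroup:), it cannot positively identify MicroStash keys and reports them as "other".

```python
def classify_key(key):
    """Guess the owning component of a memcached key from its prefix."""
    parts = key.split(":")
    if parts[0] == "WANCache":
        return "WANObjectCache"
    if len(parts) >= 2 and parts[1] == "pcache":
        return "ParserCache"
    return "other (e.g. MicroStash)"
```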
Servers
The list of Memcached servers can be found in Puppet. This list is passed to mw-mcrouter after being parsed by some wonderful code.
Mcrouter
Main article:
mw-mcrouter
Each Kubernetes node runs an instance of the mw-mcrouter proxy, which does:
consistent sharding of data across the memcached servers
connection pooling
failover to the gutter pool in case a server is unavailable
cross-dc replication via TLS (T271967)
Any MediaWiki pod will have its memcached requests routed to the mw-mcrouter pod running on the same node.
Note: mw-wikifunctions is not using mw-mcrouter, but rather an in-pod mcrouter container.
Mcrouter Routes
Each MediaWiki api/appserver accesses memcached through its local Mcrouter instance. Mcrouter introduces the concepts of routes and pools. Each route applies consistent hashing on the key name to know where to send it, i.e. to which of the 18 memcached shards.
There are several routes available in our configuration, which are addressable via a route prefix that mcrouter strips from the key before forwarding the memcached command.
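The two steps described here, stripping the route prefix and then hashing the bare key onto a shard, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the shard choice uses rendezvous (highest-random-weight) hashing as a simple stand-in for a consistent hash. It is not the hash function mcrouter actually uses.

```python
import hashlib


def split_route_prefix(key):
    """Return (route_prefix or None, bare_key) for an incoming key."""
    if key.startswith("/"):
        # Route prefixes have the form /<region>/<cluster>/
        parts = key.split("/", 3)
        if len(parts) == 4:
            _, region, cluster, bare = parts
            return "/%s/%s/" % (region, cluster), bare
    return None, key


def pick_shard(bare_key, shards):
    """Rendezvous hashing: the shard with the highest score wins."""
    def score(shard):
        digest = hashlib.sha256((shard + "|" + bare_key).encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(shards, key=score)
```

The property that matters is that the same key always maps to the same shard, and that removing one shard only moves the keys that lived on it.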
Main route. This route is declared as /$region/mw/ but is not addressed by MediaWiki as such. It routes to the dc-local "Main" pool shards. If a shard is perceived as unavailable from an appserver ("TKO"), the local mcrouter forwards all commands (incl. gets, sets, and locks) to a shard of the "Gutter" pool instead (launch task: T244852).
This route is used by the majority of traffic, through WANObjectCache::getWithSet calls in MediaWiki.
MediaWiki doesn't use the /$region/mw/ prefix. Instead, /$region/mw/ is the default route and MediaWiki sends these commands without any routing prefix.
Switchover to and from the gutter pool is decided by Mcrouter locally (per-appserver); it is not centrally coordinated. The keys stored in a gutter server have a reduced TTL.
WAN route. This route is declared as /$region/mw-wan/. It routes to the dc-local "Main" pool shards as well as the "Proxies" for all non-local DCs.
This route is for internal use by MediaWiki's WANObjectCache to broadcast its purges ("tombstones"). This happens from calls to WANObjectCache::purge (invalidates a single key) or WANObjectCache::touchCheckKey (effectively invalidates many keys through a shared "check" key, somewhat like the Varnish XKey mechanism).
This route is not used for storing "regular" values and is not exposed to any generic WANObjectCache::getWithSet or BagOStuff calls.
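The difference between the two routes can be summarised in a small decision sketch. The function names and host labels below are hypothetical, and the real behaviour lives in mcrouter's route configuration rather than in code like this. The 600 second cap reflects the "up to 10 minutes" gutter TTL from the Magic numbers section.

```python
GUTTER_TTL = 600  # seconds; "up to 10 minutes" per the Magic numbers section


def route_main(ttl, main_shard, gutter_shard, tko_shards):
    """Default route: the Main shard, or a Gutter shard while it is TKO."""
    if main_shard in tko_shards:
        # Gutter values get a reduced TTL; 0 ("no expiry") is capped too.
        capped = GUTTER_TTL if ttl == 0 or ttl > GUTTER_TTL else ttl
        return gutter_shard, capped
    return main_shard, ttl


def route_wan(local_main_shard, remote_proxies):
    """WAN route: the local Main shard plus proxies for every other DC."""
    return [local_main_shard] + list(remote_proxies)
```

Each appserver would evaluate the TKO check on its own view of the shards, which matches the point above that switchover is not centrally coordinated.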
Example
The memcached key WANCache:v:metawiki:translate-groups (belongs to the Translate extension) is formatted by the WANCache library. When Translate wants to get the value of this key, WANCache will send a GET command from MediaWiki to localhost:11213, where mcrouter is listening. The command is then further routed to mc1022 (based on key hashing). MediaWiki is totally ignorant of the mc[1,2]0XX host; it only knows about sending commands to a localhost port. A mcrouter admin command helps figure out where keys are hashed/routed to:
elukey@mw1345:~$ echo "get __mcrouter__.route(get,WANCache:v:metawiki:translate-groups)" | nc localhost 11213 -q 2
VALUE __mcrouter__.route(get,WANCache:v:metawiki:translate-groups) 0 16
10.64.0.83:11211
END
elukey@mw1345:~$ dig -x 10.64.0.83 +short
mc1022.eqiad.wmnet.
Some things to notice:
The special prefix __mcrouter__.route is intercepted by mcrouter. These are admin commands, which the proxy answers directly without contacting the memcached hosts. This function returns the target of the consistent hashing of the key name.
Mcrouter listens on port 11213 on all MediaWiki app servers, while on every mc10XX host memcached listens on port 11211.
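The same route lookup can be scripted instead of typed into nc. The sketch below uses only Python's standard socket module; the function names are hypothetical, and it assumes a mcrouter instance listening on localhost:11213 as described above. The parsing step is kept separate so it can be exercised without a live proxy.

```python
import socket


def build_route_query(op, key):
    """Build the mcrouter admin command that asks where a key is routed."""
    return ("get __mcrouter__.route(%s,%s)\r\n" % (op, key)).encode()


def parse_route_reply(reply):
    """Extract the 'host:port' targets from a VALUE ... END reply."""
    lines = reply.decode().split("\r\n")
    if not lines or not lines[0].startswith("VALUE "):
        return []
    return [line for line in lines[1:] if line and line != "END"]


def query_route(key, op="get", host="localhost", port=11213, timeout=2.0):
    """Ask the local mcrouter which backend a key would be sent to."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_route_query(op, key))
        reply = b""
        while not reply.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            reply += chunk
    return parse_route_reply(reply)
```

On an app server, query_route("WANCache:v:metawiki:translate-groups") would be expected to return a list with one backend address, which can then be resolved with dig as in the transcript.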
To get a key and dump it to a file it is sufficient to:
elukey@mw1345:~$ echo "get WANCache:v:metawiki:translate-groups" | nc localhost 11213 -q 2 > dump.txt
elukey@mw1345:~$ du -hs dump.txt
380K	dump.txt
In this case the key's value is pretty big, and it needs PHP to be interpreted correctly (to unserialize it), but nonetheless we got some useful information (like the size of the key). This could be useful when it is necessary to quickly get how big a key is, rather than knowing its content.
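If only the size matters, it can also be read from the reply header rather than from the file size, since a memcached get reply starts with a "VALUE <key> <flags> <bytes>" line. The helper below is a small hypothetical sketch that takes the raw bytes of such a dump.

```python
def value_size(dump):
    """Return the payload size in bytes from the VALUE header, or None."""
    header, _, _ = dump.partition(b"\r\n")
    fields = header.split()
    if len(fields) < 4 or fields[0] != b"VALUE":
        return None  # a miss: the reply is just END
    return int(fields[3])
```

For example, value_size(open("dump.txt", "rb").read()) would report the stored value's length without needing PHP to unserialize it.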
Rebooting & Restarting
See SRE/Service_Operations/Documentation/Reboots#Datastores
Runbooks
Memcached server failure
Performance/Runbooks/Analyze memcached (How to use memkeys or cachedump)
Dashboards
Mcrouter-on-k8s
mw-memcached errors
Memcached Dashboard
Memcached Gutterpool status
Links
mcrouter image
mcrouter image versions
list of mediawiki memcached production servers
Introducing mcrouter: A memcached protocol router for scaling memcached deployments