codfw-rollout
Goal
Archived
Public
Members
This project does not have any members.
Watchers
This project does not have any watchers.
Details
Description
Work related to setting up the
codfw data-center
in 2014.
For tracking on-site tasks at codfw, see
ops-codfw
instead.
Recent Activity
Feb 11 2025
Gehel
edited projects for
T121741: Test a full switch-over of search traffic to the codfw datacenter
, added:
Discovery-Search (Current work)
; removed
Discovery-Search
Feb 11 2025, 3:40 PM
Discovery-Search (Current work)
codfw-rollout
codfw-rollout-Jan-Mar-2016
CirrusSearch
Discovery-ARCHIVED
Gehel
edited projects for
T130366: Should we have a specific check for SSL certificate expiration on elasticsearch
, added:
Discovery-Search (Current work)
; removed
Discovery-Search
Feb 11 2025, 3:40 PM
Discovery-Search (Current work)
Patch-For-Review
codfw-rollout
codfw-rollout-Jan-Mar-2016
CirrusSearch
Elasticsearch
Discovery-ARCHIVED
SRE
Gehel
edited projects for
T130365: Enable metric collection on nginx for elasticsearch
, added:
Discovery-Search (Current work)
; removed
Discovery-Search
Feb 11 2025, 3:40 PM
Discovery-Search (Current work)
Patch-For-Review
codfw-rollout
codfw-rollout-Jan-Mar-2016
CirrusSearch
Elasticsearch
Discovery-ARCHIVED
SRE
Oct 22 2024
elukey
closed
T234234: Port architecture of irc-recentchanges to Kafka
, a subtask of
T128592: Add redundancy to IRC recent changes service
, as
Resolved
Oct 22 2024, 10:39 AM
Sustainability
SRE
codfw-rollout
May 15 2023
jijiki
moved
T163354: Find a way to verify mediawiki-config IPs ahead of datacenter switchovers
from
Incoming
to
Stalled
on the
serviceops-deprecated
board.
May 15 2023, 9:25 AM
serviceops-deprecated
Datacenter-Switchover
SRE
codfw-rollout
May 12 2023
Dzahn
added a project to
T163354: Find a way to verify mediawiki-config IPs ahead of datacenter switchovers
serviceops-deprecated
May 12 2023, 3:20 PM
serviceops-deprecated
Datacenter-Switchover
SRE
codfw-rollout
Dzahn
added a project to
T163354: Find a way to verify mediawiki-config IPs ahead of datacenter switchovers
Datacenter-Switchover
May 12 2023, 3:20 PM
serviceops-deprecated
Datacenter-Switchover
SRE
codfw-rollout
Nov 14 2022
jijiki
moved
T135122: Reduce etcd technical debt
from
Incoming
to
Datastores
on the
serviceops-deprecated
board.
Nov 14 2022, 1:25 PM
serviceops-deprecated
Technical-Debt
codfw-rollout
SRE
jcrespo
added a project to
T135122: Reduce etcd technical debt
serviceops-deprecated
Nov 14 2022, 12:25 PM
serviceops-deprecated
Technical-Debt
codfw-rollout
SRE
Nov 4 2022
jbond
closed
T135128: Turn on etcd TLS for intra-cluster communications
, a subtask of
T135122: Reduce etcd technical debt
, as
Resolved
Nov 4 2022, 2:14 PM
serviceops-deprecated
Technical-Debt
codfw-rollout
SRE
jbond
closed
T135128: Turn on etcd TLS for intra-cluster communications
as
Resolved
I believe this is now in place, but please re-open if I'm wrong.
Nov 4 2022, 2:14 PM
serviceops-deprecated
codfw-rollout
SRE
Apr 29 2022
fgiunchedi
closed
T135125: Install a second etcd cluster in codfw
as
Resolved
We do have conf2* up and running nowadays, resolving
Apr 29 2022, 10:07 AM
codfw-rollout
SRE
fgiunchedi
closed
T135125: Install a second etcd cluster in codfw
, a subtask of
T135122: Reduce etcd technical debt
, as
Resolved
Apr 29 2022, 10:07 AM
serviceops-deprecated
Technical-Debt
codfw-rollout
SRE
Feb 8 2022
Nintendofan885
removed a watcher for
codfw-rollout
Jay8g
Feb 8 2022, 3:34 PM
Feb 4 2022
Umherirrender
removed a project from
T163495: Mediawiki revision-related queries are failing with high rate for enwiki on codfw
Patch-For-Review
Feb 4 2022, 11:37 PM
Platform Team Workboards (Done with CPT)
Platform Engineering (Needs Cleaning - Security, stability, performance, and scalability (TEC1))
codfw-rollout
Wikimedia-Incident
MediaWiki-General
gerritbot
added a comment to
T163495: Mediawiki revision-related queries are failing with high rate for enwiki on codfw
Change 349479
abandoned
by Umherirrender:
[mediawiki/core@master] ApiQueryContributors: Improve query behavior
Reason:
Old outdated patch set
Feb 4 2022, 11:34 PM
Platform Team Workboards (Done with CPT)
Platform Engineering (Needs Cleaning - Security, stability, performance, and scalability (TEC1))
codfw-rollout
Wikimedia-Incident
MediaWiki-General
Feb 3 2022
jcrespo
lowered the priority of
T163354: Find a way to verify mediawiki-config IPs ahead of datacenter switchovers
from
High
to
Low
5 years without updates: setting the priority to reflect reality rather than the original idea.
Feb 3 2022, 12:48 PM
serviceops-deprecated
Datacenter-Switchover
SRE
codfw-rollout
Dec 8 2021
taavi
added a comment to
T108580: HTTPS for internal service traffic
Dec 8 2021, 8:32 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
ema
closed
T108580: HTTPS for internal service traffic
as
Resolved
Many of the assumptions made when this task was created have changed since the migration to ATS for cache backends (no more IPSec, the difference between Tier1 and Tier2 DCs is now gone, ...). We are now in a world where all backend caches access the origins via TLS, which I think largely covers what we wanted to achieve here.
@BBlack
: I'm marking the task as resolved, but of course feel free to reopen / create other tasks as needed if you think that anything is missing.
Dec 8 2021, 8:32 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Dec 7 2021
taavi
closed
T263829: cloudweb2001-dev: add TLS termination
, a subtask of
T108580: HTTPS for internal service traffic
, as
Resolved
Dec 7 2021, 6:20 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Nov 30 2021
taavi
closed
T263830: contint.wikimedia.org: add TLS termination
, a subtask of
T108580: HTTPS for internal service traffic
, as
Resolved
Nov 30 2021, 5:13 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Apr 14 2021
Krinkle
closed
T128592: Add redundancy to IRC recent changes service
as
Resolved
Apr 14 2021, 11:31 PM
Sustainability
SRE
codfw-rollout
Krinkle
added a comment to
T128592: Add redundancy to IRC recent changes service
Ack, not missing messages != active-active. So long as reconnecting to the same hostname is expected to work within a reasonable amount of time, I guess we can close this. Requiring a public DNS change and for clients to not be subject to a cache is not ideal, e.g. a service IP internally or some other indirection seems better, but that's an improvement for later perhaps.
Apr 14 2021, 11:31 PM
Sustainability
SRE
codfw-rollout
MoritzMuehlenhoff
added a comment to
T128592: Add redundancy to IRC recent changes service
In
T128592#6996726
@Legoktm
wrote:
Is it even possible for IRC to be active-active? Doesn't the client have to maintain a connection with a single server, and if that server drops, they disconnect, retry and get a connection again (maybe internally to a different server)? In that downtime though you're going to miss a few events. Unless the server remembers what your last position was (which the EventStreams protocol does!), I'm not sure how we avoid that.
Apr 14 2021, 7:37 AM
Sustainability
SRE
codfw-rollout
Apr 13 2021
Legoktm
added a comment to
T128592: Add redundancy to IRC recent changes service
Current status: irc2001 is irc.wm.o, and irc1001 is receiving events from MediaWiki and is a hot spare that can be failed over to by adjusting the irc.wm.o CNAME (on a 5min TTL).
Apr 13 2021, 11:45 PM
Sustainability
SRE
codfw-rollout
Legoktm
closed
T278255: Set up spare irc1001.wikimedia.org in eqiad
, a subtask of
T128592: Add redundancy to IRC recent changes service
, as
Resolved
Apr 13 2021, 11:31 PM
Sustainability
SRE
codfw-rollout
Mar 23 2021
Legoktm
added a subtask for
T128592: Add redundancy to IRC recent changes service
T278255: Set up spare irc1001.wikimedia.org in eqiad
Mar 23 2021, 7:06 PM
Sustainability
SRE
codfw-rollout
Feb 28 2021
Aklapper
closed
T133164: Document eqiad/codfw transition plan for OCG
as
Declined
Declining as OCG has been dead for many years - see
T177931
Feb 28 2021, 9:29 AM
Documentation
OCG-General
SRE
codfw-rollout
Feb 9 2021
Aklapper
archived
codfw-rollout
Feb 9 2021, 4:48 PM
Aklapper
edited Description on
codfw-rollout
Feb 9 2021, 4:48 PM
Dec 17 2020
Dzahn
added a comment to
T108580: HTTPS for internal service traffic
In
T108580#6488253
@BBlack
wrote:
$ grep 'replacement: http:' hieradata/common/profile/trafficserver/backend.yaml
replacement: http://puppetmaster1001.eqiad.wmnet
#replacement: http://puppetmaster2001.codfw.wmnet
replacement: http://contint.wikimedia.org
replacement: http://cloudweb2001-dev.wikimedia.org
replacement: http://cloudweb2001-dev.wikimedia.org
Do we need to clean these up in some new subtasks?
Dec 17 2020, 5:02 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
jbond
closed
T263831: puppetmaster[12]001: add TLS termination
, a subtask of
T108580: HTTPS for internal service traffic
, as
Resolved
Dec 17 2020, 4:59 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Oct 16 2020
Maintenance_bot
removed a project from
T163351: codfw API slaves overloaded during the 2017-04-19 codfw switch
Patch-For-Review
Oct 16 2020, 6:36 PM
codfw-rollout
DBA
SRE
Sep 25 2020
ema
added a subtask for
T108580: HTTPS for internal service traffic
T263831: puppetmaster[12]001: add TLS termination
Sep 25 2020, 8:27 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
ema
added a subtask for
T108580: HTTPS for internal service traffic
T263830: contint.wikimedia.org: add TLS termination
Sep 25 2020, 8:26 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
ema
added a subtask for
T108580: HTTPS for internal service traffic
T263829: cloudweb2001-dev: add TLS termination
Sep 25 2020, 8:26 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
ema
added a comment to
T108580: HTTPS for internal service traffic
In
T108580#6488253
@BBlack
wrote:
Do we need to clean these up in some new subtasks?
Sep 25 2020, 8:25 AM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Sep 23 2020
BBlack
updated subscribers of
T108580: HTTPS for internal service traffic
All subtasks gone, but there are technically still a few edge cases showing up in the trafficserver backend-facing config. Specifically:
Sep 23 2020, 4:44 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
BBlack
closed
T109325: Outbound HTTPS for varnish backend instances
as
Invalid
There is no more varnish-be
Sep 23 2020, 4:38 PM
codfw-rollout
Varnish
SRE
HTTPS
Traffic
BBlack
closed
T109325: Outbound HTTPS for varnish backend instances
, a subtask of
T108580: HTTPS for internal service traffic
, as
Invalid
Sep 23 2020, 4:38 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
BBlack
closed
T109321: Inbound TLS for tier-1 varnish backend caches
, a subtask of
T108580: HTTPS for internal service traffic
, as
Invalid
Sep 23 2020, 4:38 PM
Traffic-Icebox
codfw-rollout
SRE
HTTPS
Jul 17 2020
Marostegui
moved
T133523: Decide how to improve parsercache replication, sharding and HA
from
Backlog
to
Meta/Epic
on the
DBA
board.
Jul 17 2020, 5:25 AM
SRE-Sprint-Week-Sustainability-March2023
MW-1.39-notes (1.39.0-wmf.22; 2022-07-25)
Patch-For-Review
Epic
Sustainability (Incident Followup)
DBA
Jul 2 2020
Krinkle
added a comment to
T128592: Add redundancy to IRC recent changes service
Per today's Multi-DC meeting, I'm detaching this from the current workboard. It was our understanding that the messages here are largely and perhaps even exclusively sent from the primary DC (assuming RC events only originate from write actions and from GET requests we classify as write actions, per
T91820
).
Jul 2 2020, 4:38 PM
Sustainability
SRE
codfw-rollout
Krinkle
edited projects for
T128592: Add redundancy to IRC recent changes service
, added:
Sustainability
; removed
Sustainability (MediaWiki-MultiDC)
Jul 2 2020, 4:32 PM
Sustainability
SRE
codfw-rollout
Krinkle
moved
T128592: Add redundancy to IRC recent changes service
from
Discuss next
to
Untriaged
on the
Sustainability (MediaWiki-MultiDC)
board.
Jul 2 2020, 2:49 PM
Sustainability
SRE
codfw-rollout
Krinkle
moved
T128592: Add redundancy to IRC recent changes service
from
Current: Performance Team
to
Discuss next
on the
Sustainability (MediaWiki-MultiDC)
board.
Jul 2 2020, 2:48 PM
Sustainability
SRE
codfw-rollout
May 21 2020
Krinkle
moved
T133523: Decide how to improve parsercache replication, sharding and HA
from
Limbo
to
Perf recommendation
on the
Performance-Team (Radar)
board.
May 21 2020, 1:31 AM
SRE-Sprint-Week-Sustainability-March2023
MW-1.39-notes (1.39.0-wmf.22; 2022-07-25)
Patch-For-Review
Epic
Sustainability (Incident Followup)
DBA
Krinkle
renamed
T133523: Decide how to improve parsercache replication, sharding and HA
from
[RFC] improve parsercache replication, sharding and HA
to
Decide how to improve parsercache replication, sharding and HA
May 21 2020, 1:31 AM
SRE-Sprint-Week-Sustainability-March2023
MW-1.39-notes (1.39.0-wmf.22; 2022-07-25)
Patch-For-Review
Epic
Sustainability (Incident Followup)
DBA
Krinkle
added a project to
T133523: Decide how to improve parsercache replication, sharding and HA
Sustainability (Incident Followup)
May 21 2020, 1:28 AM
SRE-Sprint-Week-Sustainability-March2023
MW-1.39-notes (1.39.0-wmf.22; 2022-07-25)
Patch-For-Review
Epic
Sustainability (Incident Followup)
DBA
May 11 2020
Krinkle
removed a subtask for
T128592: Add redundancy to IRC recent changes service
T232483: Port IRCRecentChanges to Kafka
May 11 2020, 9:09 PM
Sustainability
SRE
codfw-rollout
Content licensed under Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 unless otherwise noted; code licensed under GNU General Public License (GPL) 2.0 or later and other open source licenses. By using this site, you agree to the Terms of Use, Privacy Policy, and Code of Conduct.