Server Admin Log - Wikitech
2026-04-24
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-104235-fceratto.json
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1195.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-104210-fceratto.json
10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-104016-fceratto.json
10:38 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2228.codfw.wmnet with reason: host reimage
10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-103202-fceratto.json
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-103146-fceratto.json
10:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-103116-fceratto.json
10:30 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2228.codfw.wmnet with OS trixie
10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1159.eqiad.wmnet with OS trixie
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-102154-fceratto.json
10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2228: Reimage to Trixie
10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1159: Reimage to Trixie
10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2228: Reimage to Trixie
10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Reimage to Trixie
10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1159: Reimage to Trixie
10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Reimage to Trixie
10:21 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-102108-fceratto.json
10:17 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:15 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:12 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-101146-fceratto.json
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-101056-fceratto.json
10:02 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T424175
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-100047-fceratto.json
09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1015.eqiad.wmnet on all recursors
09:56 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1015.eqiad.wmnet on all recursors
09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:56 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-095450-fceratto.json
09:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-095425-fceratto.json
09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-095228-fceratto.json
09:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
09:52 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-095159-fceratto.json
09:50 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-094417-fceratto.json
09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-094151-fceratto.json
09:40 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-093409-fceratto.json
09:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ml-serve1014.eqiad.wmnet
09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-093143-fceratto.json
09:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1014.eqiad.wmnet on all recursors
09:28 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1014.eqiad.wmnet on all recursors
09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:24 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-092401-fceratto.json
09:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-092135-fceratto.json
09:21 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
09:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:16 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-091316-fceratto.json
09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-091237-fceratto.json
09:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1184 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-090454-fceratto.json
09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-090429-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-090229-fceratto.json
09:01 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
08:56 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-085421-fceratto.json
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-085221-fceratto.json
08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-084414-fceratto.json
08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-084213-fceratto.json
08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-083406-fceratto.json
08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-083118-fceratto.json
08:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-083050-fceratto.json
08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:27 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
08:24 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-082041-fceratto.json
08:19 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:19 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-081539-fceratto.json
08:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-081033-fceratto.json
08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
08:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
08:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:08 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow5002.eqsin.wmnet
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:04 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:03 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-080025-fceratto.json
08:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
07:54 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
07:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419961)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-075145-fceratto.json
07:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
07:50 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
07:45 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts netflow5002.eqsin.wmnet
07:45 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
06:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db[2142-2143].codfw.wmnet with reason: Cloning
05:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 264595
05:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 264595
05:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 58717
05:48 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 58717
05:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 20940
05:41 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 20940
05:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
05:40 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 19165
05:33 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2148 from dbctl T424309', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-053342-marostegui.json
03:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T410589)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-033021-ladsgroup.json
03:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
03:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-032955-ladsgroup.json
03:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-031947-ladsgroup.json
03:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-030938-ladsgroup.json
02:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260424-025930-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 32s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
00:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2013.codfw.wmnet with OS trixie
2026-04-23
23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
23:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032) (duration: 07m 19s)
22:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
22:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:26 ladsgroup@deploy1003: ladsgroup: Backport for QuickView: Fix relying on non-standard sizes (T424032) synced to the testservers (see ). Changes can now be verified there.
22:24 ladsgroup@deploy1003: Started scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032)
22:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1011.eqiad.wmnet with OS trixie
22:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:18 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2014
22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2014
22:12 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2013
22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2013
22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
22:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
22:03 jhancock@cumin2002: START - Cookbook sre.dns.netbox
21:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
21:48 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
21:36 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS trixie
21:10 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 05m 47s)
21:06 krinkle@deploy1003: krinkle: Continuing with deployment
21:05 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see ). Changes can now be verified there.
21:04 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
21:03 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 03m 05s)
21:03 krinkle@deploy1003: krinkle: Rolling back deployment
21:02 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see ). Changes can now be verified there.
21:00 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
20:51 cscott@deploy1003: Finished scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) (duration: 06m 02s)
20:47 cscott@deploy1003: cscott: Continuing with deployment
20:47 cscott@deploy1003: cscott: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) synced to the testservers (see ). Changes can now be verified there.
20:45 cscott@deploy1003: Started scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785)
19:28 otto@deploy1003: Finished scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) (duration: 22m 05s)
19:24 otto@deploy1003: xcollazo, otto: Continuing with deployment
19:14 otto@deploy1003: xcollazo, otto: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) synced to the testservers (see ). Changes can now be verified there.
19:09 jasmine@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl[2004-2005].codfw.wmnet
19:09 jasmine@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl[2004-2005].codfw.wmnet
19:06 jasmine_: "ran homer on lsw1-c7-codfw and lsw1-b2-codfw following new control planes (T390861)"
19:06 otto@deploy1003: Started scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920)
18:19 jasmine@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
18:13 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
17:09 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:07 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:07 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
16:46 jasmine@dns1004: END - running authdns-update
16:44 jasmine@dns1004: START - running authdns-update
16:39 jasmine@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: Downtiming to avoid page in case of race condition
16:29 herron@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
16:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) (duration: 05m 53s)
16:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
16:22 ladsgroup@deploy1003: ladsgroup: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) synced to the testservers (see ). Changes can now be verified there.
16:20 ladsgroup@deploy1003: Started scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895)
16:16 Amir1: re-enabling general ban on any non-standard thumb (T414805)
16:13 klausman@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:13 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
16:11 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
16:10 herron@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
15:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5004.eqsin.wmnet
15:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5004.eqsin.wmnet with OS bookworm
15:48 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1276017'" T420604. finish rollout of removing CSP in VCL from beta
15:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
15:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
15:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-152514-ladsgroup.json
15:25 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
15:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-152450-ladsgroup.json
15:16 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
15:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-151441-ladsgroup.json
15:07 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
15:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
15:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-150433-ladsgroup.json
15:03 moritzm: installing rsync security updates
14:57 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-145425-ladsgroup.json
14:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5004.eqsin.wmnet with OS bookworm
14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5004.eqsin.wmnet on all recursors
14:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5004.eqsin.wmnet on all recursors
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:42 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:42 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5004.eqsin.wmnet
14:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5003.eqsin.wmnet
14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5003.eqsin.wmnet with OS bookworm
14:34 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5002.eqsin.wmnet
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
14:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
14:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
14:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2145.codfw.wmnet
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:00 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5002.eqsin.wmnet
13:59 marostegui@cumin1003: START - Cookbook sre.dns.netbox
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5001.eqsin.wmnet
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
13:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
13:52 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2145.codfw.wmnet
13:39 Lucas_WMDE: UTC afternoon backport+config window done
13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749) (duration: 06m 11s)
13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Continuing with deployment
13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Backport for Enable the CampaignEvents extension on incubator (T421749) synced to the testservers (see
). Changes can now be verified there.
13:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:30 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749)
13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5003.eqsin.wmnet with OS bookworm
13:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5001.eqsin.wmnet
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5003.eqsin.wmnet on all recursors
13:25 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5003.eqsin.wmnet on all recursors
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-132311-fceratto.json
13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
13:22 aude@deploy1003: Finished scap sync-world: Backport for
Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188)
Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881)
(duration: 06m 42s)
13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
13:18 aude@deploy1003: cscott, aude: Continuing with deployment
13:16 aude@deploy1003: cscott, aude: Backport for
Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188)
Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881)
synced to the testservers (see
). Changes can now be verified there.
13:15 aude@deploy1003: Started scap sync-world: Backport for
Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188)
Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881)
13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-131303-fceratto.json
13:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2009.codfw.wmnet with OS bullseye
13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-130255-fceratto.json
13:01 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
13:01 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1015.eqiad.wmnet with reason: Decommissioning —
T412830
13:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5003.eqsin.wmnet
13:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow5003.eqsin.wmnet
12:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow5003.eqsin.wmnet with OS bookworm
12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-125247-fceratto.json
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-124535-fceratto.json
12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-124504-fceratto.json
12:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
12:38 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) (duration: 06m 25s)
12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-123456-fceratto.json
12:34 kharlan@deploy1003: kharlan: Continuing with deployment
12:33 kharlan@deploy1003: kharlan: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) synced to the testservers (see
). Changes can now be verified there.
12:32 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204)
12:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
12:30 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) (duration: 06m 57s)
12:26 kharlan@deploy1003: kharlan: Continuing with deployment
12:24 kharlan@deploy1003: kharlan: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) synced to the testservers (see
). Changes can now be verified there.
12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-122448-fceratto.json
12:23 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216)
12:19 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) (duration: 08m 11s)
12:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2009.codfw.wmnet with OS bullseye
12:15 kharlan@deploy1003: harroyo-wmf, kharlan: Continuing with deployment
12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-121439-fceratto.json
12:12 kharlan@deploy1003: harroyo-wmf, kharlan: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) synced to the testservers (see
). Changes can now be verified there.
12:11 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812)
12:08 kart_: staging: Update cxserver to 2026-04-23-114216-production (T423002)
12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-120400-fceratto.json
12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-120332-fceratto.json
12:00 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:00 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-115324-fceratto.json
11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow5003.eqsin.wmnet with OS bookworm
11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
11:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-114316-fceratto.json
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow5003.eqsin.wmnet on all recursors
11:42 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow5003.eqsin.wmnet on all recursors
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:40 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow5003.eqsin.wmnet
11:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-113307-fceratto.json
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-112133-fceratto.json
11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
11:21 moritzm: installing ngtcp2 security updates
11:20 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
11:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
11:13 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
11:13 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 11m 55s)
11:13 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5004.wikimedia.org
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5004.wikimedia.org with OS bookworm
11:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
11:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-110359-fceratto.json
11:01 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
11:00 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 33m 20s)
10:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2008.codfw.wmnet with OS bullseye
10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-105351-fceratto.json
10:50 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-104343-fceratto.json
10:42 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
10:37 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:33 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-103334-fceratto.json
10:32 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
10:27 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
10:23 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
10:21 isaranto@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:20 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 with pc2022 as codfw master T418973', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-101957-marostegui.json
10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-101855-fceratto.json
10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
10:17 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
10:16 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
10:16 daniel@deploy1003: Finished scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) (duration: 07m 37s)
10:16 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc2022 master of pc2 T418973', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-101611-marostegui.json
10:15 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2022, remove pc2012 T418973 T424201', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-101544-marostegui.json
10:15 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
10:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
10:14 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
10:14 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
10:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
10:13 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
10:12 daniel@deploy1003: daniel: Continuing with deployment
10:10 daniel@deploy1003: daniel: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) synced to the testservers (see
). Changes can now be verified there.
10:10 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2008.codfw.wmnet with OS bullseye
10:08 daniel@deploy1003: Started scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796)
10:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5004.wikimedia.org with OS bookworm
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-100035-fceratto.json
09:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
09:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5004.wikimedia.org on all recursors
09:58 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5004.wikimedia.org on all recursors
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-095027-fceratto.json
09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-094019-fceratto.json
09:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-093010-fceratto.json
09:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5004.wikimedia.org
09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-092303-fceratto.json
09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-092232-fceratto.json
09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5003.wikimedia.org
09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5003.wikimedia.org with OS bookworm
09:17 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-091224-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-090216-fceratto.json
09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2146 from dbctl T424179', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-090014-marostegui.json
08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5002.eqsin.wmnet
08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5001.eqsin.wmnet
08:56 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5004.eqsin.wmnet
08:56 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5004.eqsin.wmnet
08:55 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
08:53 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5003.eqsin.wmnet
08:52 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5003.eqsin.wmnet
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-085207-fceratto.json
08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-084035-fceratto.json
08:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
08:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
08:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2007.codfw.wmnet with OS bullseye
08:06 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5003.wikimedia.org with OS bookworm
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
08:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5003.wikimedia.org on all recursors
08:05 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5003.wikimedia.org on all recursors
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
08:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
08:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:01 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5003.wikimedia.org
07:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
07:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
07:22 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2007.codfw.wmnet with OS bullseye
07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2145 from dbctl T424177', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-071500-marostegui.json
06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1 with db2252 as new codfw master T418979', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-065803-marostegui.json
06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2252: Cloning
06:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
06:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2252: Cloning
06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Make db2252 master of ms3
T418979
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-065323-marostegui.json
06:52 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2143 from ms3, add db2252
T418979
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-065214-marostegui.json
06:28 jelto: gerrit2003 maintenance finished -
T333143
06:05 jelto: start gerrit2003 maintenance -
T333143
05:57 jelto@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:35:00 on gerrit.discovery.wmnet with reason: Gerrit maintenance
05:57 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:35:00 on gerrit2003.wikimedia.org with reason: Gerrit maintenance
05:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2143,2252].codfw.wmnet,db1153.eqiad.wmnet with reason: Cloning
05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Cloning db2252 from db2143
05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:41 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Cloning db2252 from db2143
05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet,pc1012.eqiad.wmnet with reason: Cloning
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: Cloning pc2022 from pc2012
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: Cloning pc2022 from pc2012
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet with reason: Cloning
03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-031538-ladsgroup.json
03:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-031512-ladsgroup.json
03:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-030504-ladsgroup.json
02:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-025455-ladsgroup.json
02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260423-024447-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-22
15:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
15:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-150817-ladsgroup.json
15:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-150752-ladsgroup.json
14:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-145744-ladsgroup.json
14:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-144736-ladsgroup.json
14:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-143728-ladsgroup.json
11:59 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
11:58 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
11:41 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/zotero: apply
11:41 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/zotero: apply
11:36 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/zotero: apply
11:36 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/zotero: apply
11:26 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
11:26 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:24 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:22 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:22 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:12 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
11:12 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
11:08 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
11:07 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
11:06 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:06 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
10:27 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
07:23 samwilson@deploy1003: Finished scap sync-world: Backport for
Use canvas rather than webgl for OpenSeadragon (T423548)
(duration: 08m 31s)
07:17 samwilson@deploy1003: samwilson: Continuing with deployment
07:16 samwilson@deploy1003: samwilson: Backport for
Use canvas rather than webgl for OpenSeadragon (T423548)
synced to the testservers (see
). Changes can now be verified there.
07:14 samwilson@deploy1003: Started scap sync-world: Backport for
Use canvas rather than webgl for OpenSeadragon (T423548)
04:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
04:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
03:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-030300-ladsgroup.json
03:02 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-030235-ladsgroup.json
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-025227-ladsgroup.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-024219-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260422-023211-ladsgroup.json
02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab2003.codfw.wmnet with OS trixie
02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
02:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 06s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
01:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host phab2003.codfw.wmnet with OS trixie
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-21
23:15 denisse@deploy1003: Finished deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 -
T423229
(duration: 00m 18s)
23:15 denisse@deploy1003: Started deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 -
T423229
22:37 musikanimal@deploy1003: Finished scap sync-world: Backport for
Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720)
VisualEditor.CodeMirror.less: remove CM5 styles
CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332)
DescriptionField: use new module name for loading CodeMirror
, [[gerrit:1275998|H
22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1015.eqiad.wmnet with OS trixie
22:25 musikanimal@deploy1003: musikanimal: Continuing with deployment
22:19 musikanimal@deploy1003: musikanimal: Backport for
Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720)
VisualEditor.CodeMirror.less: remove CM5 styles
CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332)
DescriptionField: use new module name for loading CodeMirror
, [[gerrit:1275998|Hooks: remove
22:02 musikanimal@deploy1003: Started scap sync-world: Backport for
Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720)
VisualEditor.CodeMirror.less: remove CM5 styles
CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332)
DescriptionField: use new module name for loading CodeMirror
, [[gerrit:1275998|Ho
21:58 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
21:57 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
21:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1016.eqiad.wmnet with OS trixie
21:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
21:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:17 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:12 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
21:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:57 musikanimal@deploy1003: Finished scap sync-world: Backport for
mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288)
(duration: 06m 27s)
20:53 musikanimal@deploy1003: musikanimal: Continuing with deployment
20:53 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:52 musikanimal@deploy1003: musikanimal: Backport for
mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288)
synced to the testservers (see
). Changes can now be verified there.
20:51 musikanimal@deploy1003: Started scap sync-world: Backport for
mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288)
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
20:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
20:44 dancy@deploy1003: Installation of scap version "4.250.1" completed for 2 hosts
20:42 dancy@deploy1003: Installing scap version "4.250.1" for 2 host(s)
20:35 jclark@cumin1003: START - Cookbook sre.dns.netbox
20:28 Dreamy_Jazz: Evening UTC backport window done
20:16 Dreamy_Jazz: Running `mwscript-k8s maintenance/namespaceDupes.php --wiki=diqwiki --fix`
20:15 dreamyjazz@deploy1003: Finished scap sync-world: Backport for
Diqwiki: change project namespace (T328207)
Remove unused wgCheckUserUserAgentTableMigrationStage config
CheckUser Suggested Investigations: Enable on commonswiki (T424084)
(duration: 07m 38s)
20:11 dreamyjazz@deploy1003: pppery, dreamyjazz: Continuing with sync
20:09 dreamyjazz@deploy1003: pppery, dreamyjazz: Backport for
Diqwiki: change project namespace (T328207)
Remove unused wgCheckUserUserAgentTableMigrationStage config
CheckUser Suggested Investigations: Enable on commonswiki (T424084)
synced to the testservers (see
). Changes can now be verified there.
20:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for
Diqwiki: change project namespace (T328207)
Remove unused wgCheckUserUserAgentTableMigrationStage config
CheckUser Suggested Investigations: Enable on commonswiki (T424084)
20:07 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
20:05 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
19:49 jasmine@dns1004: END - running authdns-update
19:47 jasmine@dns1004: START - running authdns-update
19:37 mutante: contint1003 - re-enabling puppet
T418521
19:32 Dreamy_Jazz: Created cusi_user, cusi_case, and cusi_signal on commonswiki on the extension1 database cluster -
T424084
18:02 dancy@deploy1003: Finished scap sync-world: Testing (duration: 02m 58s)
17:59 dancy@deploy1003: Started scap sync-world: Testing
17:58 dancy@deploy1003: Installation of scap version "4.250.0" completed for 2 hosts
17:56 dancy@deploy1003: Installing scap version "4.250.0" for 2 host(s)
17:42 rzl@deploy1003: Finished scap sync-world:
T423623
(duration: 02m 30s)
17:41 rzl@deploy1003: Started scap sync-world:
T423623
17:01 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host thanos-be2005.codfw.wmnet with OS bullseye
17:00 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
16:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
16:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
16:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2005.codfw.wmnet with OS bullseye
16:23 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab1004 for
T424059
(duration: 00m 38s)
16:22 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab1004 for
T424059
16:22 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab2002 for
T424059
(duration: 00m 47s)
16:21 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab2002 for
T424059
15:58 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus5003.eqsin.wmnet
15:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus5003.eqsin.wmnet with OS bookworm
15:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
15:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
15:39 moritzm: installing busybox updates from Trixie point release
15:05 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for
T424033
(duration: 00m 43s)
15:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
15:04 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for
T424033
15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for
T424033
(duration: 00m 44s)
15:03 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for
T424033
15:01 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
15:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus5003.eqsin.wmnet with OS bookworm
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-150025-ladsgroup.json
15:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-145959-ladsgroup.json
14:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5003.eqsin.wmnet on all recursors
14:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus5003.eqsin.wmnet on all recursors
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1009.eqiad.wmnet with OS bullseye
14:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
14:51 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
14:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:51 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus5003.eqsin.wmnet
14:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-144951-ladsgroup.json
14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5004.eqsin.wmnet
14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5004.eqsin.wmnet with OS trixie
14:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-143943-ladsgroup.json
14:39 cscott@deploy1003: Finished scap sync-world: Backport for
Increase Parsoid Read Views percentage for ruwiki to 55%
(duration: 09m 37s)
14:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
14:35 cscott@deploy1003: cscott: Continuing with sync
14:34 papaul: moving OOB link on mr1-eqiad to ge-0/0/7
14:32 moritzm: installing gdk-pixbuf security updates
14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1
T418979
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-143145-marostegui.json
14:31 cscott@deploy1003: cscott: Backport for
Increase Parsoid Read Views percentage for ruwiki to 55%
synced to the testservers (see
). Changes can now be verified there.
14:30 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251 to ms1
T418979
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-143017-marostegui.json
14:29 cscott@deploy1003: Started scap sync-world: Backport for
Increase Parsoid Read Views percentage for ruwiki to 55%
14:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-142935-ladsgroup.json
14:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251, remove db2142
T418979
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-142913-marostegui.json
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
14:22 cscott@deploy1003: Finished scap sync-world: Backport for
Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662)
Bump wikimedia/parsoid to 0.23.0-a28 (T423662)
[tests] add ParsoidLanguageConverterTest
ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747)
(duration: 13m 02s)
14:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
14:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on db2142.codfw.wmnet,pc2011.codfw.wmnet with reason: Will be decommissioned
14:16 cscott@deploy1003: cscott: Continuing with sync
14:11 cscott@deploy1003: cscott: Backport for
Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662)
Bump wikimedia/parsoid to 0.23.0-a28 (T423662)
[tests] add ParsoidLanguageConverterTest
ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747)
synced to the testservers (see
14:10 cscott@deploy1003: Started scap sync-world: Backport for
Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662)
Bump wikimedia/parsoid to 0.23.0-a28 (T423662)
[tests] add ParsoidLanguageConverterTest
ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747)
14:08 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1009.eqiad.wmnet with OS bullseye
13:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Cloning
13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2142: Cloning
13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
13:55 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
13:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2142: Cloning
13:53 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
13:53 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host atlas5001.wikimedia.org
13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:52 stran@deploy1003: Finished scap sync-world: Backport for
Enable non-emergency categories via config (T423244)
Add next steps page for non-emergency "sockpuppetry" incidents (T423045)
Add next steps page for non-emergency "vandalism" incidents (T423563)
Add next steps page for non-emergency "user dispute" incidents (T423587)
, [[gerrit:127583
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) atlas5001.wikimedia.org on all recursors
13:51 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache atlas5001.wikimedia.org on all recursors
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:50 jayme@cumin1003: START - Cookbook sre.dns.netbox
13:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:45 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
13:45 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host atlas5001.wikimedia.org
13:44 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
13:40 stran@deploy1003: stran: Continuing with sync
13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
13:38 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
13:37 stran@deploy1003: stran: Backport for
Enable non-emergency categories via config (T423244)
Add next steps page for non-emergency "sockpuppetry" incidents (T423045)
Add next steps page for non-emergency "vandalism" incidents (T423563)
Add next steps page for non-emergency "user dispute" incidents (T423587)
, [[gerrit:1275836|Add next steps pa
13:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
13:30 ayounsi@dns1004: END - running authdns-update
13:29 ayounsi@dns1004: START - running authdns-update
13:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5004.eqsin.wmnet with OS trixie
13:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5004.eqsin.wmnet on all recursors
13:26 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5004.eqsin.wmnet on all recursors
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:20 stran@deploy1003: Started scap sync-world: Backport for
Enable non-emergency categories via config (T423244)
Add next steps page for non-emergency "sockpuppetry" incidents (T423045)
Add next steps page for non-emergency "vandalism" incidents (T423563)
Add next steps page for non-emergency "user dispute" incidents (T423587)
, [[gerrit:1275836
13:16 aude@deploy1003: Finished scap sync-world: Backport for
Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881)
(duration: 06m 50s)
13:13 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:13 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5004.eqsin.wmnet
13:12 aude@deploy1003: aude: Continuing with sync
13:11 aude@deploy1003: aude: Backport for
Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881)
synced to the testservers (see
). Changes can now be verified there.
13:09 aude@deploy1003: Started scap sync-world: Backport for
Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881)
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5003.eqsin.wmnet
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5003.eqsin.wmnet with OS trixie
13:08 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:06 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1008.eqiad.wmnet with OS bullseye
13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
12:56 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 33m 37s)
12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
12:53 moritzm: update firmware on puppetserver1002: NIC from 22.31.6 to 23.21.6 T423282
12:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
12:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
12:47 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
12:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
12:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
12:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
12:29 moritzm: update firmware on puppetserver1002: BIOS from 1.9.2 to 1.20.2 T423282
12:28 moritzm: update firmware on puppetserver1002: idrac from 6.10.30.20 to 7.20.80.50 T423282
12:23 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
12:22 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:15 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1008.eqiad.wmnet with OS bullseye
12:06 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Pool back pc1 but with pc2021 replacing pc2011', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-120206-marostegui.json
11:58 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 68m 02s)
11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
11:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
11:53 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:52 marostegui@cumin1003: dbctl commit (dc=all): 'add pc2021 to pc1', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-115209-marostegui.json
11:50 moritzm: installing Tornado security updates
11:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2011 and add pc2021 as replacement', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-114718-marostegui.json
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:45 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
11:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
11:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:41 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-113927-fceratto.json
11:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
11:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
11:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-113010-fceratto.json
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-112919-fceratto.json
11:27 klausman@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
11:26 klausman@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2143: repool after maintenance
11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
11:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: repool after maintenance
11:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2143: after reimage to trixie
11:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: after reimage to trixie
11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS trixie
11:21 klausman@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
11:21 klausman@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-112001-fceratto.json
11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-111911-fceratto.json
11:11 claime: Enabling puppet on A:cp to deploy T422804
11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-110954-fceratto.json
11:09 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-110903-fceratto.json
11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-105945-fceratto.json
10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host tcp-proxy5003.eqsin.wmnet with OS trixie
10:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
10:51 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
10:50 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:50 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
10:49 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
10:49 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
10:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1007.eqiad.wmnet with OS bullseye
10:47 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
10:44 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
10:43 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-103945-fceratto.json
10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-103915-fceratto.json
10:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS trixie
10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Reimage to Trixie
10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:37 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
10:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Reimage to Trixie
10:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet with reason: Reimage to Trixie
10:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-102907-fceratto.json
10:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-101857-fceratto.json
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
10:10 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-100849-fceratto.json
10:07 claime: Disabling puppet on A:cp to merge T422804
10:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-100051-fceratto.json
10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
10:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1007.eqiad.wmnet with OS bullseye
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-095928-fceratto.json
09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
09:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:54 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1198: Security update
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
09:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:50 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
09:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:45 moritzm: updating debdeploy on trixie to 0.0.99.15
09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: repool after maintenance
09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
09:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: repool after maintenance
09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
09:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: after reimage to trixie
09:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: after reimage to trixie
09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1153.eqiad.wmnet with OS trixie
09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-093401-fceratto.json
09:26 moritzm: imported debdeploy 0.0.99.15 for trixie-wikimedia (compat release for Cumin 6)
09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-092352-fceratto.json
09:21 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1198: Security update
09:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-091949-fceratto.json
09:17 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-091344-fceratto.json
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-091124-fceratto.json
09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
09:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
09:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
09:05 jayme: kubectl delete node $(nodeset -e wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096-1112,1166-1168].eqiad.wmnet) - T423863
09:05 fabfur: restarting pybal on lvs1019-1020 to clear alerts
09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-090358-fceratto.json
09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-090336-fceratto.json
09:01 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
09:00 jayme: homer 'asw2-a-eqiad.mgmt.eqiad.wmnet' commit - T423863
09:00 jayme: homer 'asw2-b-eqiad.mgmt.eqiad.wmnet' commit - T423863
08:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Security update
08:50 jayme: homer 'cr*eqiad*' commit - T423863
08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
08:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1153.eqiad.wmnet with OS trixie
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Reimage to Trixie
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:45 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Reimage to Trixie
08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1153.eqiad.wmnet with reason: Reimage to Trixie
08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
08:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
08:40 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
08:40 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:39 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
08:39 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:39 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:39 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:38 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:32 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:32 moritzm: installing gst-plugins-base1.0 security updates
08:32 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
08:32 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:32 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:31 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:27 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Security update
08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
08:18 musikanimal@deploy1003: Finished scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) (duration: 07m 01s)
08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-081717-fceratto.json
08:14 elukey: bootstrapping pki intermediate discovery2026
08:14 musikanimal@deploy1003: musikanimal: Continuing with sync
08:12 musikanimal@deploy1003: musikanimal: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) synced to the testservers (see
). Changes can now be verified there.
08:10 musikanimal@deploy1003: Started scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756)
08:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-080936-fceratto.json
08:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
08:06 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1006.eqiad.wmnet with OS bullseye
08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-080314-fceratto.json
08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
07:51 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s4
07:51 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s4
07:49 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s6
07:49 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s6
07:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1064.eqiad.wmnet
07:48 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1064.eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Reimage to Trixie
07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Reimage to Trixie
07:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
07:17 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1064.eqiad.wmnet with reason: vacuum overlarge container dbs
07:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1006.eqiad.wmnet with OS bullseye
07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2011.codfw.wmnet,pc1011.eqiad.wmnet with reason: Cloning pc2021 from pc2011
07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Cloning pc2021
07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:05 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Cloning pc2021
07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2144: After reimage
07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:04 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: After reimage
07:03 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2144: after reimage to trixie
07:03 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: after reimage to trixie
07:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS trixie
06:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS trixie
06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Reimage to Trixie
06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
06:12 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:12 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Reimage to Trixie
06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet with reason: Reimage to Trixie
06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
05:40 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1015.eqiad.wmnet with reason: Clone s6 to clouddb1025
05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1025.eqiad.wmnet with reason: Clone s6
04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.22 (duration: 02m 30s)
02:53 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-025311-ladsgroup.json
02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-025245-ladsgroup.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-024237-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-023228-ladsgroup.json
02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260421-022219-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 03s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2250.codfw.wmnet with OS bookworm
01:32 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2251.codfw.wmnet with OS bookworm
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2252.codfw.wmnet with OS bookworm
01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2253.codfw.wmnet with OS bookworm
01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
01:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
01:02 zabe: marked 543 revisions as bad #
T393237
00:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2253.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2252.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2251.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2250.codfw.wmnet with OS bookworm
00:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:29 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-20
23:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for
Restore PageImages functionality to Wikisources and Wikibooks (T417538)
(duration: 07m 47s)
23:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
23:36 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Continuing with sync
23:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
23:34 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Backport for
Restore PageImages functionality to Wikisources and Wikibooks (T417538)
synced to the testservers (see
). Changes can now be verified there.
23:32 jdlrobson@deploy1003: Started scap sync-world: Backport for
Restore PageImages functionality to Wikisources and Wikibooks (T417538)
23:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
23:28 jdlrobson@deploy1003: Finished scap sync-world: Backport for
[Mobile Page Previews] Avoid syntax error on older browsers (T423959)
(duration: 08m 13s)
23:24 jdlrobson@deploy1003: jdlrobson: Continuing with sync
23:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
23:21 jdlrobson@deploy1003: jdlrobson: Backport for
[Mobile Page Previews] Avoid syntax error on older browsers (T423959)
synced to the testservers (see
). Changes can now be verified there.
23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for
[Mobile Page Previews] Avoid syntax error on older browsers (T423959)
23:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
23:16 jdlrobson@deploy1003: Finished scap sync-world: Backport for
Revert "Skin: Avoid stretching low resolution images" (T421524 T423676)
(duration: 05m 56s)
23:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
23:12 jdlrobson@deploy1003: cscott, jdlrobson: Continuing with sync
23:12 jdlrobson@deploy1003: cscott, jdlrobson: Backport for
Revert "Skin: Avoid stretching low resolution images" (T421524 T423676)
synced to the testservers (see
). Changes can now be verified there.
23:10 jdlrobson@deploy1003: Started scap sync-world: Backport for
Revert "Skin: Avoid stretching low resolution images" (T421524 T423676)
23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
23:06 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
22:53 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
21:59 rzl@deploy1003: Finished scap sync-world:
T423311
T423624
(duration: 03m 24s)
21:57 rzl@deploy1003: Started scap sync-world:
T423311
T423624
21:42 maryum: Deployed security fix for
T406954
21:33 maryum: Deployed security fix for
T299359
20:16 aude@deploy1003: Finished scap sync-world: Backport for
Do not show donate button on affiliate wikis (T423876)
(duration: 10m 57s)
20:10 aude@deploy1003: aude: Continuing with sync
20:08 aude@deploy1003: aude: Backport for
Do not show donate button on affiliate wikis (T423876)
synced to the testservers (see
). Changes can now be verified there.
20:05 aude@deploy1003: Started scap sync-world: Backport for
Do not show donate button on affiliate wikis (T423876)
19:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2190: Security update
19:28 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:00 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2190: Security update
18:58 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
18:56 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
18:55 jforrester@deploy1003: Finished scap sync-world: Backport for
Attribution: Clean up API spec descriptions (T422502)
, i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502) (gerrit:1275476),
Attribution: Documentation copyedits
Attribution: Update contact and add call to action (T422502)
, Attribution: Add localized texts for tre (gerrit:1275478)
18:44 jforrester@deploy1003: pmiazga, jforrester: Continuing with sync
18:42 jforrester@deploy1003: pmiazga, jforrester: Backport for
Attribution: Clean up API spec descriptions (T422502)
, i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502) (gerrit:1275476),
Attribution: Documentation copyedits
Attribution: Update contact and add call to action (T422502)
, Attribution: Add localized texts for trending (gerrit:1275478)
18:25 jforrester@deploy1003: Started scap sync-world: Backport for
Attribution: Clean up API spec descriptions (T422502)
, i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502) (gerrit:1275476),
Attribution: Documentation copyedits
Attribution: Update contact and add call to action (T422502)
, Attribution: Add localized texts for tren (gerrit:1275478)
18:11 Amir1: drop of langlinks table on testcommonswiki (
T421914
)
18:07 herron@dns1004: END - running authdns-update
18:05 herron@dns1004: START - running authdns-update
17:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1005.eqiad.wmnet with OS bullseye
17:47 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
17:45 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:44 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
17:44 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
17:43 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
17:43 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
17:42 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
17:41 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
17:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:37 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:36 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
17:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
17:27 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
17:27 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
17:26 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
17:26 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
17:23 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
17:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
17:22 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
17:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
17:16 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
17:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1005.eqiad.wmnet with OS bullseye
17:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-165459-fceratto.json
16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-165423-fceratto.json
16:52 moritzm: installing imagemagick security updates
16:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
16:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:44 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-164415-fceratto.json
16:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Moving to another rack
16:34 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-163407-fceratto.json
16:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM backupmon1001.eqiad.wmnet
16:27 marostegui@dns1004: END - running authdns-update
16:26 marostegui: Switchover m3 proxy (phabricator)
16:26 marostegui@dns1004: START - running authdns-update
16:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
16:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-162359-fceratto.json
16:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: Security update
16:19 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM backupmon1001.eqiad.wmnet
16:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:06 bking@cumin2002: conftool action : set/pooled=no; selector: name=cloudelastic1012.eqiad.wmnet
15:57 moritzm: installing libvirt security updates
15:55 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1272869'"
15:51 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2188.codfw.wmnet
15:50 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2188.codfw.wmnet
15:50 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es2036: Moving to another rack
15:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2188.codfw.wmnet
15:50 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2188.codfw.wmnet
15:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1012.eqiad.wmnet
15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es2036
15:36 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host es2036
15:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1010.eqiad.wmnet with OS bookworm
15:36 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:35 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:33 jhancock@cumin2002: START - Cookbook sre.dns.netbox
15:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
15:25 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
15:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1166: Security update
15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-152341-fceratto.json
15:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
15:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
15:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Moved to another rack
15:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Moving to another rack
15:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Moving to another rack
15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-worker2188']
15:11 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: repool after maintenance
15:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: repool after maintenance
15:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2006.codfw.wmnet with OS bullseye
15:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
15:03 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
15:03 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
14:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:51 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-worker2188']
14:46 cwhite@deploy1003: Finished deploy [performance/arc-lamp@bd7b2ab]:
T413127
(duration: 00m 08s)
14:45 cwhite@deploy1003: Started deploy [performance/arc-lamp@bd7b2ab]:
T413127
14:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
14:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: after reimage to trixie
14:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: after reimage to trixie
14:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS trixie
14:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
14:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbstore1010.eqiad.wmnet with OS bookworm
14:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
14:36 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
14:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
14:26 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-142120-fceratto.json
14:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-142050-fceratto.json
14:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
14:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
14:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
14:14 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2006.codfw.wmnet with OS bullseye
14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-141042-fceratto.json
14:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-140203-fceratto.json
14:02 urandom: upgrade envoyproxy, restbase —
T419637
T410975
14:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS trixie
14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-140033-fceratto.json
14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Reimage to Trixie
14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:00 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
14:00 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Reimage to Trixie
14:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1151.eqiad.wmnet with reason: Reimage to Trixie
14:00 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
13:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-135255-ladsgroup.json
13:52 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
13:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-135155-fceratto.json
13:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-135025-fceratto.json
13:47 jclark@cumin1003: START - Cookbook sre.dns.netbox
13:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
13:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-134158-fceratto.json
13:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-134148-fceratto.json
13:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
13:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:34 Lucas_WMDE: UTC afternoon backport+config window done
13:32 urandom: decommissioning Cassandra, aqs1014 [a,b] —
T412830
13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-133139-fceratto.json
13:30 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Decommissioning —
T412830
13:29 phuedx@deploy1003: Finished scap sync-world: Backport for
PHP SDK: Split measurement of unknown experiments (T422112)
(duration: 07m 51s)
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-132926-fceratto.json
13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1253.eqiad.wmnet with reason: Maintenance
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-132901-fceratto.json
13:26 phuedx@deploy1003: phuedx: Continuing with sync
13:23 phuedx@deploy1003: phuedx: Backport for
PHP SDK: Split measurement of unknown experiments (T422112)
synced to the testservers (see
). Changes can now be verified there.
13:22 phuedx@deploy1003: Started scap sync-world: Backport for
PHP SDK: Split measurement of unknown experiments (T422112)
13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
Remove unused JWT for bot password temporary config (T422367 T415007)
Enable ReadingLists beta feature for all Wikipedia wikis (T420881)
(duration: 08m 21s)
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-131853-fceratto.json
13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Continuing with sync
13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Backport for
Remove unused JWT for bot password temporary config (T422367 T415007)
Enable ReadingLists beta feature for all Wikipedia wikis (T420881)
synced to the testservers (see
). Changes can now be verified there.
13:12 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
Remove unused JWT for bot password temporary config (T422367 T415007)
Enable ReadingLists beta feature for all Wikipedia wikis (T420881)
13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-130845-fceratto.json
12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-125837-fceratto.json
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-125624-fceratto.json
12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-125559-fceratto.json
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-124550-fceratto.json
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[1001-1002].eqiad.wmnet
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-123542-fceratto.json
12:31 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:28 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
12:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-122534-fceratto.json
12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-122321-fceratto.json
12:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-122256-fceratto.json
12:17 zabe: Deployed patch for
T423821
12:16 moritzm: remove ganeti5006 from eqsin01 Ganeti cluster (running classic Ganeti)
T421863
12:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
12:15 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
12:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-121247-fceratto.json
12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[1001-1002].eqiad.wmnet
12:10 moritzm: installing edk2 security updates
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-120239-fceratto.json
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-115231-fceratto.json
11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet,service=x4
10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-105213-fceratto.json
10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-105148-fceratto.json
10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-104141-fceratto.json
10:32 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
10:32 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-103133-fceratto.json
10:26 kamila@deploy1003: Finished scap sync-world: ICU 72 upgrade (duration: 51m 35s)
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast1003.wikimedia.org
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-102125-fceratto.json
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-101913-fceratto.json
10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-101847-fceratto.json
10:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:14 kamila@deploy1003: kamila: Continuing with sync
10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-100839-fceratto.json
10:07 jmm@cumin2002: START - Cookbook sre.dns.netbox
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-100423-fceratto.json
10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-100402-fceratto.json
10:02 Emperor: ceph orch host drain moss-be1002
T418901
10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: after reimage to trixie
09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-095831-fceratto.json
09:58 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast1003.wikimedia.org
09:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-095354-fceratto.json
09:52 kamila@deploy1003: kamila: ICU 72 upgrade synced to the testservers (see
). Changes can now be verified there.
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-094823-fceratto.json
09:48 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-094612-fceratto.json
09:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-094546-fceratto.json
09:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-094345-fceratto.json
09:43 Emperor: ceph orch host drain moss-be1001
T418901
09:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard1003.eqiad.wmnet
09:36 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard1003.eqiad.wmnet
09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-093538-fceratto.json
09:35 kamila@deploy1003: Started scap sync-world: ICU 72 upgrade
09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-093337-fceratto.json
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard2003.codfw.wmnet
09:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard2003.codfw.wmnet
09:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
09:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-092530-fceratto.json
09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2008.wikimedia.org
09:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-092448-fceratto.json
09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-092417-fceratto.json
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
09:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
09:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:21 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2008.wikimedia.org
09:19 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:18 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
09:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1165: after reimage to trixie
09:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-091522-fceratto.json
09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-091409-fceratto.json
09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS trixie
09:13 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
09:13 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-091310-fceratto.json
09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-091233-fceratto.json
09:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2007.codfw.wmnet
09:11 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
09:10 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
09:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:07 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2007.codfw.wmnet
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-090401-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-090225-fceratto.json
08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-085349-fceratto.json
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-085217-fceratto.json
08:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
08:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
08:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-084512-fceratto.json
08:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-084440-fceratto.json
08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-084209-fceratto.json
08:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts atlas5001.wikimedia.org
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:41 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-083957-fceratto.json
08:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
08:39 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
08:39 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-083432-fceratto.json
08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:30 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts atlas5001.wikimedia.org
08:30 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS trixie
08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1165: Reimage to Trixie
08:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1165: Reimage to Trixie
08:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1165.eqiad.wmnet with reason: Reimage to Trixie
08:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2229: after reimage to trixie
08:26 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
08:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-082555-fceratto.json
08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-082424-fceratto.json
08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: Reimage to Trixie
08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast5004.wikimedia.org
08:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
08:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
08:19 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin1001.eqiad.wmnet
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-081547-fceratto.json
08:15 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin1001.eqiad.wmnet
08:15 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on wikikube-worker2188.codfw.wmnet with reason: dcops intervention
08:14 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2188.codfw.wmnet
08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-081416-fceratto.json
08:14 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2188.codfw.wmnet
08:13 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin2001.codfw.wmnet
08:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:07 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin2001.codfw.wmnet
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-080539-fceratto.json
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-080529-fceratto.json
08:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast5004.wikimedia.org
08:01 marostegui: Removed categorylinks_icu72 from s3 with a sleep, this will take around 1.5 hours
T422546
07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12389
07:59 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12389
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-075524-fceratto.json
07:51 marostegui: Removed categorylinks_icu72 from s5
T422546
07:41 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2229: after reimage to trixie
07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-074031-fceratto.json
07:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-074005-fceratto.json
07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2229.codfw.wmnet with OS trixie
07:31 marostegui: Removed categorylinks_icu72 from s7
T422546
07:30 marostegui: Removed categorylinks_icu72 from s2
T422546
07:30 marostegui: Removed categorylinks_icu72 from s12
T422546
07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-072957-fceratto.json
07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-071949-fceratto.json
07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
07:10 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-070941-fceratto.json
07:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-070728-fceratto.json
07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2151: repool after maintenance
06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2229.codfw.wmnet with OS trixie
06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2229: Reimage to Trixie
06:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2229: Reimage to Trixie
06:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2229.codfw.wmnet with reason: Reimage to Trixie
06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2229
T423837
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-064042-marostegui.json
06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2214 to s6 primary
T423837
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-064006-marostegui.json
06:39 marostegui: Starting s6 codfw failover from db2229 to db2214 -
T423837
06:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 21 hosts with reason: Primary switchover s6
T423837
06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2214 with weight 0
T423837
', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260420-063553-marostegui.json
06:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: repool after maintenance
06:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2151: after reimage to trixie
06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: after reimage to trixie
06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS trixie
06:06 marostegui: Removed categorylinks_icu72 from s1 and s6
T422546
05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
05:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS trixie
05:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Reimage to Trixie
05:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Reimage to Trixie
05:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2151.codfw.wmnet with reason: Reimage to Trixie
03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
2026-04-19
18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
17:50 zabe@deploy1003: Finished scap sync-world: Backport for Temporarily switch back to file read old schema (T423065) (duration: 33m 41s)
17:36 zabe@deploy1003: zabe: Continuing with sync
17:34 zabe@deploy1003: zabe: Backport for Temporarily switch back to file read old schema (T423065) synced to the testservers (see ). Changes can now be verified there.
17:16 zabe@deploy1003: Started scap sync-world: Backport for Temporarily switch back to file read old schema (T423065)
16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
06:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1066.eqiad.wmnet with reason: vacuum overlarge container dbs
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 11s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-17
23:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
23:55 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:31 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
23:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
23:26 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:25 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:24 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
23:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
23:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
23:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
23:00 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:56 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
22:56 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:56 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
22:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:40 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:38 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:35 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
22:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
22:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
22:23 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1070.eqiad.wmnet with OS bookworm
22:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:15 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1067.eqiad.wmnet with OS bookworm
22:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:13 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
22:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
21:58 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2021.codfw.wmnet with OS trixie
21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
21:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
21:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
21:42 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
21:39 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:37 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
21:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
21:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
21:35 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:35 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
21:35 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
21:34 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:34 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
21:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
21:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
21:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1056.eqiad.wmnet with OS bookworm
21:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
21:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
21:16 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
21:16 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
21:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2021.codfw.wmnet with OS trixie
21:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['pc2021']
21:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pc2021']
21:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
21:13 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:12 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
21:12 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
21:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1065.eqiad.wmnet with OS bookworm
21:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
21:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:59 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
20:56 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
20:55 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:53 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
20:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:48 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
20:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
20:47 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:46 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
20:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
20:43 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:43 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:42 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
20:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
20:39 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
20:37 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:36 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
20:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2024
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2024
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2023
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2023
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2022
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2022
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2021
20:33 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2021
20:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
20:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
20:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
20:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:28 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2253
20:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2253
20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2252
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2252
20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2251
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2251
20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2250
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2250
20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
20:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
20:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
20:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
20:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
20:13 mutante: planet1003, planet2003 - rebooting on ganeti level for T422596
20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
20:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
20:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
20:04 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
19:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
19:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:37 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:25 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:21 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:21 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:21 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:17 jclark@cumin1003: START - Cookbook sre.dns.netbox
19:17 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:17 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:12 jclark@cumin1003: START - Cookbook sre.dns.netbox
17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-172835-fceratto.json
17:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-171827-fceratto.json
17:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-170819-fceratto.json
16:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-165811-fceratto.json
16:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-165559-fceratto.json
16:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-165544-fceratto.json
16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-164536-fceratto.json
16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-163528-fceratto.json
16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-162520-fceratto.json
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-162307-fceratto.json
16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-162253-fceratto.json
16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-161245-fceratto.json
16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-160418-fceratto.json
16:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-160236-fceratto.json
16:02 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
16:01 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
15:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-155410-fceratto.json
15:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-155228-fceratto.json
15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-155015-fceratto.json
15:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-155001-fceratto.json
15:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-154402-fceratto.json
15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-153953-fceratto.json
15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-153354-fceratto.json
15:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-152944-fceratto.json
15:27 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) (duration: 06m 51s)
15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-152620-fceratto.json
15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
15:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-152549-fceratto.json
15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:23 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Continuing with sync
15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:22 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) synced to the testservers (see
). Changes can now be verified there.
15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512)
15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-151936-fceratto.json
15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-151723-fceratto.json
15:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
15:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-151541-fceratto.json
15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-150532-fceratto.json
15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-150440-fceratto.json
14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-145524-fceratto.json
14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-145432-fceratto.json
14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-144819-fceratto.json
14:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-144424-fceratto.json
14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-144247-fceratto.json
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-143416-fceratto.json
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-143238-fceratto.json
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-143204-fceratto.json
14:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-143139-fceratto.json
14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-142230-fceratto.json
14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-142130-fceratto.json
14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-141222-fceratto.json
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-141123-fceratto.json
14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:09 urandom: decommissioning Cassandra, aqs1011 [a,b] — T412830
14:06 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
14:06 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: Bootstrapping — T412830
14:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-140454-fceratto.json
14:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
14:04 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-140424-fceratto.json
14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:03 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
14:02 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-140115-fceratto.json
14:01 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
14:00 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
14:00 fabfur: restart varnish on cp3069, cp3070, cp3072, cp3073 to clear alerts
14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-140003-fceratto.json
13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
13:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-135938-fceratto.json
13:58 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
13:57 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
13:54 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
13:54 fabfur: restarting varnish on cp3066 to clear alerts
13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-135416-fceratto.json
13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-134930-fceratto.json
13:44 jmm@dns1004: END - running authdns-update
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-134408-fceratto.json
13:43 jmm@dns1004: START - running authdns-update
13:42 inflatador: bking@apt1002 sudo -E reprepro -C component/opensearch2 include trixie-wikimedia /home/bking/wmf-opensearch-search-plugins-2.19.5+5-trixie/wmf-opensearch-search-plugins_2.19.5+5_amd64.changes
13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-133923-fceratto.json
13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-133359-fceratto.json
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-132914-fceratto.json
13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-132802-fceratto.json
13:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-132738-fceratto.json
13:27 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
13:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-132628-fceratto.json
13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
13:26 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp2001.codfw.wmnet
13:22 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp2001.codfw.wmnet
13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-132034-fceratto.json
13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-131730-fceratto.json
13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-131026-fceratto.json
13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-130722-fceratto.json
13:07 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp1001.eqiad.wmnet
13:00 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp1001.eqiad.wmnet
13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-130018-fceratto.json
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-125714-fceratto.json
12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-125501-fceratto.json
12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-125009-fceratto.json
12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-124149-fceratto.json
12:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
12:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-124120-fceratto.json
12:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
12:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-123111-fceratto.json
12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-122104-fceratto.json
12:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-121056-fceratto.json
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-120255-fceratto.json
12:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-120226-fceratto.json
11:55 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
11:54 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
11:53 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
11:53 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-115218-fceratto.json
11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-114210-fceratto.json
11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-113201-fceratto.json
11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-112333-fceratto.json
11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (
T419961
)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260417-112259-fceratto.json
11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-111250-fceratto.json
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2002.codfw.wmnet
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
11:08 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
11:03 jynus@cumin1003: START - Cookbook sre.dns.netbox
11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-110242-fceratto.json
10:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2002.codfw.wmnet
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2001.codfw.wmnet
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:53 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-105234-fceratto.json
10:48 jynus@cumin1003: START - Cookbook sre.dns.netbox
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-104327-fceratto.json
10:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:43 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2001.codfw.wmnet
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-104257-fceratto.json
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1002.eqiad.wmnet
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-103249-fceratto.json
10:31 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-102241-fceratto.json
10:20 jynus@cumin1003: START - Cookbook sre.dns.netbox
10:13 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1002.eqiad.wmnet
10:13 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1001.eqiad.wmnet
10:13 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:12 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-101233-fceratto.json
10:11 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-100401-fceratto.json
10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
10:00 jynus@cumin1003: START - Cookbook sre.dns.netbox
09:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1001.eqiad.wmnet
09:54 marostegui: pool esams
09:53 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
09:53 marostegui@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
09:44 moritzm: initialise eqsin02 Ganeti cluster T421863
09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f3-codfw
09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f3-codfw
09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-f1-codfw
09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-f1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-e1-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-e1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e1-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e3-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e3-codfw
08:51 topranks: depool esams due to connectivity issues
08:51 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
08:51 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
08:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
07:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1201: after reimage to trixie
07:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-071048-fceratto.json
07:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-070039-fceratto.json
06:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-065031-fceratto.json
06:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: repool after maintenance
06:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1201: after reimage to trixie
06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS trixie
06:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-064023-fceratto.json
06:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
06:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1201.eqiad.wmnet with OS trixie
06:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1201: Reimage to Trixie
06:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1201: Reimage to Trixie
06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1201.eqiad.wmnet with reason: Reimage to Trixie
06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2158: repool after maintenance
06:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS trixie
05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
05:16 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS trixie
05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: Reimage to Trixie
05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2158: Reimage to Trixie
05:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2158.codfw.wmnet with reason: Reimage to Trixie
04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-044543-fceratto.json
04:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1263.eqiad.wmnet with reason: Maintenance
04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-044518-fceratto.json
04:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-043510-fceratto.json
04:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-042502-fceratto.json
04:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-041454-fceratto.json
02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-021624-fceratto.json
02:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1262.eqiad.wmnet with reason: Maintenance
02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-021558-fceratto.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 25s)
02:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-020550-fceratto.json
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-015542-fceratto.json
01:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260417-014534-fceratto.json
00:10 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
00:03 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
00:03 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
2026-04-16
23:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-235123-fceratto.json
23:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1261.eqiad.wmnet with reason: Maintenance
23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-235059-fceratto.json
23:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-234052-fceratto.json
23:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-233044-fceratto.json
23:25 musikanimal@deploy1003: Finished scap sync-world: Backport for
CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673)
(duration: 06m 35s)
23:21 musikanimal@deploy1003: musikanimal: Continuing with sync
23:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-232036-fceratto.json
23:20 musikanimal@deploy1003: musikanimal: Backport for
CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673)
synced to the testservers (see
). Changes can now be verified there.
23:18 musikanimal@deploy1003: Started scap sync-world: Backport for
CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673)
22:14 James_F: jforrester@deploy1003:/srv/mediawiki-staging$ foreachwikiindblist sul extensions/Wikibase/lib/maintenance/populateSitesTable.php # T423660
22:08 cscott@deploy1003: Finished scap sync-world: Backport for
ConverterRule: convert `null` to `false` when needed (T423639)
Convert language to internal code in tests
ParsoidCachePrewarmJob: Define the title in the req context (T422780)
Move language variant parser option setting from Article to WikiPage (T423534)
(duration: 09m 41s)
22:04 cscott@deploy1003: cscott: Continuing with sync
22:00 cscott@deploy1003: cscott: Backport for
ConverterRule: convert `null` to `false` when needed (T423639)
Convert language to internal code in tests
ParsoidCachePrewarmJob: Define the title in the req context (T422780)
Move language variant parser option setting from Article to WikiPage (T423534)
synced to the testservers (see
21:58 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
21:58 cscott@deploy1003: Started scap sync-world: Backport for
ConverterRule: convert `null` to `false` when needed (T423639)
Convert language to internal code in tests
ParsoidCachePrewarmJob: Define the title in the req context (T422780)
Move language variant parser option setting from Article to WikiPage (T423534)
21:57 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
21:33 cscott@deploy1003: Finished scap sync-world: Backport for
Deploy PRV to 4 wikis (T423188)
[bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114
(duration: 17m 26s)
21:29 cscott@deploy1003: cscott, arlolra, bodhisattwa: Continuing with sync
21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-212348-fceratto.json
21:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1260.eqiad.wmnet with reason: Maintenance
21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-212323-fceratto.json
21:17 cscott@deploy1003: cscott, arlolra, bodhisattwa: Backport for
Deploy PRV to 4 wikis (T423188)
[bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114
synced to the testservers (see
). Changes can now be verified there.
21:16 cscott@deploy1003: Started scap sync-world: Backport for
Deploy PRV to 4 wikis (T423188)
[bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114
21:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-211315-fceratto.json
21:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-210307-fceratto.json
20:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-205258-fceratto.json
20:51 stran@deploy1003: Finished scap sync-world: Backport for
Deploy IRS to enwiki's Event Talk namespace (T423042)
Make abstractwiki a multi-lingual Wikidata client (T420420)
Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)
(duration: 08m 36s)
20:48 stran@deploy1003: aaron, stran, jforrester: Continuing with sync
20:44 stran@deploy1003: aaron, stran, jforrester: Backport for
Deploy IRS to enwiki's Event Talk namespace (T423042)
Make abstractwiki a multi-lingual Wikidata client (T420420)
Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)
synced to the testservers (see
). Changes can now be verified there.
20:43 stran@deploy1003: Started scap sync-world: Backport for
Deploy IRS to enwiki's Event Talk namespace (T423042)
Make abstractwiki a multi-lingual Wikidata client (T420420)
Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)
20:36 stran@deploy1003: Finished scap sync-world: Backport for
Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042)
(duration: 09m 07s)
20:33 stran@deploy1003: stran: Continuing with sync
20:29 stran@deploy1003: stran: Backport for
Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042)
synced to the testservers (see
). Changes can now be verified there.
20:27 stran@deploy1003: Started scap sync-world: Backport for
Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042)
20:17 maryum: Removed private mitigation for T419137
20:09 mstyles@deploy1003: Finished scap sync-world: Backport for
config: Enable EmailConfirmationBanner on selected wikis (T421366)
(duration: 06m 06s)
20:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-200839-fceratto.json
20:05 mstyles@deploy1003: mmartorana, mstyles: Continuing with sync
20:05 mstyles@deploy1003: mmartorana, mstyles: Backport for
config: Enable EmailConfirmationBanner on selected wikis (T421366)
synced to the testservers (see
). Changes can now be verified there.
20:03 mstyles@deploy1003: Started scap sync-world: Backport for
config: Enable EmailConfirmationBanner on selected wikis (T421366)
19:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-195831-fceratto.json
19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-194823-fceratto.json
19:48 zabe@deploy1003: Finished scap sync-world: Backport for
Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914)
Also disable updates for GloballyWantedFiles on testcommonswiki (T421914)
(duration: 06m 48s)
19:44 zabe@deploy1003: zabe: Continuing with sync
19:43 zabe@deploy1003: zabe: Backport for
Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914)
Also disable updates for GloballyWantedFiles on testcommonswiki (T421914)
synced to the testservers (see
). Changes can now be verified there.
19:41 zabe@deploy1003: Started scap sync-world: Backport for
Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914)
Also disable updates for GloballyWantedFiles on testcommonswiki (T421914)
19:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-193814-fceratto.json
19:36 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
19:34 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
19:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-193100-fceratto.json
19:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
19:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-193028-fceratto.json
19:21 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
19:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-192020-fceratto.json
19:19 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
19:16 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
19:15 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-191012-fceratto.json
19:03 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
19:02 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
19:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-190004-fceratto.json
18:59 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-185757-fceratto.json
18:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1252.eqiad.wmnet with reason: Maintenance
18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-185731-fceratto.json
18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-185253-fceratto.json
18:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
18:52 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-185222-fceratto.json
18:49 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-184723-fceratto.json
18:46 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-184213-fceratto.json
18:42 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:39 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-183715-fceratto.json
18:36 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-183205-fceratto.json
18:32 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.24 refs T420482
18:28 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-182707-fceratto.json
18:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-182157-fceratto.json
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-181447-fceratto.json
18:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-181415-fceratto.json
18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-180407-fceratto.json
17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-175358-fceratto.json
17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-174350-fceratto.json
17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2204 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-173640-fceratto.json
17:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-173058-fceratto.json
17:28 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:26 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:26 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-172050-fceratto.json
17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-171041-fceratto.json
17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-170033-fceratto.json
16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-165326-fceratto.json
16:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-165253-fceratto.json
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-164245-fceratto.json
16:38 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'enable-puppet "cdanis deploy 8ad070a466 T328872"'
16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-163800-fceratto.json
16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1249.eqiad.wmnet with reason: Maintenance
16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-163736-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-163237-fceratto.json
16:30 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy 8ad070a466 T328872"'
16:27 urandom: upgrade envoyproxy, restbase[1031,2024] (canary) —
T419637
T410975
16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-162727-fceratto.json
16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-162229-fceratto.json
16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-161719-fceratto.json
16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-161504-fceratto.json
16:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-161432-fceratto.json
16:11 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: Bootstrapping —
T412830
16:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-160710-fceratto.json
16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-160424-fceratto.json
15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-155416-fceratto.json
15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-154408-fceratto.json
15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-153547-fceratto.json
15:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
15:35 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason:
T407726
15:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
15:35 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
15:34 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
15:34 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
15:31 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:30 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:29 cdanis: 💔cdanis@cumin1003.eqiad.wmnet ~ 🕦☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy I3aaec0ca T328872"'
15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:14 moritzm: installing sequoia-sqv security updates
15:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:10 daniel@deploy1003: Finished scap sync-world: Backport for
API rate limits: add highlimits-user class (T419796)
(duration: 10m 47s)
15:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
15:03 daniel@deploy1003: daniel: Continuing with sync
15:01 daniel@deploy1003: daniel: Backport for
API rate limits: add highlimits-user class (T419796)
synced to the testservers (see
). Changes can now be verified there.
15:00 root@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw,mr1-codfw IPv6,mr1-codfw.oob with reason: router upgrade
14:59 daniel@deploy1003: Started scap sync-world: Backport for
API rate limits: add highlimits-user class (T419796)
14:58 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr-codfw with reason: router upgrade
14:58 papaul: ongoing maintenance on mr1-codfw
14:56 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr1-codfw.oob,mr-codfw with reason: router upgrade
14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:56 jelto@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host gerrit2002.wikimedia.org
14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
14:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:29 jforrester@deploy1003: Finished scap sync-world: Backport for
mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311)
(duration: 09m 36s)
14:25 jforrester@deploy1003: jforrester: Continuing with sync
14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:21 jforrester@deploy1003: jforrester: Backport for
mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311)
synced to the testservers (see
). Changes can now be verified there.
14:20 jforrester@deploy1003: Started scap sync-world: Backport for
mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311)
14:18 mlitn@deploy1003: Finished scap sync-world: Backport for
fix: add missing hook registration for create account stats (T422283)
(duration: 06m 07s)
14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-141515-fceratto.json
14:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1248.eqiad.wmnet with reason: Maintenance
14:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-141450-fceratto.json
14:14 mlitn@deploy1003: mlitn, migr: Continuing with sync
14:14 mlitn@deploy1003: mlitn, migr: Backport for
fix: add missing hook registration for create account stats (T422283)
synced to the testservers (see
). Changes can now be verified there.
14:12 mlitn@deploy1003: Started scap sync-world: Backport for
fix: add missing hook registration for create account stats (T422283)
14:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS trixie
14:05 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-140442-fceratto.json
14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:01 mlitn@deploy1003: Finished scap sync-world: Backport for
siwikitionary: update logo to localised svg version. (T342173)
(duration: 07m 11s)
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
13:57 mlitn@deploy1003: mlitn, robertsky: Continuing with sync
13:56 mlitn@deploy1003: mlitn, robertsky: Backport for
siwikitionary: update logo to localised svg version. (T342173)
synced to the testservers (see
). Changes can now be verified there.
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-135549-fceratto.json
13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-135434-fceratto.json
13:54 mlitn@deploy1003: Started scap sync-world: Backport for
siwikitionary: update logo to localised svg version. (T342173)
13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
13:51 mlitn@deploy1003: Finished scap sync-world: Backport for
Squashed diff to master
(duration: 30m 21s)
13:51 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
13:49 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
13:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-134541-fceratto.json
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-134426-fceratto.json
13:41 urandom: decommissioning Cassandra [a,b] on aqs1010 —
T412830
13:39 mlitn@deploy1003: mlitn: Continuing with sync
13:38 mlitn@deploy1003: mlitn: Backport for
Squashed diff to master
synced to the testservers (see
). Changes can now be verified there.
13:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:38 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-133600-ladsgroup.json
13:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-133533-fceratto.json
13:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS trixie
13:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2004.codfw.wmnet with OS trixie
13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-132551-ladsgroup.json
13:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-132525-fceratto.json
13:23 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason:
T407726
13:21 mlitn@deploy1003: Started scap sync-world: Backport for
Squashed diff to master
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-131836-fceratto.json
13:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-131806-fceratto.json
13:17 Lucas_WMDE: correction, namespaceDupes sahwikisource run was for T423374, my bad
13:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
13:17 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes sahwikisource --fix #
T423273
13:16 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
etwikiquote: delete unused temporary logo files (T313698)
sahwikisource: add Ааптар (author) namespace (T423374)
(duration: 10m 59s)
13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-131543-ladsgroup.json
13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
13:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:10 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Continuing with sync
13:09 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Backport for
etwikiquote: delete unused temporary logo files (T313698)
sahwikisource: add Ааптар (author) namespace (T423374)
synced to the testservers (see
). Changes can now be verified there.
13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-130758-fceratto.json
13:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-130535-ladsgroup.json
13:05 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
etwikiquote: delete unused temporary logo files (T313698)
sahwikisource: add Ааптар (author) namespace (T423374)
13:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-125750-fceratto.json
12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-124742-fceratto.json
12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-124032-fceratto.json
12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-124001-fceratto.json
12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[2001-2002].codfw.wmnet
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-122953-fceratto.json
12:29 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:27 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-121945-fceratto.json
12:19 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[2001-2002].codfw.wmnet
12:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-120935-fceratto.json
12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1013.eqiad.wmnet
12:09 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1013.eqiad.wmnet
12:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1013.eqiad.wmnet
12:02 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2004.codfw.wmnet with OS trixie
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-120104-fceratto.json
12:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
12:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-120033-fceratto.json
11:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1013.eqiad.wmnet
11:53 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-115055-fceratto.json
11:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1247.eqiad.wmnet with reason: Maintenance
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-115024-fceratto.json
11:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
11:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-114014-fceratto.json
11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1012.eqiad.wmnet
11:38 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1012.eqiad.wmnet
11:33 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
11:33 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1012.eqiad.wmnet
11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-113005-fceratto.json
11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
11:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1012.eqiad.wmnet
11:23 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-112136-fceratto.json
11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-112105-fceratto.json
11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
11:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-111058-fceratto.json
11:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:07 moritzm: updating debdeploy on bookworm to 0.0.99.15
11:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-110049-fceratto.json
10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
10:55 moritzm: imported debdeploy 0.0.99.15 for bookworm-wikimedia (compat release for Cumin 6)
10:52 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2006.codfw.wmnet
10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-105040-fceratto.json
10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2005.codfw.wmnet
10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2004.codfw.wmnet
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-104240-fceratto.json
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-104201-fceratto.json
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-103152-fceratto.json
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-102143-fceratto.json
10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-101514-fceratto.json
10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
10:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-101135-fceratto.json
10:09 jynus: backup1014 returns from maintenance, backups and recovery can flow as usual
T421719
10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-100505-fceratto.json
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-095455-fceratto.json
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
09:52 moritzm: installing qemu security updates
09:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1014
09:47 jynus@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1014
09:45 jynus@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1014
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:45 jynus@cumin1003: START - Cookbook sre.dns.wipe-cache backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
09:45 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-094436-fceratto.json
09:44 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2042.codfw.wmnet
09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2041.codfw.wmnet
09:41 jynus@cumin1003: START - Cookbook sre.dns.netbox
09:40 jynus@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1014
09:37 moritzm: installing imagemagick security updates
09:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
09:29 jynus: setting backup1014 in maintenance, no backup or recovery will run while it
T421719
09:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:24 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
09:20 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
09:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:18 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2169: repool after maintenance
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1007
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1007
09:15 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1007
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:15 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
09:14 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
09:13 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
09:13 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-091115-fceratto.json
09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
09:11 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
09:10 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
09:03 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.move-vlan (exit_code=99) for host backup1007
09:03 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
08:51 jmm@dns1004: END - running authdns-update
08:50 jmm@dns1004: START - running authdns-update
08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-084331-fceratto.json
08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1007,1014].eqiad.wmnet with reason: maintenance
08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-083323-fceratto.json
08:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2169: repool after maintenance
08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS trixie
08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-082314-fceratto.json
08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-081305-fceratto.json
08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1201 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-080445-fceratto.json
08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-080420-fceratto.json
08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-075522-fceratto.json
07:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1244.eqiad.wmnet with reason: Maintenance
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-075457-fceratto.json
07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-075410-fceratto.json
07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
07:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-074448-fceratto.json
07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-074402-fceratto.json
07:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS trixie
07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2169: Reimage to Trixie
07:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2169: Reimage to Trixie
07:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2169.codfw.wmnet with reason: Reimage to Trixie
07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-073440-fceratto.json
07:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-073354-fceratto.json
07:33 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
07:33 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
07:32 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
07:32 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
07:27 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
07:27 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
07:26 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
07:26 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (
T419961
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-072650-fceratto.json
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-072432-fceratto.json
07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2193: after reimage to trixie
07:21 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
07:16 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
06:59 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
06:55 moritzm: imported opensearch-madvise 0.2+deb13u1 to component/opensearch2 of trixie-wikimedia
T422860
06:40 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2193: after reimage to trixie
06:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2193.codfw.wmnet with OS trixie
06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
06:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
05:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2193.codfw.wmnet with OS trixie
05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2193: Reimage to Trixie
05:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2193: Reimage to Trixie
05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2193.codfw.wmnet with reason: Reimage to Trixie
05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-053659-fceratto.json
05:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1243.eqiad.wmnet with reason: Maintenance
05:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-053635-fceratto.json
05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts clouddb1019.eqiad.wmnet
05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
05:30 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
05:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
05:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-052626-fceratto.json
05:22 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts clouddb1019.eqiad.wmnet
05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-051618-fceratto.json
05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-050609-fceratto.json
03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-031934-fceratto.json
03:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1242.eqiad.wmnet with reason: Maintenance
03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-031910-fceratto.json
03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-030902-fceratto.json
02:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-025853-fceratto.json
02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-025247-ladsgroup.json
02:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-024845-fceratto.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-024239-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-023231-ladsgroup.json
02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-022223-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 16s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-012755-ladsgroup.json
01:27 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-012730-ladsgroup.json
01:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-011722-ladsgroup.json
01:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-010714-ladsgroup.json
01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-010218-fceratto.json
01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1241.eqiad.wmnet with reason: Maintenance
01:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-010154-fceratto.json
00:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-005706-ladsgroup.json
00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-005146-fceratto.json
00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-004138-fceratto.json
00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260416-003130-fceratto.json
2026-04-15
23:35 cscott@deploy1003: Finished scap sync-world: Backport for
Exclude parser functions from SpecialLintTemplateErrors (T420102)
(duration: 32m 47s)
23:23 cscott@deploy1003: cscott: Continuing with sync
23:20 cscott@deploy1003: cscott: Backport for
Exclude parser functions from SpecialLintTemplateErrors (T420102)
synced to the testservers (see
). Changes can now be verified there.
23:05 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
23:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
23:03 cscott@deploy1003: Started scap sync-world: Backport for
Exclude parser functions from SpecialLintTemplateErrors (T420102)
22:57 cscott@deploy1003: Finished scap sync-world: Backport for
Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435)
Make variant into a parser option for parsoid language conversion (T415435)
(duration: 16m 00s)
22:53 cscott@deploy1003: cscott: Continuing with sync
22:43 cscott@deploy1003: cscott: Backport for
Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435)
Make variant into a parser option for parsoid language conversion (T415435)
synced to the testservers (see
). Changes can now be verified there.
22:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-224305-fceratto.json
22:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
22:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-224241-fceratto.json
22:41 cscott@deploy1003: Started scap sync-world: Backport for
Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435)
Make variant into a parser option for parsoid language conversion (T415435)
22:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-223233-fceratto.json
22:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-222225-fceratto.json
22:15 jforrester@deploy1003: Finished scap sync-world: Backport for
PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515)
(duration: 08m 48s)
22:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-221216-fceratto.json
22:11 jforrester@deploy1003: jforrester: Continuing with sync
22:08 jforrester@deploy1003: jforrester: Backport for
PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515)
synced to the testservers (see
). Changes can now be verified there.
22:06 jforrester@deploy1003: Started scap sync-world: Backport for
PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515)
21:29 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1027.eqiad.wmnet
21:29 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1027.eqiad.wmnet
21:14 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
21:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
21:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
21:13 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs
T420482
21:13 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
21:12 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:12 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:07 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
21:06 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
21:06 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
21:06 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
21:06 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
21:05 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
21:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
21:05 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:04 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
21:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:04 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
21:03 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
21:03 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
21:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
20:58 jforrester@deploy1003: Finished scap sync-world: Backport for
PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514)
PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515)
(duration: 06m 08s)
20:54 jforrester@deploy1003: jforrester: Continuing with sync
20:54 jforrester@deploy1003: jforrester: Backport for
PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514)
PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515)
synced to the testservers (see
). Changes can now be verified there.
20:52 jforrester@deploy1003: Started scap sync-world: Backport for
PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514)
PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515)
20:46 jforrester@deploy1003: Finished scap sync-world: Backport for
Drop 1.5x logos (T246054)
Enwikinews: disable lingering FlaggedRevs template processing (T423512)
Record file usage from TemplateStyles pages (T413707)
(duration: 09m 15s)
20:42 jforrester@deploy1003: jforrester, bawolff, pppery: Continuing with sync
20:42 topranks: enable BGP over GRE between cr1-drmrs and cr2-eqiad
20:38 jforrester@deploy1003: jforrester, bawolff, pppery: Backport for
Drop 1.5x logos (T246054)
Enwikinews: disable lingering FlaggedRevs template processing (T423512)
Record file usage from TemplateStyles pages (T413707)
synced to the testservers (see
). Changes can now be verified there.
20:37 jforrester@deploy1003: Started scap sync-world: Backport for
Drop 1.5x logos (T246054)
Enwikinews: disable lingering FlaggedRevs template processing (T423512)
Record file usage from TemplateStyles pages (T413707)
20:36 cmooney@dns2005: END - running authdns-update
20:35 cmooney@dns2005: START - running authdns-update
20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
20:33 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
20:31 mstyles@deploy1003: Finished scap sync-world: Backport for
Force Reauth (T419621)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
(duration: 07m 48s)
20:30 cmooney@cumin1003: START - Cookbook sre.dns.netbox
20:27 mstyles@deploy1003: mstyles: Continuing with sync
20:26 topranks: enable ospf on GRE cr1-drmrs <-> cr2-eqiad
20:25 mstyles@deploy1003: mstyles: Backport for
Force Reauth (T419621)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
synced to the testservers (see
). Changes can now be verified there.
20:23 mstyles@deploy1003: Started scap sync-world: Backport for
Force Reauth (T419621)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
Rename Test Kitchen Experiment (T420007)
20:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
20:19 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
20:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-201700-fceratto.json
20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-201613-fceratto.json
20:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-200605-fceratto.json
20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
20:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-195556-fceratto.json
19:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
19:55 topranks: add static routes on cr1-drmrs and cr2-eqiad for arelion GRE far-side IPv4 addresses
19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-194548-fceratto.json
19:38 topranks: add GRE tunnel to cr2-eqiad towards cr1-drmrs
19:37 topranks: add GRE tunnel to cr1-drmrs towards cr2-eqiad
18:50 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
18:43 dduvall: rolling back due to steady `Term with languageCode "en" not found` errors (cc T420482
18:27 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1027.eqiad.wmnet with reason: Bootstrapping — T412830
18:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-181833-fceratto.json
18:15 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs T420482
18:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1115.*
18:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-180825-fceratto.json
18:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1115.eqiad.wmnet with OS trixie
18:01 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
17:58 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
17:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-175817-fceratto.json
17:58 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
17:57 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
17:57 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
17:57 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
17:55 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-174808-fceratto.json
17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/page-analytics: apply
17:47 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
17:45 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
17:45 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-174236-ladsgroup.json
17:42 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-174212-ladsgroup.json
17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-174107-fceratto.json
17:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
17:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
17:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-174035-fceratto.json
17:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
17:38 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:38 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
17:36 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
17:36 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-173602-fceratto.json
17:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-173525-fceratto.json
17:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
17:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-173203-ladsgroup.json
17:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-173027-fceratto.json
17:29 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
17:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-172517-fceratto.json
17:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-172155-ladsgroup.json
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-172019-fceratto.json
17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-171509-fceratto.json
17:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
17:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-171147-ladsgroup.json
17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-171011-fceratto.json
17:09 kamila@deploy1003: Finished scap sync-world: Backport for
Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546)
Revert "Enable $wgTempCategoryCollations for testwiki." (T422546)
(duration: 16m 10s)
17:05 kamila@deploy1003: kamila: Continuing with sync
17:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-170501-fceratto.json
17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-170310-fceratto.json
17:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-170239-fceratto.json
16:55 kamila@deploy1003: kamila: Backport for
Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546)
Revert "Enable $wgTempCategoryCollations for testwiki." (T422546)
synced to the testservers (see
). Changes can now be verified there.
16:53 kamila@deploy1003: Started scap sync-world: Backport for
Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546)
Revert "Enable $wgTempCategoryCollations for testwiki." (T422546)
16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-165231-fceratto.json
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-164223-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-163215-fceratto.json
16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-162513-fceratto.json
16:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
16:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
16:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-161936-fceratto.json
16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-160928-fceratto.json
15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-155920-fceratto.json
15:56 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup1012.eqiad.wmnet
15:56 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup1012.eqiad.wmnet
15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-154911-fceratto.json
15:43 blake@deploy1003: Finished scap sync-world: Backport for
ProductionServices: re-add poolcounter1007.eqiad. (T420171)
(duration: 06m 09s)
15:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-154210-fceratto.json
15:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-154138-fceratto.json
15:39 blake@deploy1003: blake: Continuing with sync
15:39 blake@deploy1003: blake: Backport for
ProductionServices: re-add poolcounter1007.eqiad. (T420171)
synced to the testservers (see
). Changes can now be verified there.
15:37 blake@deploy1003: Started scap sync-world: Backport for
ProductionServices: re-add poolcounter1007.eqiad. (T420171)
15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
15:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-153130-fceratto.json
15:31 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
15:31 blake@deploy1003: Finished scap sync-world: Backport for
ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171)
(duration: 06m 19s)
15:30 Emperor: update & restart envoy on ms swift frontends T410975 T419637
15:30 Emperor: update & restart envoy on thanos frontends T410975 T419637
15:27 blake@deploy1003: blake: Continuing with sync
15:26 blake@deploy1003: blake: Backport for
ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171)
synced to the testservers (see
). Changes can now be verified there.
15:26 Emperor: update & restart envoy on apus frontends T410975 T419637
15:24 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
15:24 blake@deploy1003: Started scap sync-world: Backport for
ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171)
15:24 Emperor: update & restart envoy on apus frontends T423065 T382824
15:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-152122-fceratto.json
15:19 moritzm: installing Dovecot security updates on mx-out*
15:18 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
15:18 blake@deploy1003: Finished scap sync-world: Backport for
ProductionServices: remove poolcounter1006.eqiad (T420171)
(duration: 06m 59s)
15:14 blake@deploy1003: blake: Continuing with sync
15:14 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
15:13 blake@deploy1003: blake: Backport for
ProductionServices: remove poolcounter1006.eqiad (T420171)
synced to the testservers (see
). Changes can now be verified there.
15:12 moritzm: installing inetutils security updates
15:11 blake@deploy1003: Started scap sync-world: Backport for
ProductionServices: remove poolcounter1006.eqiad (T420171)
15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-151114-fceratto.json
15:08 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
15:06 samtar@deploy1003: Finished scap sync-world: Backport for
Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
(duration: 06m 54s)
15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-150415-fceratto.json
15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-150344-fceratto.json
15:02 samtar@deploy1003: samtar: Continuing with sync
15:02 samtar@deploy1003: samtar: Backport for
Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
synced to the testservers (see
). Changes can now be verified there.
15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:59 samtar@deploy1003: Started scap sync-world: Backport for
Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-145918-fceratto.json
14:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
14:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
14:57 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
14:57 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
14:56 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
14:56 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
14:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup1012.eqiad.wmnet with reason: maintenance
14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-145335-fceratto.json
14:53 samtar@deploy1003: Finished scap sync-world: Backport for
Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
(duration: 06m 12s)
14:52 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1273.eqiad.wmnet
14:51 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1273.eqiad.wmnet
14:49 samtar@deploy1003: samtar: Continuing with sync
14:49 samtar@deploy1003: samtar: Backport for
Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
synced to the testservers (see
). Changes can now be verified there.
14:47 samtar@deploy1003: Started scap sync-world: Backport for
Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
14:46 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2280.codfw.wmnet
14:45 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
14:43 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-144327-fceratto.json
14:42 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
14:42 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
14:42 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
14:41 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:39 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
14:36 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
14:36 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-143319-fceratto.json
14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-142615-fceratto.json
14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-142543-fceratto.json
14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-141535-fceratto.json
14:06 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-140527-fceratto.json
14:04 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
13:56 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
13:55 samtar@deploy1003: samtar, codenamenoreste: Continuing with sync
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-135519-fceratto.json
13:53 samtar@deploy1003: samtar, codenamenoreste: Backport for
lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102)
synced to the testservers (see
). Changes can now be verified there.
13:51 samtar@deploy1003: Started scap sync-world: Backport for
lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102)
13:51 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
13:50 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-134704-fceratto.json
13:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
13:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
13:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
13:44 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
13:44 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
13:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
13:34 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
13:29 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
13:28 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
13:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:21 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1026.eqiad.wmnet
13:21 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1026.eqiad.wmnet
13:19 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:18 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
13:17 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
13:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-131657-fceratto.json
13:16 kartik@deploy1003: Finished scap sync-world: Backport for
Register ArticleGuidance extension and enable in labs (T423295)
(duration: 12m 02s)
13:12 kartik@deploy1003: sbisson, kartik: Continuing with sync
13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-130849-ladsgroup.json
13:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-130836-ladsgroup.json
13:08 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
13:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:07 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-130649-fceratto.json
13:06 kartik@deploy1003: sbisson, kartik: Backport for
Register ArticleGuidance extension and enable in labs (T423295)
synced to the testservers (see
). Changes can now be verified there.
13:04 kartik@deploy1003: Started scap sync-world: Backport for
Register ArticleGuidance extension and enable in labs (T423295)
13:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
12:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-125828-ladsgroup.json
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-125640-fceratto.json
12:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-124819-ladsgroup.json
12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-124633-fceratto.json
12:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:43 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for
VisualEditor hCaptcha: Clear challenge container for new render (T423294)
(duration: 08m 11s)
12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:41 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-123937-fceratto.json
12:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-123915-fceratto.json
12:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
12:38 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:38 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-123811-ladsgroup.json
12:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-123803-fceratto.json
12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294) synced to the testservers (see
). Changes can now be verified there.
12:36 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294)
12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:31 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:30 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-122907-fceratto.json
12:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
12:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-122756-fceratto.json
12:27 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:27 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:26 kart_: Updated cxserver to 2026-04-14-071531-production
12:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:25 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
12:25 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
12:25 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
12:23 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
12:22 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
12:21 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-121859-fceratto.json
12:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-121748-fceratto.json
12:11 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:11 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120851-fceratto.json
12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120739-fceratto.json
12:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120331-fceratto.json
12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120305-fceratto.json
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120138-fceratto.json
12:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-120117-fceratto.json
12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS trixie
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-115257-fceratto.json
11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-115109-fceratto.json
11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-114249-fceratto.json
11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-114101-fceratto.json
11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-113241-fceratto.json
11:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-113053-fceratto.json
11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-112937-fceratto.json
11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-112913-fceratto.json
11:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-112445-fceratto.json
11:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-112413-fceratto.json
11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-111905-fceratto.json
11:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-111405-fceratto.json
11:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-110856-fceratto.json
11:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
11:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-110357-fceratto.json
11:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-105848-fceratto.json
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-105349-fceratto.json
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-105338-fceratto.json
10:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-105314-fceratto.json
10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T419961)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-104535-fceratto.json
10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
10:44 taavi@dns1004: END - running authdns-update
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-104306-fceratto.json
10:42 taavi@dns1004: START - running authdns-update
10:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS trixie
10:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 365 days, 0:00:00 on dborch1001.wikimedia.org with reason: T416582
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-103258-fceratto.json
10:29 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2069
10:29 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2069.codfw.wmnet with OS bullseye
10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-102250-fceratto.json
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-101942-fceratto.json
10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-101917-fceratto.json
10:10 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
10:10 elukey: upgrade spicerack on cumin nodes
10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-100908-fceratto.json
10:08 elukey: uploaded spicerack_12.4.0 to apt.wikimedia.org bookworm-wikimedia
10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-095901-fceratto.json
09:58 jayme@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wikikube-worker2280.codfw.wmnet with reason: hardware issues
09:56 jayme@cumin2002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
09:53 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
09:53 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
09:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-094902-ladsgroup.json
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-094852-fceratto.json
09:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
09:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-094831-ladsgroup.json
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-094544-fceratto.json
09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-094519-fceratto.json
09:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-093823-ladsgroup.json
09:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-093511-fceratto.json
09:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-092815-ladsgroup.json
09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-092502-fceratto.json
09:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-091807-ladsgroup.json
09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-091454-fceratto.json
09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-090945-fceratto.json
09:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-090920-fceratto.json
08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-085912-fceratto.json
08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-084904-fceratto.json
08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-083857-fceratto.json
08:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-083547-fceratto.json
08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-083522-fceratto.json
08:34 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2069
08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-082514-fceratto.json
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-081506-fceratto.json
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-080458-fceratto.json
08:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-080150-fceratto.json
08:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
08:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-075959-fceratto.json
07:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-074951-fceratto.json
07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-073942-fceratto.json
07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-072935-fceratto.json
07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-072626-fceratto.json
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
07:23 Emperor: discard /srv/log/swift/server.log.5.gz on thanos-be2006 to free disk space
07:17 Emperor: discard /srv/log/swift/server.log.1 on thanos-be2006 to free disk space
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 14s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-015138-ladsgroup.json
01:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-015113-ladsgroup.json
01:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-014104-ladsgroup.json
01:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-013056-ladsgroup.json
01:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-012048-ladsgroup.json
01:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-010004-ladsgroup.json
00:59 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
00:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-005940-ladsgroup.json
00:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-004932-ladsgroup.json
00:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-003923-ladsgroup.json
00:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260415-002915-ladsgroup.json
00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) (duration: 06m 41s)
00:13 ladsgroup@deploy1003: ladsgroup: Continuing with sync
00:12 ladsgroup@deploy1003: ladsgroup: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) synced to the testservers (see
). Changes can now be verified there.
00:10 ladsgroup@deploy1003: Started scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637)
2026-04-14
23:11 Amir1: optimizing globalblocks table on s7 (T423349)
22:44 jasmine@dns1004: END - running authdns-update
22:43 jasmine@dns1004: START - running authdns-update
21:12 bvibber@deploy1003: Finished scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) (duration: 09m 48s)
21:08 bvibber@deploy1003: bvibber: Continuing with sync
21:04 bvibber@deploy1003: bvibber: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) synced to the testservers (see
). Changes can now be verified there.
21:02 bvibber@deploy1003: Started scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173)
20:57 catrope@deploy1003: Finished scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) (duration: 07m 28s)
20:53 catrope@deploy1003: catrope, robertsky: Continuing with sync
20:51 catrope@deploy1003: catrope, robertsky: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) synced to the testservers (see
). Changes can now be verified there.
20:49 catrope@deploy1003: Started scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118)
20:40 cscott@deploy1003: Finished scap sync-world: Backport for ParsoidLanguageConverter: convert inside (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) (duration: 10m 18s)
20:36 cscott@deploy1003: cscott: Continuing with sync
20:32 cscott@deploy1003: cscott: Backport for ParsoidLanguageConverter: convert inside (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) synced to the testservers (see
). Changes can now be verified there.
20:30 cscott@deploy1003: Started scap sync-world: Backport for ParsoidLanguageConverter: convert inside (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328)
20:16 mstyles@deploy1003: Finished scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) (duration: 09m 25s)
20:12 mstyles@deploy1003: mstyles: Continuing with sync
20:09 mstyles@deploy1003: mstyles: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) synced to the testservers (see
). Changes can now be verified there.
20:07 mstyles@deploy1003: Started scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007)
19:30 swfrench-wmf: applied external-services network policy updates for cassandra-analytics-query-service-storage-[ab]-eqiad (aqs1026) and dumps-wikimedia in wikikube clusters
19:27 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
19:27 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
19:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
19:22 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
19:21 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
19:20 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
19:19 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
19:16 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1026.eqiad.wmnet with reason: Bootstrapping — T412830
18:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-184440-fceratto.json
18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-183432-fceratto.json
18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-182424-fceratto.json
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-181416-fceratto.json
18:11 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
18:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
18:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for src: Fix typos (duration: 07m 13s)
17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-175927-fceratto.json
17:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2216.codfw.wmnet with reason: Maintenance
17:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-175902-fceratto.json
17:58 ladsgroup@deploy1003: ladsgroup: Backport for src: Fix typos synced to the testservers (see
). Changes can now be verified there.
17:56 ladsgroup@deploy1003: Started scap sync-world: Backport for src: Fix typos
17:56 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
17:51 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2068.codfw.wmnet with OS bullseye
17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-174854-fceratto.json
17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-173846-fceratto.json
17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-172838-fceratto.json
17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2203 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-171246-fceratto.json
17:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2203.codfw.wmnet with reason: Maintenance
17:07 taavi: updating caprica hostlists on cloud-hosts-in cr firewall policies
17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-170010-fceratto.json
16:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-165001-fceratto.json
16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2001.codfw.wmnet with reason: T421398
16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: T421398
16:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-163953-fceratto.json
16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
16:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-162945-fceratto.json
16:20 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info (duration: 08m 27s)
16:16 jforrester@deploy1003: jforrester: Continuing with sync
16:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
16:13 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info synced to the testservers (see ). Changes can now be verified there.
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-161351-fceratto.json
16:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-161326-fceratto.json
16:12 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info
16:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
16:08 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks (duration: 06m 32s)
16:04 jforrester@deploy1003: jforrester: Continuing with sync
16:03 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks synced to the testservers (see ). Changes can now be verified there.
16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-160319-fceratto.json
16:01 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks
15:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-155310-fceratto.json
15:52 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
15:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
15:45 cdanis@deploy1003: Finished scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client (duration: 08m 24s)
15:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-154302-fceratto.json
15:41 cdanis@deploy1003: cdanis: Continuing with sync
15:38 cdanis@deploy1003: cdanis: Backport for SwiftFileBackend: propagate tracing context to HTTP client synced to the testservers (see ). Changes can now be verified there.
15:37 cdanis@deploy1003: Started scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client
15:33 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
15:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
15:26 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
15:24 jasmine@dns1004: END - running authdns-update
15:24 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
15:23 jasmine@dns1004: START - running authdns-update
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-152156-fceratto.json
15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-152132-fceratto.json
15:18 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
15:18 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
15:17 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
15:15 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
15:13 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-151123-fceratto.json
15:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-150115-fceratto.json
14:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
14:56 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-145107-fceratto.json
14:50 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:49 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:44 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2214: after reimage to trixie
14:36 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
14:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-143301-fceratto.json
14:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-143235-fceratto.json
14:26 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:24 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:22 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-142227-fceratto.json
14:18 sukhe@dns1004: END - running authdns-update
14:17 sukhe@dns1004: START - running authdns-update
14:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
14:16 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
14:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance finished, T416450
14:16 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance finished, T416450
14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 13 hosts
14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 13 hosts
14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 12 hosts
14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 12 hosts
14:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts
14:13 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 8 hosts
14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-141219-fceratto.json
14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-140211-fceratto.json
13:57 XioNoX: asw1-by27-esams> request system reboot - T416450
13:56 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: router upgrade
13:55 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-by27-esams,asw1-by27-esams IPv6,asw1-by27-esams.mgmt with reason: router upgrade
13:55 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 T6055 (third attempt)
13:54 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 12 hosts with reason: router upgrade
13:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2214: after reimage to trixie
13:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2214.codfw.wmnet with OS trixie
13:47 Lucas_WMDE: UTC afternoon backport+config window done
13:45 stran@deploy1003: Finished scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) (duration: 07m 05s)
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-134416-fceratto.json
13:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-134350-fceratto.json
13:41 stran@deploy1003: stran: Continuing with sync
13:40 stran@deploy1003: stran: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) synced to the testservers (see ). Changes can now be verified there.
13:38 stran@deploy1003: Started scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216)
13:36 XioNoX: asw1-bw27-esams> request system reboot - T416450
13:35 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-bw27-esams,asw1-bw27-esams IPv6,asw1-bw27-esams.mgmt with reason: router upgrade
13:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-133342-fceratto.json
13:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: router upgrade
13:31 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
13:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-132334-fceratto.json
13:23 Amir1: on testcommonswiki drop table if exists categorylinks; drop table if exists externallinks; drop table if exists linktarget; drop table if exists collation; drop table if exists imagelinks; drop table if exists iwlinks; drop table if exists existencelinks; (T421914
13:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) (duration: 09m 27s)
13:16 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Continuing with sync
13:15 XioNoX: cr2-esams - request vmhost reboot - T416450
13:14 elukey: disable cert-renewal on wikikube staging clusters as a test for the PKI discovery intermediate rollout - To rollback, revert: T420993
13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-131326-fceratto.json
13:13 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
13:12 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
13:12 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
13:12 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
13:12 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) synced to the testservers (see ). Changes can now be verified there.
13:12 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2068
13:12 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
13:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252)
13:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr2-esams,cr2-esams IPv6,cr2-esams.mgmt with reason: router upgrade
13:06 jmm@dns1004: END - running authdns-update
13:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2214.codfw.wmnet with OS trixie
13:05 jmm@dns1004: START - running authdns-update
13:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2214: Reimage to Trixie
13:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2214: Reimage to Trixie
13:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2214.codfw.wmnet with reason: Reimage to Trixie
13:01 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
12:59 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-125642-fceratto.json
12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-125628-fceratto.json
12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-124620-fceratto.json
12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-123611-fceratto.json
12:35 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
12:35 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
12:34 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
12:33 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
12:33 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
12:32 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
12:28 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
12:28 root@cumin1003: START - Cookbook sre.mysql.parsercache
12:28 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
12:27 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-122603-fceratto.json
12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance continue, T416450
12:22 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance continue, T416450
12:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance paused, T416450
12:17 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance paused, T416450
12:14 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-120812-fceratto.json
12:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-120747-fceratto.json
12:03 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
12:02 root@cumin1003: START - Cookbook sre.mysql.parsercache
12:02 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
12:02 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
12:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-120200-ladsgroup.json
12:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
12:01 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
11:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-115752-ladsgroup.json
11:57 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-115739-fceratto.json
11:57 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
11:55 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
11:54 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-114732-fceratto.json
11:47 ayounsi@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
11:46 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance, T416450
11:46 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance, T416450
11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-113721-fceratto.json
11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-113510-fceratto.json
11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-113456-fceratto.json
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-112448-fceratto.json
11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: Security updates
11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
11:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
11:24 root@cumin1003: START - Cookbook sre.mysql.pool pool db1153: Security updates
11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-111440-fceratto.json
11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-110432-fceratto.json
11:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-105920-fceratto.json
10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Security updates
10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:56 root@cumin1003: START - Cookbook sre.mysql.parsercache
10:56 root@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Security updates
10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security update
10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:54 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
10:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security update
10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin1003.eqiad.wmnet
10:24 volans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1005.eqiad.wmnet with reason: Testing cumin v6.0.0
10:23 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:23 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:21 volans: install cumin v6.0.0 on cumin1003 (last host remained to upgrade)
10:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1003.eqiad.wmnet
10:16 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:14 fceratto@cumin2002: dbctl commit (dc=all): 'Pool in', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-101428-fceratto.json
10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Fully repool db1168', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-101119-marostegui.json
10:10 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:10 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1168: after reimage to trixie
10:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-100942-fceratto.json
10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1168: after reimage to trixie
10:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS trixie
10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:04 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:03 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-095934-fceratto.json
09:56 elukey: rotated debmonitor client and server certs fleetwide for intermediate certs rotation - T420993
09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260414-094926-fceratto.json
09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2144.codfw.wmnet with reason: T419961
09:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1151.eqiad.wmnet with reason: T419961
09:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:32 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-093204-fceratto.json
09:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
09:31 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:31 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-093138-fceratto.json
09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Test depool
09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:29 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:29 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Test depool
09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:27 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:27 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1006
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1006
09:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1006
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-092130-fceratto.json
09:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
09:17 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1006
09:12 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1006-1007,1014].eqiad.wmnet with reason: maintenance
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-091122-fceratto.json
09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-090112-fceratto.json
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2180: repool after maintenance
08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260414-084353-fceratto.json
08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
08:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
08:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
08:25 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2068
08:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
08:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS trixie
08:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1168: Reimage to Trixie
08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1168: Reimage to Trixie
08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1168.eqiad.wmnet with reason: Reimage to Trixie
08:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
08:04 moritzm: installing libnginx-mod-http-lua security updates
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2180: repool after maintenance
08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2180.codfw.wmnet with OS trixie
07:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: after upgrade
07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc1012.eqiad.wmnet with reason: T419961
07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc2012.codfw.wmnet with reason: T419961
07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2068
07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2068
07:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2068
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
07:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
07:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
07:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2068
07:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
07:22 mszwarc@deploy1003: Finished scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) (duration: 12m 36s)
07:16 mszwarc@deploy1003: mszwarc: Continuing with sync
07:15 mszwarc@deploy1003: mszwarc: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) synced to the testservers (see
). Changes can now be verified there.
07:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2180.codfw.wmnet with OS trixie
07:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2180: Reimage to Trixie
07:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2180: Reimage to Trixie
07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:10 mszwarc@deploy1003: Started scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118)
07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1180: after upgrade
06:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2217: repool after reimage to trixie
06:57 jmm@dns1004: END - running authdns-update
06:56 jmm@dns1004: START - running authdns-update
06:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
06:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
06:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
06:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
06:30 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS trixie
06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
06:27 jmm@dns1004: END - running authdns-update
06:25 jmm@dns1004: START - running authdns-update
06:22 jmm@dns1004: END - running authdns-update
06:20 jmm@dns1004: START - running authdns-update
06:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2217: repool after reimage to trixie
06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2217.codfw.wmnet with OS trixie
06:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
06:02 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
06:02 jmm@dns1004: END - running authdns-update
06:00 jmm@dns1004: START - running authdns-update
05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
05:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
05:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS trixie
05:46 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1180: Upgrade package
05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1180.eqiad.wmnet with reason: Reimage to Trixie
05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1180: Upgrade package
05:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2217.codfw.wmnet with OS trixie
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2217: Reimage
05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2217: Reimage
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2217.codfw.wmnet with reason: Reimage to Trixie
04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.21 (duration: 02m 34s)
03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482 (duration: 35m 44s)
03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482
00:57 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Work done
00:51 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1025.eqiad.wmnet
00:51 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1025.eqiad.wmnet
00:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: Work done
00:09 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Work done
00:08 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
00:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync
2026-04-13
23:54 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: sync
23:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: sync
23:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
23:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync
23:49 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db2208: Work done
23:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.*
22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
22:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
22:15 sbassett@deploy1003: Finished scap sync-world: Deployed security fix for T422085 (duration: 30m 14s)
22:08 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
22:08 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
22:04 brett@dns1006: END - running authdns-update
22:04 swfrench-wmf: applied pending external-services network policy diffs for aqs1025 in wikikube clusters
22:03 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
22:02 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
22:02 brett@dns1006: START - running authdns-update
21:56 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
21:55 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
21:55 brett@cumin2002: END (FAIL) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=1) Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
21:55 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
21:54 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
21:53 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
21:52 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
21:44 sbassett@deploy1003: Started scap sync-world: Deployed security fix for T422085
21:41 sbassett: Deployed security patch for T418533
21:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-211606-ladsgroup.json
21:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
21:08 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1025.eqiad.wmnet with reason: Bootstrapping — T412830
20:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
20:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-205531-fceratto.json
20:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-204523-fceratto.json
20:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-203514-fceratto.json
20:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
20:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-202506-fceratto.json
20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-202201-fceratto.json
20:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2195.codfw.wmnet with reason: Maintenance
20:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-202137-fceratto.json
20:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-201130-fceratto.json
20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-200122-fceratto.json
20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
19:56 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet
19:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-195113-fceratto.json
19:49 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1003.eqiad.wmnet
19:49 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet
19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-194759-fceratto.json
19:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
19:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-194734-fceratto.json
19:46 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
19:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
19:42 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1002.eqiad.wmnet
19:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet
19:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
19:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-193726-fceratto.json
19:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
19:35 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1001.eqiad.wmnet
19:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-192715-fceratto.json
19:25 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-191707-fceratto.json
19:14 swfrench-wmf: applied aqs cassandra host list changes from T423168
19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-191355-fceratto.json
19:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-191330-fceratto.json
19:12 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
19:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
19:11 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
19:09 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
19:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
19:08 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
19:08 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
19:07 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
19:07 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
19:06 zabe@deploy1003: Finished scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" (duration: 05m 51s)
19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-190322-fceratto.json
19:02 zabe@deploy1003: zabe: Continuing with sync
19:02 zabe@deploy1003: zabe: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" synced to the testservers (see
). Changes can now be verified there.
19:00 zabe@deploy1003: Started scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file"
18:55 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
18:54 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
18:53 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
18:53 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-185314-fceratto.json
18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
18:44 zabe@deploy1003: Sync cancelled.
18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-184305-fceratto.json
18:41 zabe@deploy1003: zabe: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946) synced to the testservers (see
). Changes can now be verified there.
18:40 zabe@deploy1003: Started scap sync-world: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946)
18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-183953-fceratto.json
18:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-183927-fceratto.json
18:37 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
18:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1018: Security updates
18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
18:30 root@cumin1003: START - Cookbook sre.mysql.parsercache
18:30 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1018: Security updates
18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-182919-fceratto.json
18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-181911-fceratto.json
18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-180902-fceratto.json
18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-180551-fceratto.json
18:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-180525-fceratto.json
18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1018: Security updates
18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
18:04 root@cumin1003: START - Cookbook sre.mysql.parsercache
18:04 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1018: Security updates
17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-175517-fceratto.json
17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
17:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-174509-fceratto.json
17:40 swfrench-wmf: applied latent external-services network policy changes for aqs{1023,1024} - T423168
17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-173501-fceratto.json
17:34 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1017: Security updates
17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
17:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
17:33 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1017: Security updates
17:33 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
17:33 Amir1: dropping templatelinks and pagelinks on testcommonswiki core db (T421914)
17:32 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-173148-fceratto.json
17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-173123-fceratto.json
17:31 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
17:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for
Revert^6 "Use envoy for swift inside mediawiki"
(duration: 07m 31s)
17:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:26 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:24 ladsgroup@deploy1003: ladsgroup: Backport for
Revert^6 "Use envoy for swift inside mediawiki"
synced to the testservers (see
). Changes can now be verified there.
17:23 ladsgroup@deploy1003: Started scap sync-world: Backport for
Revert^6 "Use envoy for swift inside mediawiki"
17:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-172115-fceratto.json
17:20 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:19 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:19 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
17:18 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
17:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-171107-fceratto.json
17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1017: Security updates
17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
17:06 root@cumin1003: START - Cookbook sre.mysql.parsercache
17:06 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1017: Security updates
17:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for
ExternalStore: Start reading and writing from clusters 32 and 33 (T421729)
(duration: 06m 43s)
17:03 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-170059-fceratto.json
16:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
16:58 ladsgroup@deploy1003: ladsgroup: Backport for
ExternalStore: Start reading and writing from clusters 32 and 33 (T421729)
synced to the testservers (see
). Changes can now be verified there.
16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-165747-fceratto.json
16:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-165721-fceratto.json
16:56 ladsgroup@deploy1003: Started scap sync-world: Backport for
ExternalStore: Start reading and writing from clusters 32 and 33 (T421729)
16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-164713-fceratto.json
16:46 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate (T423152, T420993)
16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
16:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: After Reimage
16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
16:44 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate
16:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
16:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-163706-fceratto.json
16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Security updates
16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
16:35 root@cumin1003: START - Cookbook sre.mysql.parsercache
16:35 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Security updates
16:35 Amir1: banning non-standard thumbs with external referrer regardless of cache status (T414805)
16:28 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1273.eqiad.wmnet
16:28 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1273.eqiad.wmnet
16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-162657-fceratto.json
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-162344-fceratto.json
16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-162318-fceratto.json
16:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-161310-fceratto.json
16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014: Security updates
16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
16:07 root@cumin1003: START - Cookbook sre.mysql.parsercache
16:07 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1014: Security updates
16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-160301-fceratto.json
16:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
15:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1187: After Reimage
15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-155253-fceratto.json
15:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS trixie
15:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2224: After Reimage
15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-154937-fceratto.json
15:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
15:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: repool after maintenance
15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1013: Security updates
15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
15:37 root@cumin1003: START - Cookbook sre.mysql.parsercache
15:37 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1013: Security updates
15:36 moritzm: installing postgresql-15 security updates
15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-153107-ladsgroup.json
15:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
15:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-153042-ladsgroup.json
15:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
15:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
15:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-152034-ladsgroup.json
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:10 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS trixie
15:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-151027-ladsgroup.json
15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013: Security updates
15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
15:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
15:09 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1013: Security updates
15:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1187: Upgrade package
15:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1187.eqiad.wmnet with reason: Reimage to Trixie
15:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1187: Upgrade package
15:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
15:04 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2224: After Reimage
15:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
15:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2224.codfw.wmnet with OS trixie
15:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-150116-fceratto.json
15:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1166: repool after maintenance
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-150019-ladsgroup.json
14:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2069
14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2069
14:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-145028-fceratto.json
14:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2069
14:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:48 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
14:48 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
14:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
14:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2069
14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
14:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-143939-fceratto.json
14:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
14:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2070.codfw.wmnet with OS bullseye
14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-142851-fceratto.json
14:22 Lucas_WMDE: UTC afternoon backport+config window done
14:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
Record TOR account creation failure separately (T422283)
stats: add counters for experiment account creation (T422283)
GrowthSuggestionToneCheck: flag as non-experimental (T422835)
(duration: 10m 22s)
14:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:19 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
14:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2224.codfw.wmnet with OS trixie
14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
14:18 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Continuing with sync
14:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2224: Reimage
14:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2224: Reimage
14:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2224.codfw.wmnet with reason: Reimage to Trixie
14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: Security updates
14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
14:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
14:14 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: Security updates
14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: T419961
14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-141414-fceratto.json
14:14 inflatador: bking@apt1002 sudo -E reprepro --ignore=wrongdistribution -C component/opensearch2 include trixie-wikimedia ~/opensearch-madvise-0.2/opensearch-madvise_0.2_amd64.changes T422860
14:13 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:13 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: T419961
14:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: T419961
14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:13 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Backport for
Record TOR account creation failure separately (T422283)
stats: add counters for experiment account creation (T422283)
GrowthSuggestionToneCheck: flag as non-experimental (T422835)
synced to the testservers (see
). Changes can now be verified there.
14:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-141306-fceratto.json
14:12 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:12 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: T419961
14:11 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
Record TOR account creation failure separately (T422283)
stats: add counters for experiment account creation (T422283)
GrowthSuggestionToneCheck: flag as non-experimental (T422835)
14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
14:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-140218-fceratto.json
14:01 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes urwikisource --fix # T422824
14:00 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
EventStreamConfig: remove unused contextual attributes causing problems (T422001)
[abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s
urwikisource: add مصنف (author) namespace (T422824)
(duration: 08m 30s)
13:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
13:56 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Continuing with sync
13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:53 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Backport for
EventStreamConfig: remove unused contextual attributes causing problems (T422001)
[abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s
urwikisource: add مصنف (author) namespace (T422824)
synced to the testservers (see
). Changes can now be verified there.
13:53 moritzm: installing postgresql-common bugfix updates
13:52 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
13:52 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
EventStreamConfig: remove unused contextual attributes causing problems (T422001)
[abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s
urwikisource: add مصنف (author) namespace (T422824)
13:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-135129-fceratto.json
13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2070
13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2070
13:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2070
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
13:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
13:49 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
Re-add p-personal id to the user menu (T422885)
(duration: 10m 41s)
13:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
13:44 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2070
13:43 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2070.codfw.wmnet with OS bullseye
13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Continuing with sync
13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Backport for
Re-add p-personal id to the user menu (T422885)
synced to the testservers (see
). Changes can now be verified there.
13:41 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1006.eqiad.wmnet with OS trixie
13:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
13:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-134041-fceratto.json
13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
Re-add p-personal id to the user menu (T422885)
13:37 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for
Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833)
(duration: 34m 09s)
13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-redacteddb1001
13:36 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-redacteddb1001
13:35 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host an-redacteddb1001
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:35 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
13:35 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
13:26 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-132604-fceratto.json
13:25 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
13:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-132457-fceratto.json
13:24 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
13:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
13:24 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
13:24 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Continuing with sync
13:24 btullis@cumin1003: START - Cookbook sre.dns.netbox
13:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
13:21 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Backport for
Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833)
synced to the testservers (see
). Changes can now be verified there.
13:20 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-redacteddb1001
13:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
13:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
13:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-131408-fceratto.json
13:13 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
13:03 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for
Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833)
13:03 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-130320-fceratto.json
13:01 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
13:00 moritzm: installing libnginx-mod-http-lua security updates
12:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
12:52 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-125231-fceratto.json
12:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
12:38 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
12:38 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2218 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-123801-fceratto.json
12:37 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
12:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-123653-fceratto.json
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
12:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-122604-fceratto.json
12:21 jmm@dns1004: END - running authdns-update
12:20 jmm@dns1004: START - running authdns-update
12:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-121516-fceratto.json
12:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-120428-fceratto.json
12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1003.eqiad.wmnet
11:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1003.eqiad.wmnet
11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2004.codfw.wmnet
11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2208 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-114953-fceratto.json
11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
11:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2004.codfw.wmnet
11:38 jmm@dns1004: END - running authdns-update
11:36 jmm@dns1004: START - running authdns-update
11:36 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
11:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-113630-fceratto.json
11:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-112541-fceratto.json
11:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-111452-fceratto.json
11:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-110405-fceratto.json
10:48 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2182 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-104852-fceratto.json
10:48 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
10:47 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-104756-fceratto.json
10:38 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:38 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:37 vgutierrez@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (
T422328
10:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-103707-fceratto.json
10:34 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:33 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:26 vgutierrez@cumin1003: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (
T422328
10:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-102619-fceratto.json
10:19 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
10:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:15 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
10:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-101530-fceratto.json
10:15 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
10:14 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:09 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:09 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:07 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
10:06 blake@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
10:05 blake@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
10:05 blake@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
10:00 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2168 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-100003-fceratto.json
09:59 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
09:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-095906-fceratto.json
09:49 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
09:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-094818-fceratto.json
09:47 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
09:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-093729-fceratto.json
09:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-092640-fceratto.json
09:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:19 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:19 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:17 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:17 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:15 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:15 root@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:15 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:15 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:11 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2159 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-091122-fceratto.json
09:10 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
09:10 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-091027-fceratto.json
08:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-085938-fceratto.json
08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
08:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-084850-fceratto.json
08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
08:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-083801-fceratto.json
08:22 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2150 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-082233-fceratto.json
08:21 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
08:10 taavi@dns1004: END - running authdns-update
08:09 taavi@dns1004: START - running authdns-update
08:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
07:40 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:35 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
07:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
07:09 moritzm: installing openssh security updates
05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-055130-ladsgroup.json
05:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-055106-ladsgroup.json
05:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-054100-ladsgroup.json
05:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-053050-ladsgroup.json
05:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260413-052042-ladsgroup.json
03:34 TimStarling: on gerrit2003 restarted gerrit
T423027
2026-04-12
21:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
21:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-212043-ladsgroup.json
21:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-211036-ladsgroup.json
21:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-210028-ladsgroup.json
20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-205525-ladsgroup.json
20:55 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-205500-ladsgroup.json
20:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-205020-ladsgroup.json
20:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-204451-ladsgroup.json
20:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-203443-ladsgroup.json
20:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-202435-ladsgroup.json
14:32 cgoubert@dns2004: START - running authdns-update
11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-115148-ladsgroup.json
11:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-115124-ladsgroup.json
11:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-114116-ladsgroup.json
11:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-113108-ladsgroup.json
11:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-112100-ladsgroup.json
07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-070649-ladsgroup.json
07:06 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-070624-ladsgroup.json
06:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-065616-ladsgroup.json
06:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-064608-ladsgroup.json
06:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-063600-ladsgroup.json
02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260412-024415-ladsgroup.json
02:44 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 19s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-11
22:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16735
22:38 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
22:38 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 16735
22:37 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
18:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-185048-fceratto.json
18:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-184000-fceratto.json
18:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-182912-fceratto.json
18:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-181823-fceratto.json
17:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-172321-ladsgroup.json
17:23 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
17:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-172257-ladsgroup.json
17:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-171248-ladsgroup.json
17:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-170240-ladsgroup.json
17:02 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2248 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-170233-fceratto.json
17:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2248.codfw.wmnet with reason: Maintenance
17:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-170138-fceratto.json
16:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-165232-ladsgroup.json
16:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-165049-fceratto.json
16:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-164000-fceratto.json
16:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-162912-fceratto.json
14:40 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2247 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-144002-fceratto.json
14:39 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2247.codfw.wmnet with reason: Maintenance
14:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-143854-fceratto.json
14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-142805-fceratto.json
14:17 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-141717-fceratto.json
14:06 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-140628-fceratto.json
12:43 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-124244-ladsgroup.json
12:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-123235-ladsgroup.json
12:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-122226-ladsgroup.json
12:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2246 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-121410-fceratto.json
12:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2246.codfw.wmnet with reason: Maintenance
12:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-121302-fceratto.json
12:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-121218-ladsgroup.json
12:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-120214-fceratto.json
11:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-115126-fceratto.json
11:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-114037-fceratto.json
09:52 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2245 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-095220-fceratto.json
09:51 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2245.codfw.wmnet with reason: Maintenance
09:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-095113-fceratto.json
09:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-094024-fceratto.json
09:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-092936-fceratto.json
09:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-091847-fceratto.json
07:36 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2240 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-073627-fceratto.json
07:35 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2240.codfw.wmnet with reason: Maintenance
06:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
06:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-060126-fceratto.json
05:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-055038-fceratto.json
05:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-053950-fceratto.json
05:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-052901-fceratto.json
03:45 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2237 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-034549-fceratto.json
03:45 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2237.codfw.wmnet with reason: Maintenance
03:44 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-034441-fceratto.json
03:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-033701-ladsgroup.json
03:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
03:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-033636-ladsgroup.json
03:33 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-033352-fceratto.json
03:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-032628-ladsgroup.json
03:23 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-032304-fceratto.json
03:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-031620-ladsgroup.json
03:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-031216-fceratto.json
03:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-030611-ladsgroup.json
01:31 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2236 (
T419635
)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-013151-fceratto.json
01:31 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2236.codfw.wmnet with reason: Maintenance
01:30 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-013040-fceratto.json
01:19 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-011948-fceratto.json
01:09 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260411-010859-fceratto.json
00:58 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260411-005811-fceratto.json
00:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:03 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:01 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-10
23:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1006.eqiad.wmnet with OS bookworm
23:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
23:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
23:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1005.eqiad.wmnet with OS bookworm
23:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:13 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2219 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-231337-fceratto.json
23:12 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2219.codfw.wmnet with reason: Maintenance
23:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-231231-fceratto.json
23:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1006.eqiad.wmnet with OS bookworm
23:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-230143-fceratto.json
22:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
22:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
22:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-225055-fceratto.json
22:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-224008-fceratto.json
22:33 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
22:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
22:30 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:28 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1006.eqiad.wmnet with OS trixie
22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-222445-ladsgroup.json
22:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-222421-ladsgroup.json
22:17 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
22:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-221414-ladsgroup.json
22:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-220406-ladsgroup.json
22:02 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:58 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:57 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-215358-ladsgroup.json
21:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
20:59 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
20:54 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2210 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-205420-fceratto.json
20:53 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2210.codfw.wmnet with reason: Maintenance
20:53 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-205324-fceratto.json
20:42 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-204236-fceratto.json
20:31 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-203147-fceratto.json
20:21 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-202059-fceratto.json
20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:58 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:57 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:52 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:48 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
18:34 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2206 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-183455-fceratto.json
18:34 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2206.codfw.wmnet with reason: Maintenance
18:27 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided) (duration: 00m 56s)
18:26 dancy@deploy1003: Started deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided)
17:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:41 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:28 dancy@deploy1003: Installation of scap version "4.248.0" completed for 2 hosts
17:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
17:26 dancy@deploy1003: Installing scap version "4.248.0" for 2 host(s)
17:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:00 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2199.codfw.wmnet with reason: Maintenance
16:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-165951-fceratto.json
16:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
16:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-164902-fceratto.json
16:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1001.eqiad.wmnet with reason:
T421398
16:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-163814-fceratto.json
16:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-162726-fceratto.json
16:05 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox
15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
15:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
15:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
15:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
15:17 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
15:16 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
14:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:30 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:23 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2172 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-142308-fceratto.json
14:22 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
14:22 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-142200-fceratto.json
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-141112-fceratto.json
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:00 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-140023-fceratto.json
13:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-134935-fceratto.json
13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-132215-ladsgroup.json
13:22 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
13:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (
T410589
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-132119-ladsgroup.json
13:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:08 cmooney@dns2005: END - running authdns-update
13:07 cmooney@dns2005: START - running authdns-update
13:06 cmooney@dns2005: START - running authdns-update
13:05 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox
13:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
13:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
12:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:52 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
12:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:47 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:50 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2155 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-115015-fceratto.json
11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-114919-fceratto.json
11:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-113830-fceratto.json
11:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-112742-fceratto.json
11:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:16 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-111654-fceratto.json
11:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release -
T422668
11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:19 vgutierrez: upload haproxy 2.8.20 to thirdparty/haproxy28 for bookworm-wikimedia (apt.wm.o) -
T422926
10:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:07 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:03 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release -
T422668
09:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:35 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: sync
09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: sync
09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/jaeger: sync
09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/jaeger: sync
09:22 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
09:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:17 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2147 (
T419635
)', diff saved to
and previous config saved to /var/cache/conftool/dbconfig/20260410-091713-fceratto.json
09:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:16 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
09:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:30 jelto@dns1004: END - running authdns-update
07:29 jelto@dns1004: START - running authdns-update
07:09 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
06:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
05:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
01:26 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
01:25 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
01:23 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
01:23 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
00:57 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
00:57 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
00:54 zabe@deploy1003: Finished scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914) (duration: 05m 51s)
00:50 zabe@deploy1003: zabe: Continuing with sync
00:50 zabe@deploy1003: zabe: Backport for Stop setting specific virtual domain for link tables (T421914) synced to the testservers (see ). Changes can now be verified there.
00:48 zabe@deploy1003: Started scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914)
00:46 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file tables on enwiki (T416548) (duration: 06m 11s)
00:43 zabe@deploy1003: zabe: Continuing with sync
00:42 zabe@deploy1003: zabe: Backport for Start reading from new file tables on enwiki (T416548) synced to the testservers (see ). Changes can now be verified there.
00:40 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables on enwiki (T416548)
00:29 zabe: marked 425 content rows as bad # T393237
00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2005.codfw.wmnet with OS bookworm
00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:08 zabe@deploy1003: Finished scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) (duration: 07m 17s)
00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2006.codfw.wmnet with OS bookworm
00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:04 zabe@deploy1003: zabe: Continuing with sync
00:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
00:02 zabe@deploy1003: zabe: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) synced to the testservers (see ). Changes can now be verified there.
00:00 zabe@deploy1003: Started scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914)
2026-04-09
23:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
23:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
23:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2006.codfw.wmnet with OS bookworm
23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2005.codfw.wmnet with OS bookworm
23:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['apus-be2005']
23:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['apus-be2005']
23:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:25 cscott@deploy1003: Finished scap sync-world: Backport for
ParsoidLanguageConverter: Don't convert inside