2026-04-25
- 01:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P91522 and previous config saved to /var/cache/conftool/dbconfig/20260425-015535-ladsgroup.json
- 01:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P91521 and previous config saved to /var/cache/conftool/dbconfig/20260425-014528-ladsgroup.json
- 01:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T410589)', diff saved to https://phabricator.wikimedia.org/P91520 and previous config saved to /var/cache/conftool/dbconfig/20260425-013520-ladsgroup.json
2026-04-24
- 20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1005.eqiad.wmnet with OS trixie
- 20:28 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1002.eqiad.wmnet with OS trixie
- 20:23 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1004.eqiad.wmnet with OS trixie
- 20:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:12 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1006.eqiad.wmnet with OS trixie
- 20:12 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:09 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1005.eqiad.wmnet with reason: host reimage
- 20:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1001.eqiad.wmnet with OS trixie
- 20:07 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1002.eqiad.wmnet with reason: host reimage
- 20:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1003.eqiad.wmnet with OS trixie
- 20:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1005.eqiad.wmnet with reason: host reimage
- 20:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 19:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1004.eqiad.wmnet with reason: host reimage
- 19:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1006.eqiad.wmnet with reason: host reimage
- 19:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1005.eqiad.wmnet with OS trixie
- 19:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1001.eqiad.wmnet with reason: host reimage
- 19:49 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1006.eqiad.wmnet with reason: host reimage
- 19:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1003.eqiad.wmnet with reason: host reimage
- 19:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:40 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:40 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1002.eqiad.wmnet with reason: host reimage
- 19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1004.eqiad.wmnet with reason: host reimage
- 19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1001.eqiad.wmnet with reason: host reimage
- 19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1003.eqiad.wmnet with reason: host reimage
- 19:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1006.eqiad.wmnet with OS trixie
- 19:37 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:36 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1004.eqiad.wmnet with OS trixie
- 19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1003.eqiad.wmnet with OS trixie
- 19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1002.eqiad.wmnet with OS trixie
- 19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1001.eqiad.wmnet with OS trixie
- 19:25 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:24 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:20 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:18 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 19:17 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 19:16 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:16 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:16 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding druid-internal1001 to eqiad - jclark@cumin1003"
- 19:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding druid-internal1001 to eqiad - jclark@cumin1003"
- 19:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1006.eqiad.wmnet with OS trixie
- 18:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1005.eqiad.wmnet with OS trixie
- 18:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:45 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 18:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1006.eqiad.wmnet with reason: host reimage
- 18:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1005.eqiad.wmnet with reason: host reimage
- 18:23 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1006.eqiad.wmnet with reason: host reimage
- 18:23 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1005.eqiad.wmnet with reason: host reimage
- 18:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 18:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91517 and previous config saved to /var/cache/conftool/dbconfig/20260424-181705-fceratto.json
- 18:11 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1006.eqiad.wmnet with OS trixie
- 18:11 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1005.eqiad.wmnet with OS trixie
- 18:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 18:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 18:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P91516 and previous config saved to /var/cache/conftool/dbconfig/20260424-180657-fceratto.json
- 18:01 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 18:01 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 18:01 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:01 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
- 18:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
- 17:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P91515 and previous config saved to /var/cache/conftool/dbconfig/20260424-175649-fceratto.json
- 17:56 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 17:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91513 and previous config saved to /var/cache/conftool/dbconfig/20260424-174641-fceratto.json
- 17:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91512 and previous config saved to /var/cache/conftool/dbconfig/20260424-172952-fceratto.json
- 17:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1251.eqiad.wmnet with reason: Maintenance
- 17:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 17:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91511 and previous config saved to /var/cache/conftool/dbconfig/20260424-170225-fceratto.json
- 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P91510 and previous config saved to /var/cache/conftool/dbconfig/20260424-165217-fceratto.json
- 16:51 dancy@deploy1003: Installation of scap version "4.251.0" completed for 2 hosts
- 16:49 dancy@deploy1003: Installing scap version "4.251.0" for 2 host(s)
- 16:44 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:44 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P91509 and previous config saved to /var/cache/conftool/dbconfig/20260424-164209-fceratto.json
- 16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91508 and previous config saved to /var/cache/conftool/dbconfig/20260424-163200-fceratto.json
- 16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91507 and previous config saved to /var/cache/conftool/dbconfig/20260424-161607-fceratto.json
- 16:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
- 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91506 and previous config saved to /var/cache/conftool/dbconfig/20260424-161541-fceratto.json
- 16:14 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:14 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
- 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P91505 and previous config saved to /var/cache/conftool/dbconfig/20260424-160531-fceratto.json
- 16:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 16:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P91504 and previous config saved to /var/cache/conftool/dbconfig/20260424-155523-fceratto.json
- 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91503 and previous config saved to /var/cache/conftool/dbconfig/20260424-154515-fceratto.json
- 15:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T410589)', diff saved to https://phabricator.wikimedia.org/P91502 and previous config saved to /var/cache/conftool/dbconfig/20260424-153827-ladsgroup.json
- 15:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 15:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91501 and previous config saved to /var/cache/conftool/dbconfig/20260424-153802-ladsgroup.json
- 15:35 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1010.eqiad.wmnet with OS trixie
- 15:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS trixie
- 15:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91500 and previous config saved to /var/cache/conftool/dbconfig/20260424-153020-fceratto.json
- 15:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
- 15:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91499 and previous config saved to /var/cache/conftool/dbconfig/20260424-153005-fceratto.json
- 15:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P91498 and previous config saved to /var/cache/conftool/dbconfig/20260424-152755-ladsgroup.json
- 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P91497 and previous config saved to /var/cache/conftool/dbconfig/20260424-151957-fceratto.json
- 15:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P91496 and previous config saved to /var/cache/conftool/dbconfig/20260424-151746-ladsgroup.json
- 15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P91495 and previous config saved to /var/cache/conftool/dbconfig/20260424-150949-fceratto.json
- 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91494 and previous config saved to /var/cache/conftool/dbconfig/20260424-150738-ladsgroup.json
- 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91493 and previous config saved to /var/cache/conftool/dbconfig/20260424-145940-fceratto.json
- 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91492 and previous config saved to /var/cache/conftool/dbconfig/20260424-144405-fceratto.json
- 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
- 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91491 and previous config saved to /var/cache/conftool/dbconfig/20260424-144340-fceratto.json
- 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P91490 and previous config saved to /var/cache/conftool/dbconfig/20260424-143332-fceratto.json
- 14:29 moritzm: imported debdeploy 0.0.99.15 for bullseye-wikimedia (compat release for Cumin 6)
- 14:29 moritzm: updating debdeploy on bullseye to 0.0.99.15
- 14:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P91489 and previous config saved to /var/cache/conftool/dbconfig/20260424-142323-fceratto.json
- 14:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 14:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91488 and previous config saved to /var/cache/conftool/dbconfig/20260424-141315-fceratto.json
- 14:13 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2148.codfw.wmnet
- 14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2148.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:09 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2148.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: after reimage to trixie
- 14:05 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 14:05 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
- 14:05 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 14:01 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2148.codfw.wmnet
- 14:00 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
- 14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1185: after reimage to trixie
- 14:00 moritzm: imported zookeeper 3.4.13-6+deb11u1~wmf13u1 into component/zookeeper34 for trixie-wikimedia (forward port of Zookeeper 3.4 from Bullseye to Trixie) T424266
- 13:59 moritzm: imported zookeeper 3.4.13-6+deb11u1~wmf13u1 into component/zookeeper34 for trixie-wikimedia (forward port of Zookeeper 3.4 from Bullseye to Trixie)
- 13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91485 and previous config saved to /var/cache/conftool/dbconfig/20260424-135555-fceratto.json
- 13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
- 13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91484 and previous config saved to /var/cache/conftool/dbconfig/20260424-135529-fceratto.json
- 13:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
- 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P91482 and previous config saved to /var/cache/conftool/dbconfig/20260424-134522-fceratto.json
- 13:41 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
- 13:40 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
- 13:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P91479 and previous config saved to /var/cache/conftool/dbconfig/20260424-133513-fceratto.json
- 13:33 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
- 13:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 13:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 13:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91476 and previous config saved to /var/cache/conftool/dbconfig/20260424-132505-fceratto.json
- 13:21 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2223: after reimage to trixie
- 13:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2223.codfw.wmnet with OS trixie
- 13:18 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 13:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1185: after reimage to trixie
- 13:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1185.eqiad.wmnet with OS trixie
- 13:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91473 and previous config saved to /var/cache/conftool/dbconfig/20260424-130840-fceratto.json
- 13:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
- 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91472 and previous config saved to /var/cache/conftool/dbconfig/20260424-130815-fceratto.json
- 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P91471 and previous config saved to /var/cache/conftool/dbconfig/20260424-125807-fceratto.json
- 12:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2223.codfw.wmnet with reason: host reimage
- 12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
- 12:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P91470 and previous config saved to /var/cache/conftool/dbconfig/20260424-124759-fceratto.json
- 12:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2223.codfw.wmnet with reason: host reimage
- 12:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
- 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91468 and previous config saved to /var/cache/conftool/dbconfig/20260424-123751-fceratto.json
- 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T419961)', diff saved to https://phabricator.wikimedia.org/P91467 and previous config saved to /var/cache/conftool/dbconfig/20260424-122939-fceratto.json
- 12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91466 and previous config saved to /var/cache/conftool/dbconfig/20260424-122910-fceratto.json
- 12:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1185.eqiad.wmnet with OS trixie
- 12:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2223.codfw.wmnet with OS trixie
- 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: Reimage to Trixie
- 12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1185: Reimage to Trixie
- 12:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2223: Reimage to Trixie
- 12:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1185: Reimage to Trixie
- 12:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Reimage to Trixie
- 12:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Reimage to Trixie
- 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91463 and previous config saved to /var/cache/conftool/dbconfig/20260424-122125-fceratto.json
- 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
- 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91462 and previous config saved to /var/cache/conftool/dbconfig/20260424-122100-fceratto.json
- 12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P91461 and previous config saved to /var/cache/conftool/dbconfig/20260424-121902-fceratto.json
- 12:17 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 12:17 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 12:17 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:17 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 12:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P91460 and previous config saved to /var/cache/conftool/dbconfig/20260424-121053-fceratto.json
- 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P91458 and previous config saved to /var/cache/conftool/dbconfig/20260424-120854-fceratto.json
- 12:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P91456 and previous config saved to /var/cache/conftool/dbconfig/20260424-120045-fceratto.json
- 11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91455 and previous config saved to /var/cache/conftool/dbconfig/20260424-115845-fceratto.json
- 11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2228: after reimage to trixie
- 11:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1159: after reimage to trixie
- 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91452 and previous config saved to /var/cache/conftool/dbconfig/20260424-115036-fceratto.json
- 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91451 and previous config saved to /var/cache/conftool/dbconfig/20260424-115025-fceratto.json
- 11:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91450 and previous config saved to /var/cache/conftool/dbconfig/20260424-114956-fceratto.json
- 11:44 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P91447 and previous config saved to /var/cache/conftool/dbconfig/20260424-113948-fceratto.json
- 11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91445 and previous config saved to /var/cache/conftool/dbconfig/20260424-113235-fceratto.json
- 11:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
- 11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91444 and previous config saved to /var/cache/conftool/dbconfig/20260424-113149-fceratto.json
- 11:31 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P91443 and previous config saved to /var/cache/conftool/dbconfig/20260424-112939-fceratto.json
- 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P91440 and previous config saved to /var/cache/conftool/dbconfig/20260424-112141-fceratto.json
- 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91439 and previous config saved to /var/cache/conftool/dbconfig/20260424-111931-fceratto.json
- 11:16 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2228: after reimage to trixie
- 11:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P91436 and previous config saved to /var/cache/conftool/dbconfig/20260424-111132-fceratto.json
- 11:11 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T424175
- 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91434 and previous config saved to /var/cache/conftool/dbconfig/20260424-111108-fceratto.json
- 11:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2228.codfw.wmnet with OS trixie
- 11:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 11:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91433 and previous config saved to /var/cache/conftool/dbconfig/20260424-111039-fceratto.json
- 11:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1159: after reimage to trixie
- 11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1159.eqiad.wmnet with OS trixie
- 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91431 and previous config saved to /var/cache/conftool/dbconfig/20260424-110125-fceratto.json
- 11:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P91430 and previous config saved to /var/cache/conftool/dbconfig/20260424-110031-fceratto.json
- 10:59 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 10:56 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P91429 and previous config saved to /var/cache/conftool/dbconfig/20260424-105023-fceratto.json
- 10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2228.codfw.wmnet with reason: host reimage
- 10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91428 and previous config saved to /var/cache/conftool/dbconfig/20260424-104235-fceratto.json
- 10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1195.eqiad.wmnet with reason: Maintenance
- 10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91427 and previous config saved to /var/cache/conftool/dbconfig/20260424-104210-fceratto.json
- 10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
- 10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91426 and previous config saved to /var/cache/conftool/dbconfig/20260424-104016-fceratto.json
- 10:38 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2228.codfw.wmnet with reason: host reimage
- 10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
- 10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P91424 and previous config saved to /var/cache/conftool/dbconfig/20260424-103202-fceratto.json
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91423 and previous config saved to /var/cache/conftool/dbconfig/20260424-103146-fceratto.json
- 10:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91422 and previous config saved to /var/cache/conftool/dbconfig/20260424-103116-fceratto.json
- 10:30 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2228.codfw.wmnet with OS trixie
- 10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1159.eqiad.wmnet with OS trixie
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P91421 and previous config saved to /var/cache/conftool/dbconfig/20260424-102154-fceratto.json
- 10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2228: Reimage to Trixie
- 10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1159: Reimage to Trixie
- 10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2228: Reimage to Trixie
- 10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Reimage to Trixie
- 10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1159: Reimage to Trixie
- 10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Reimage to Trixie
- 10:21 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P91418 and previous config saved to /var/cache/conftool/dbconfig/20260424-102108-fceratto.json
- 10:17 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 10:15 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 10:12 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91417 and previous config saved to /var/cache/conftool/dbconfig/20260424-101146-fceratto.json
- 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P91416 and previous config saved to /var/cache/conftool/dbconfig/20260424-101056-fceratto.json
- 10:02 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T424175
- 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91415 and previous config saved to /var/cache/conftool/dbconfig/20260424-100047-fceratto.json
- 09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1015.eqiad.wmnet on all recursors
- 09:56 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1015.eqiad.wmnet on all recursors
- 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
- 09:56 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
- 09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91414 and previous config saved to /var/cache/conftool/dbconfig/20260424-095450-fceratto.json
- 09:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
- 09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91413 and previous config saved to /var/cache/conftool/dbconfig/20260424-095425-fceratto.json
- 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91412 and previous config saved to /var/cache/conftool/dbconfig/20260424-095228-fceratto.json
- 09:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 09:52 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91411 and previous config saved to /var/cache/conftool/dbconfig/20260424-095159-fceratto.json
- 09:50 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P91410 and previous config saved to /var/cache/conftool/dbconfig/20260424-094417-fceratto.json
- 09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P91409 and previous config saved to /var/cache/conftool/dbconfig/20260424-094151-fceratto.json
- 09:40 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P91408 and previous config saved to /var/cache/conftool/dbconfig/20260424-093409-fceratto.json
- 09:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ml-serve1014.eqiad.wmnet
- 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P91407 and previous config saved to /var/cache/conftool/dbconfig/20260424-093143-fceratto.json
- 09:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1014.eqiad.wmnet on all recursors
- 09:28 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1014.eqiad.wmnet on all recursors
- 09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
- 09:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
- 09:24 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91406 and previous config saved to /var/cache/conftool/dbconfig/20260424-092401-fceratto.json
- 09:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91405 and previous config saved to /var/cache/conftool/dbconfig/20260424-092135-fceratto.json
- 09:21 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
- 09:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 09:16 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91404 and previous config saved to /var/cache/conftool/dbconfig/20260424-091316-fceratto.json
- 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 09:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91403 and previous config saved to /var/cache/conftool/dbconfig/20260424-091237-fceratto.json
- 09:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91402 and previous config saved to /var/cache/conftool/dbconfig/20260424-090454-fceratto.json
- 09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
- 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91401 and previous config saved to /var/cache/conftool/dbconfig/20260424-090429-fceratto.json
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P91400 and previous config saved to /var/cache/conftool/dbconfig/20260424-090229-fceratto.json
- 09:01 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
- 08:56 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
- 08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P91399 and previous config saved to /var/cache/conftool/dbconfig/20260424-085421-fceratto.json
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P91398 and previous config saved to /var/cache/conftool/dbconfig/20260424-085221-fceratto.json
- 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P91397 and previous config saved to /var/cache/conftool/dbconfig/20260424-084414-fceratto.json
- 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91396 and previous config saved to /var/cache/conftool/dbconfig/20260424-084213-fceratto.json
- 08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91395 and previous config saved to /var/cache/conftool/dbconfig/20260424-083406-fceratto.json
- 08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91394 and previous config saved to /var/cache/conftool/dbconfig/20260424-083118-fceratto.json
- 08:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91393 and previous config saved to /var/cache/conftool/dbconfig/20260424-083050-fceratto.json
- 08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
- 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:29 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:27 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
- 08:24 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
- 08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P91392 and previous config saved to /var/cache/conftool/dbconfig/20260424-082041-fceratto.json
- 08:19 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:19 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
- 08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91391 and previous config saved to /var/cache/conftool/dbconfig/20260424-081539-fceratto.json
- 08:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
- 08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
- 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
- 08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P91390 and previous config saved to /var/cache/conftool/dbconfig/20260424-081033-fceratto.json
- 08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 08:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 08:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:08 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
- 08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow5002.eqsin.wmnet
- 08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
- 08:04 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 08:03 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
- 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91388 and previous config saved to /var/cache/conftool/dbconfig/20260424-080025-fceratto.json
- 08:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 07:54 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
- 07:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91386 and previous config saved to /var/cache/conftool/dbconfig/20260424-075145-fceratto.json
- 07:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 07:50 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
- 07:45 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts netflow5002.eqsin.wmnet
- 07:45 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
- 06:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db[2142-2143].codfw.wmnet with reason: Cloning
- 05:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 264595
- 05:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 264595
- 05:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 58717
- 05:48 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 58717
- 05:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 20940
- 05:41 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 20940
- 05:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
- 05:40 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 19165
- 05:33 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2148 from dbctl T424309', diff saved to https://phabricator.wikimedia.org/P91385 and previous config saved to /var/cache/conftool/dbconfig/20260424-053342-marostegui.json
- 03:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91384 and previous config saved to /var/cache/conftool/dbconfig/20260424-033021-ladsgroup.json
- 03:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 03:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91383 and previous config saved to /var/cache/conftool/dbconfig/20260424-032955-ladsgroup.json
- 03:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P91382 and previous config saved to /var/cache/conftool/dbconfig/20260424-031947-ladsgroup.json
- 03:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P91381 and previous config saved to /var/cache/conftool/dbconfig/20260424-030938-ladsgroup.json
- 02:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91380 and previous config saved to /var/cache/conftool/dbconfig/20260424-025930-ladsgroup.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 32s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2013.codfw.wmnet with OS trixie
2026-04-23
- 23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
- 23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
- 23:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 23:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032) (duration: 07m 19s)
- 22:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 22:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:26 ladsgroup@deploy1003: ladsgroup: Backport for QuickView: Fix relying on non-standard sizes (T424032) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:24 ladsgroup@deploy1003: Started scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032)
- 22:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1011.eqiad.wmnet with OS trixie
- 22:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:18 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 22:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2014
- 22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2014
- 22:12 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2013
- 22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2013
- 22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
- 22:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
- 22:03 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 21:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
- 21:48 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
- 21:36 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS trixie
- 21:10 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 05m 47s)
- 21:06 krinkle@deploy1003: krinkle: Continuing with deployment
- 21:05 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:04 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
- 21:03 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 03m 05s)
- 21:03 krinkle@deploy1003: krinkle: Rolling back deployment
- 21:02 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:00 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
- 20:51 cscott@deploy1003: Finished scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) (duration: 06m 02s)
- 20:47 cscott@deploy1003: cscott: Continuing with deployment
- 20:47 cscott@deploy1003: cscott: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:45 cscott@deploy1003: Started scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785)
- 19:28 otto@deploy1003: Finished scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) (duration: 22m 05s)
- 19:24 otto@deploy1003: xcollazo, otto: Continuing with deployment
- 19:14 otto@deploy1003: xcollazo, otto: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:09 jasmine@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl[2004-2005].codfw.wmnet
- 19:09 jasmine@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl[2004-2005].codfw.wmnet
- 19:06 jasmine_: "ran homer on lsw1-c7-codfw and lsw1-b2-codfw following new control planes (T390861)"
- 19:06 otto@deploy1003: Started scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920)
- 18:19 jasmine@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
- 18:13 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
- 17:09 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:08 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:08 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:07 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:07 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 16:46 jasmine@dns1004: END - running authdns-update
- 16:44 jasmine@dns1004: START - running authdns-update
- 16:39 jasmine@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: Downtiming to avoid page in case of race condition
- 16:29 herron@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
- 16:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) (duration: 05m 53s)
- 16:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
- 16:22 ladsgroup@deploy1003: ladsgroup: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:20 ladsgroup@deploy1003: Started scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895)
- 16:16 Amir1: re-enabling general ban on any non-standard thumb (T414805)
- 16:13 klausman@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 16:13 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 16:12 klausman@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 16:12 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 16:12 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 16:11 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 16:10 herron@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
- 15:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5004.eqsin.wmnet
- 15:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5004.eqsin.wmnet with OS bookworm
- 15:48 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1276017'" T420604. finish rollout of removing CSP in VCL from beta
- 15:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
- 15:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
- 15:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91378 and previous config saved to /var/cache/conftool/dbconfig/20260423-152514-ladsgroup.json
- 15:25 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 15:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91377 and previous config saved to /var/cache/conftool/dbconfig/20260423-152450-ladsgroup.json
- 15:16 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
- 15:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P91375 and previous config saved to /var/cache/conftool/dbconfig/20260423-151441-ladsgroup.json
- 15:07 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
- 15:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
- 15:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P91374 and previous config saved to /var/cache/conftool/dbconfig/20260423-150433-ladsgroup.json
- 15:03 moritzm: installing rsync security updates
- 14:57 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
- 14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91373 and previous config saved to /var/cache/conftool/dbconfig/20260423-145425-ladsgroup.json
- 14:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5004.eqsin.wmnet with OS bookworm
- 14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
- 14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
- 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5004.eqsin.wmnet on all recursors
- 14:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5004.eqsin.wmnet on all recursors
- 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
- 14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
- 14:42 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 14:42 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5004.eqsin.wmnet
- 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5003.eqsin.wmnet
- 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5003.eqsin.wmnet with OS bookworm
- 14:34 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 14:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5002.eqsin.wmnet
- 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 14:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
- 14:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
- 14:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2145.codfw.wmnet
- 14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 14:00 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5002.eqsin.wmnet
- 13:59 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 13:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5001.eqsin.wmnet
- 13:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 13:52 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2145.codfw.wmnet
- 13:39 Lucas_WMDE: UTC afternoon backport+config window done
- 13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749) (duration: 06m 11s)
- 13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Continuing with deployment
- 13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Backport for Enable the CampaignEvents extension on incubator (T421749) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:30 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749)
- 13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5003.eqsin.wmnet with OS bookworm
- 13:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5001.eqsin.wmnet
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
- 13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5003.eqsin.wmnet on all recursors
- 13:25 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5003.eqsin.wmnet on all recursors
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
- 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91370 and previous config saved to /var/cache/conftool/dbconfig/20260423-132311-fceratto.json
- 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 13:22 aude@deploy1003: Finished scap sync-world: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881) (duration: 06m 42s)
- 13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 13:18 aude@deploy1003: cscott, aude: Continuing with deployment
- 13:16 aude@deploy1003: cscott, aude: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:15 aude@deploy1003: Started scap sync-world: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881)
- 13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to https://phabricator.wikimedia.org/P91369 and previous config saved to /var/cache/conftool/dbconfig/20260423-131303-fceratto.json
- 13:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
- 13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2009.codfw.wmnet with OS bullseye
- 13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to https://phabricator.wikimedia.org/P91368 and previous config saved to /var/cache/conftool/dbconfig/20260423-130255-fceratto.json
- 13:01 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 13:01 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1015.eqiad.wmnet with reason: Decommissioning — T412830
- 13:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5003.eqsin.wmnet
- 13:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow5003.eqsin.wmnet
- 12:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow5003.eqsin.wmnet with OS bookworm
- 12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91367 and previous config saved to /var/cache/conftool/dbconfig/20260423-125247-fceratto.json
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91366 and previous config saved to /var/cache/conftool/dbconfig/20260423-124535-fceratto.json
- 12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
- 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91365 and previous config saved to /var/cache/conftool/dbconfig/20260423-124504-fceratto.json
- 12:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
- 12:38 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) (duration: 06m 25s)
- 12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
- 12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P91363 and previous config saved to /var/cache/conftool/dbconfig/20260423-123456-fceratto.json
- 12:34 kharlan@deploy1003: kharlan: Continuing with deployment
- 12:33 kharlan@deploy1003: kharlan: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:32 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204)
- 12:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
- 12:30 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) (duration: 06m 57s)
- 12:26 kharlan@deploy1003: kharlan: Continuing with deployment
- 12:24 kharlan@deploy1003: kharlan: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P91362 and previous config saved to /var/cache/conftool/dbconfig/20260423-122448-fceratto.json
- 12:23 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216)
- 12:19 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) (duration: 08m 11s)
- 12:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2009.codfw.wmnet with OS bullseye
- 12:15 kharlan@deploy1003: harroyo-wmf, kharlan: Continuing with deployment
- 12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91361 and previous config saved to /var/cache/conftool/dbconfig/20260423-121439-fceratto.json
- 12:12 kharlan@deploy1003: harroyo-wmf, kharlan: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:11 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812)
- 12:08 kart_: staging: Update cxserver to 2026-04-23-114216-production (T423002)
- 12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91360 and previous config saved to /var/cache/conftool/dbconfig/20260423-120400-fceratto.json
- 12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
- 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91359 and previous config saved to /var/cache/conftool/dbconfig/20260423-120332-fceratto.json
- 12:00 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:00 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
- 11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P91358 and previous config saved to /var/cache/conftool/dbconfig/20260423-115324-fceratto.json
- 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow5003.eqsin.wmnet with OS bookworm
- 11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 11:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P91357 and previous config saved to /var/cache/conftool/dbconfig/20260423-114316-fceratto.json
- 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
- 11:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
- 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow5003.eqsin.wmnet on all recursors
- 11:42 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow5003.eqsin.wmnet on all recursors
- 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
- 11:40 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
- 11:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow5003.eqsin.wmnet
- 11:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91356 and previous config saved to /var/cache/conftool/dbconfig/20260423-113307-fceratto.json
- 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91355 and previous config saved to /var/cache/conftool/dbconfig/20260423-112133-fceratto.json
- 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
- 11:21 moritzm: installing ngtcp2 security updates
- 11:20 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:13 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:13 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 11m 55s)
- 11:13 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5004.wikimedia.org
- 11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5004.wikimedia.org with OS bookworm
- 11:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91354 and previous config saved to /var/cache/conftool/dbconfig/20260423-110359-fceratto.json
- 11:01 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
- 11:00 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 33m 20s)
- 10:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2008.codfw.wmnet with OS bullseye
- 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
- 10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P91353 and previous config saved to /var/cache/conftool/dbconfig/20260423-105351-fceratto.json
- 10:50 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
- 10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P91352 and previous config saved to /var/cache/conftool/dbconfig/20260423-104343-fceratto.json
- 10:42 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
- 10:37 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:33 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91351 and previous config saved to /var/cache/conftool/dbconfig/20260423-103334-fceratto.json
- 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
- 10:27 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
- 10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
- 10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
- 10:24 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:23 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 10:21 isaranto@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 10:20 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 with pc2022 as codfw master T418973', diff saved to https://phabricator.wikimedia.org/P91348 and previous config saved to /var/cache/conftool/dbconfig/20260423-101957-marostegui.json
- 10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91347 and previous config saved to /var/cache/conftool/dbconfig/20260423-101855-fceratto.json
- 10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
- 10:17 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 10:16 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 10:16 daniel@deploy1003: Finished scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) (duration: 07m 37s)
- 10:16 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc2022 master of pc2 T418973', diff saved to https://phabricator.wikimedia.org/P91346 and previous config saved to /var/cache/conftool/dbconfig/20260423-101611-marostegui.json
- 10:15 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2022, remove pc2012 T418973 T424201', diff saved to https://phabricator.wikimedia.org/P91345 and previous config saved to /var/cache/conftool/dbconfig/20260423-101544-marostegui.json
- 10:15 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 10:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
- 10:14 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
- 10:14 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
- 10:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
- 10:13 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
- 10:12 daniel@deploy1003: daniel: Continuing with deployment
- 10:10 daniel@deploy1003: daniel: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 10:10 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2008.codfw.wmnet with OS bullseye
- 10:08 daniel@deploy1003: Started scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796)
- 10:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5004.wikimedia.org with OS bookworm
- 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91343 and previous config saved to /var/cache/conftool/dbconfig/20260423-100035-fceratto.json
- 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
- 09:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
- 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5004.wikimedia.org on all recursors
- 09:58 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5004.wikimedia.org on all recursors
- 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
- 09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P91341 and previous config saved to /var/cache/conftool/dbconfig/20260423-095027-fceratto.json
- 09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P91340 and previous config saved to /var/cache/conftool/dbconfig/20260423-094019-fceratto.json
- 09:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
- 09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91339 and previous config saved to /var/cache/conftool/dbconfig/20260423-093010-fceratto.json
- 09:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 09:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5004.wikimedia.org
- 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91338 and previous config saved to /var/cache/conftool/dbconfig/20260423-092303-fceratto.json
- 09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
- 09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91337 and previous config saved to /var/cache/conftool/dbconfig/20260423-092232-fceratto.json
- 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5003.wikimedia.org
- 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5003.wikimedia.org with OS bookworm
- 09:17 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P91336 and previous config saved to /var/cache/conftool/dbconfig/20260423-091224-fceratto.json
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P91335 and previous config saved to /var/cache/conftool/dbconfig/20260423-090216-fceratto.json
- 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
- 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2146 from dbctl T424179', diff saved to https://phabricator.wikimedia.org/P91334 and previous config saved to /var/cache/conftool/dbconfig/20260423-090014-marostegui.json
- 08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5002.eqsin.wmnet
- 08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5001.eqsin.wmnet
- 08:56 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5004.eqsin.wmnet
- 08:56 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5004.eqsin.wmnet
- 08:55 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
- 08:53 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5003.eqsin.wmnet
- 08:52 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5003.eqsin.wmnet
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91333 and previous config saved to /var/cache/conftool/dbconfig/20260423-085207-fceratto.json
- 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91330 and previous config saved to /var/cache/conftool/dbconfig/20260423-084035-fceratto.json
- 08:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
- 08:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 08:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2007.codfw.wmnet with OS bullseye
- 08:06 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5003.wikimedia.org with OS bookworm
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
- 08:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5003.wikimedia.org on all recursors
- 08:05 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5003.wikimedia.org on all recursors
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
- 08:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
- 08:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:01 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5003.wikimedia.org
- 07:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
- 07:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
- 07:22 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2007.codfw.wmnet with OS bullseye
- 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2145 from dbctl T424177', diff saved to https://phabricator.wikimedia.org/P91329 and previous config saved to /var/cache/conftool/dbconfig/20260423-071500-marostegui.json
- 06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1 with db2252 as new codfw master T418979', diff saved to https://phabricator.wikimedia.org/P91328 and previous config saved to /var/cache/conftool/dbconfig/20260423-065803-marostegui.json
- 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2252: Cloning
- 06:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2252: Cloning
- 06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Make db2252 master of ms3 T418979', diff saved to https://phabricator.wikimedia.org/P91327 and previous config saved to /var/cache/conftool/dbconfig/20260423-065323-marostegui.json
- 06:52 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2143 from ms3, add db2252 T418979', diff saved to https://phabricator.wikimedia.org/P91326 and previous config saved to /var/cache/conftool/dbconfig/20260423-065214-marostegui.json
- 06:28 jelto: gerrit2003 maintenance finished - T333143
- 06:05 jelto: start gerrit2003 maintenance - T333143
- 05:57 jelto@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:35:00 on gerrit.discovery.wmnet with reason: Gerrit maintenance
- 05:57 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:35:00 on gerrit2003.wikimedia.org with reason: Gerrit maintenance
- 05:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2143,2252].codfw.wmnet,db1153.eqiad.wmnet with reason: Cloning
- 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Cloning db2252 from db2143
- 05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:41 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Cloning db2252 from db2143
- 05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet,pc1012.eqiad.wmnet with reason: Cloning
- 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: Cloning pc2022 from pc2012
- 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: Cloning pc2022 from pc2012
- 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet with reason: Cloning
- 03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91321 and previous config saved to /var/cache/conftool/dbconfig/20260423-031538-ladsgroup.json
- 03:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91320 and previous config saved to /var/cache/conftool/dbconfig/20260423-031512-ladsgroup.json
- 03:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P91319 and previous config saved to /var/cache/conftool/dbconfig/20260423-030504-ladsgroup.json
- 02:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P91318 and previous config saved to /var/cache/conftool/dbconfig/20260423-025455-ladsgroup.json
- 02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91317 and previous config saved to /var/cache/conftool/dbconfig/20260423-024447-ladsgroup.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-22
- 15:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
- 15:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91315 and previous config saved to /var/cache/conftool/dbconfig/20260422-150817-ladsgroup.json
- 15:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91314 and previous config saved to /var/cache/conftool/dbconfig/20260422-150752-ladsgroup.json
- 14:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P91313 and previous config saved to /var/cache/conftool/dbconfig/20260422-145744-ladsgroup.json
- 14:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P91312 and previous config saved to /var/cache/conftool/dbconfig/20260422-144736-ladsgroup.json
- 14:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91311 and previous config saved to /var/cache/conftool/dbconfig/20260422-143728-ladsgroup.json
- 11:59 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
- 11:58 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
- 11:41 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/zotero: apply
- 11:41 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/zotero: apply
- 11:36 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/zotero: apply
- 11:36 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/zotero: apply
- 11:26 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:26 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:25 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:25 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:25 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:24 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:22 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:22 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:12 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:12 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:08 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:07 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:06 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:06 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 10:27 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
- 07:23 samwilson@deploy1003: Finished scap sync-world: Backport for Use canvas rather than webgl for OpenSeadragon (T423548) (duration: 08m 31s)
- 07:17 samwilson@deploy1003: samwilson: Continuing with deployment
- 07:16 samwilson@deploy1003: samwilson: Backport for Use canvas rather than webgl for OpenSeadragon (T423548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:14 samwilson@deploy1003: Started scap sync-world: Backport for Use canvas rather than webgl for OpenSeadragon (T423548)
- 04:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 04:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 03:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91310 and previous config saved to /var/cache/conftool/dbconfig/20260422-030300-ladsgroup.json
- 03:02 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 03:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91309 and previous config saved to /var/cache/conftool/dbconfig/20260422-030235-ladsgroup.json
- 02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P91308 and previous config saved to /var/cache/conftool/dbconfig/20260422-025227-ladsgroup.json
- 02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P91307 and previous config saved to /var/cache/conftool/dbconfig/20260422-024219-ladsgroup.json
- 02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91306 and previous config saved to /var/cache/conftool/dbconfig/20260422-023211-ladsgroup.json
- 02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab2003.codfw.wmnet with OS trixie
- 02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 02:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 06s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
- 01:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
- 01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host phab2003.codfw.wmnet with OS trixie
- 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-21
- 23:15 denisse@deploy1003: Finished deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 - T423229 (duration: 00m 18s)
- 23:15 denisse@deploy1003: Started deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 - T423229
- 22:37 musikanimal@deploy1003: Finished scap sync-world: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|H
- 22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1015.eqiad.wmnet with OS trixie
- 22:25 musikanimal@deploy1003: musikanimal: Continuing with deployment
- 22:19 musikanimal@deploy1003: musikanimal: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|Hooks: remove
- 22:02 musikanimal@deploy1003: Started scap sync-world: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|Ho
- 21:58 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:57 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 21:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1016.eqiad.wmnet with OS trixie
- 21:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
- 21:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:17 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:12 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
- 21:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:57 musikanimal@deploy1003: Finished scap sync-world: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288) (duration: 06m 27s)
- 20:53 musikanimal@deploy1003: musikanimal: Continuing with deployment
- 20:53 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 20:52 musikanimal@deploy1003: musikanimal: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:51 musikanimal@deploy1003: Started scap sync-world: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288)
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
- 20:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
- 20:44 dancy@deploy1003: Installation of scap version "4.250.1" completed for 2 hosts
- 20:42 dancy@deploy1003: Installing scap version "4.250.1" for 2 host(s)
- 20:35 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 20:28 Dreamy_Jazz: Evening UTC backport window done
- 20:16 Dreamy_Jazz: Running `mwscript-k8s maintenance/namespaceDupes.php --wiki=diqwiki --fix`
- 20:15 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084) (duration: 07m 38s)
- 20:11 dreamyjazz@deploy1003: pppery, dreamyjazz: Continuing with sync
- 20:09 dreamyjazz@deploy1003: pppery, dreamyjazz: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084)
- 20:07 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
- 20:05 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
- 19:49 jasmine@dns1004: END - running authdns-update
- 19:47 jasmine@dns1004: START - running authdns-update
- 19:37 mutante: contint1003 - re-enabling puppet T418521
- 19:32 Dreamy_Jazz: Created cusi_user, cusi_case, and cusi_signal on commonswiki on the extension1 database cluster - T424084
- 18:02 dancy@deploy1003: Finished scap sync-world: Testing (duration: 02m 58s)
- 17:59 dancy@deploy1003: Started scap sync-world: Testing
- 17:58 dancy@deploy1003: Installation of scap version "4.250.0" completed for 2 hosts
- 17:56 dancy@deploy1003: Installing scap version "4.250.0" for 2 host(s)
- 17:42 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1275464 T423623 (duration: 02m 30s)
- 17:41 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1275464 T423623
- 17:01 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host thanos-be2005.codfw.wmnet with OS bullseye
- 17:00 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 16:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
- 16:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
- 16:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2005.codfw.wmnet with OS bullseye
- 16:23 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab1004 for T424059 (duration: 00m 38s)
- 16:22 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab1004 for T424059
- 16:22 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab2002 for T424059 (duration: 00m 47s)
- 16:21 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab2002 for T424059
- 15:58 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus5003.eqsin.wmnet
- 15:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus5003.eqsin.wmnet with OS bookworm
- 15:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
- 15:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
- 15:39 moritzm: installing busybox updates from Trixie point release
- 15:05 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for T424033 (duration: 00m 43s)
- 15:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
- 15:04 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for T424033
- 15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for T424033 (duration: 00m 44s)
- 15:03 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for T424033
- 15:01 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
- 15:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus5003.eqsin.wmnet with OS bookworm
- 15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91305 and previous config saved to /var/cache/conftool/dbconfig/20260421-150025-ladsgroup.json
- 15:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91304 and previous config saved to /var/cache/conftool/dbconfig/20260421-145959-ladsgroup.json
- 14:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
- 14:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
- 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5003.eqsin.wmnet on all recursors
- 14:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus5003.eqsin.wmnet on all recursors
- 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
- 14:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
- 14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1009.eqiad.wmnet with OS bullseye
- 14:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
- 14:51 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
- 14:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus5003.eqsin.wmnet
- 14:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P91303 and previous config saved to /var/cache/conftool/dbconfig/20260421-144951-ladsgroup.json
- 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5004.eqsin.wmnet
- 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5004.eqsin.wmnet with OS trixie
- 14:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P91302 and previous config saved to /var/cache/conftool/dbconfig/20260421-143943-ladsgroup.json
- 14:39 cscott@deploy1003: Finished scap sync-world: Backport for Increase Parsoid Read Views percentage for ruwiki to 55% (duration: 09m 37s)
- 14:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
- 14:35 cscott@deploy1003: cscott: Continuing with sync
- 14:34 papaul: moving OOB link on mr1-eqiad to ge-0/0/7
- 14:32 moritzm: installing gdk-pixbuf security updates
- 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1 T418979', diff saved to https://phabricator.wikimedia.org/P91301 and previous config saved to /var/cache/conftool/dbconfig/20260421-143145-marostegui.json
- 14:31 cscott@deploy1003: cscott: Backport for Increase Parsoid Read Views percentage for ruwiki to 55% synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:30 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
- 14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251 to ms1 T418979', diff saved to https://phabricator.wikimedia.org/P91300 and previous config saved to /var/cache/conftool/dbconfig/20260421-143017-marostegui.json
- 14:29 cscott@deploy1003: Started scap sync-world: Backport for Increase Parsoid Read Views percentage for ruwiki to 55%
- 14:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91299 and previous config saved to /var/cache/conftool/dbconfig/20260421-142935-ladsgroup.json
- 14:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251, remove db2142 T418979', diff saved to https://phabricator.wikimedia.org/P91298 and previous config saved to /var/cache/conftool/dbconfig/20260421-142913-marostegui.json
- 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
- 14:22 cscott@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747) (duration: 13m 02s)
- 14:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
- 14:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on db2142.codfw.wmnet,pc2011.codfw.wmnet with reason: Will be decommissioned
- 14:16 cscott@deploy1003: cscott: Continuing with sync
- 14:11 cscott@deploy1003: cscott: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747) synced to the testservers (see https://wikit
- 14:10 cscott@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747)
- 14:08 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1009.eqiad.wmnet with OS bullseye
- 13:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Cloning
- 13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2142: Cloning
- 13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 13:55 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 13:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2142: Cloning
- 13:53 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
- 13:53 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host atlas5001.wikimedia.org
- 13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
- 13:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
- 13:52 stran@deploy1003: Finished scap sync-world: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:127583
- 13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) atlas5001.wikimedia.org on all recursors
- 13:51 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache atlas5001.wikimedia.org on all recursors
- 13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
- 13:50 jayme@cumin1003: START - Cookbook sre.dns.netbox
- 13:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
- 13:45 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 13:45 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host atlas5001.wikimedia.org
- 13:44 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
- 13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 13:40 stran@deploy1003: stran: Continuing with sync
- 13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
- 13:38 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
- 13:37 stran@deploy1003: stran: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:1275836|Add next steps pa
- 13:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 13:30 ayounsi@dns1004: END - running authdns-update
- 13:29 ayounsi@dns1004: START - running authdns-update
- 13:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5004.eqsin.wmnet with OS trixie
- 13:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
- 13:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
- 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5004.eqsin.wmnet on all recursors
- 13:26 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5004.eqsin.wmnet on all recursors
- 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
- 13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
- 13:20 stran@deploy1003: Started scap sync-world: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:1275836
- 13:16 aude@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881) (duration: 06m 50s)
- 13:13 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 13:13 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5004.eqsin.wmnet
- 13:12 aude@deploy1003: aude: Continuing with sync
- 13:11 aude@deploy1003: aude: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:09 aude@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881)
- 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5003.eqsin.wmnet
- 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5003.eqsin.wmnet with OS trixie
- 13:08 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:06 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
- 13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1008.eqiad.wmnet with OS bullseye
- 13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
- 13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
- 12:56 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 33m 37s)
- 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
- 12:53 moritzm: update firmware on puppetserver1002: NIC from 22.31.6 to 23.21.6 T423282
- 12:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
- 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
- 12:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 12:47 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
- 12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
- 12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
- 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
- 12:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 12:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
- 12:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
- 12:29 moritzm: update firmware on puppetserver1002: BIOS from 1.9.2 to 1.20.2 T423282
- 12:28 moritzm: update firmware on puppetserver1002: idrac from 6.10.30.20 to 7.20.80.50 T423282
- 12:23 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
- 12:22 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:15 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
- 12:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1008.eqiad.wmnet with OS bullseye
- 12:06 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Pool back pc1 but with pc2021 replacing pc2011', diff saved to https://phabricator.wikimedia.org/P91287 and previous config saved to /var/cache/conftool/dbconfig/20260421-120206-marostegui.json
- 11:58 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 68m 02s)
- 11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: Pool pc2021 into pc
- 11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: Pool pc2021 into pc
- 11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
- 11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
- 11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
- 11:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 11:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
- 11:53 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:52 marostegui@cumin1003: dbctl commit (dc=all): 'add pc2021 to pc1', diff saved to https://phabricator.wikimedia.org/P91286 and previous config saved to /var/cache/conftool/dbconfig/20260421-115209-marostegui.json
- 11:50 moritzm: installing Tornado security updates
- 11:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
- 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2011 and add pc2021 as replacement', diff saved to https://phabricator.wikimedia.org/P91285 and previous config saved to /var/cache/conftool/dbconfig/20260421-114718-marostegui.json
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 11:45 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
- 11:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 11:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 11:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 11:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
- 11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91283 and previous config saved to /var/cache/conftool/dbconfig/20260421-113927-fceratto.json
- 11:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
- 11:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
- 11:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 11:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91282 and previous config saved to /var/cache/conftool/dbconfig/20260421-113010-fceratto.json
- 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P91281 and previous config saved to /var/cache/conftool/dbconfig/20260421-112919-fceratto.json
- 11:27 klausman@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
- 11:26 klausman@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
- 11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2143: repool after maintenance
- 11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 11:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: repool after maintenance
- 11:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2143: after reimage to trixie
- 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: after reimage to trixie
- 11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS trixie
- 11:21 klausman@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
- 11:21 klausman@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
- 11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91279 and previous config saved to /var/cache/conftool/dbconfig/20260421-112001-fceratto.json
- 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P91278 and previous config saved to /var/cache/conftool/dbconfig/20260421-111911-fceratto.json
- 11:11 claime: Enabling puppet on A:cp to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/1271804 - T422804
- 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91277 and previous config saved to /var/cache/conftool/dbconfig/20260421-110954-fceratto.json
- 11:09 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91276 and previous config saved to /var/cache/conftool/dbconfig/20260421-110903-fceratto.json
- 11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
- 11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
- 10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
- 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91275 and previous config saved to /var/cache/conftool/dbconfig/20260421-105945-fceratto.json
- 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
- 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host tcp-proxy5003.eqsin.wmnet with OS trixie
- 10:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
- 10:51 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 10:50 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:50 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
- 10:49 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 10:49 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 10:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1007.eqiad.wmnet with OS bullseye
- 10:47 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 10:44 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 10:43 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91274 and previous config saved to /var/cache/conftool/dbconfig/20260421-103945-fceratto.json
- 10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91273 and previous config saved to /var/cache/conftool/dbconfig/20260421-103915-fceratto.json
- 10:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS trixie
- 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Reimage to Trixie
- 10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:37 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Reimage to Trixie
- 10:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet with reason: Reimage to Trixie
- 10:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
- 10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
- 10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P91271 and previous config saved to /var/cache/conftool/dbconfig/20260421-102907-fceratto.json
- 10:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P91270 and previous config saved to /var/cache/conftool/dbconfig/20260421-101857-fceratto.json
- 10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
- 10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
- 10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
- 10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
- 10:10 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91269 and previous config saved to /var/cache/conftool/dbconfig/20260421-100849-fceratto.json
- 10:07 claime: Disabling puppet on A:cp to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1271804 - T422804
- 10:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
- 10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91268 and previous config saved to /var/cache/conftool/dbconfig/20260421-100051-fceratto.json
- 10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
- 10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 10:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1007.eqiad.wmnet with OS bullseye
- 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91267 and previous config saved to /var/cache/conftool/dbconfig/20260421-095928-fceratto.json
- 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 09:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 09:54 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1198: Security update
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
- 09:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
- 09:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:50 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
- 09:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:45 moritzm: updating debdeploy on trixie to 0.0.99.15
- 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: repool after maintenance
- 09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
- 09:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: repool after maintenance
- 09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
- 09:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: after reimage to trixie
- 09:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: after reimage to trixie
- 09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1153.eqiad.wmnet with OS trixie
- 09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91262 and previous config saved to /var/cache/conftool/dbconfig/20260421-093401-fceratto.json
- 09:26 moritzm: imported debdeploy 0.0.99.15 for trixie-wikimedia (compat release for Cumin 6)
- 09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91260 and previous config saved to /var/cache/conftool/dbconfig/20260421-092352-fceratto.json
- 09:21 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1198: Security update
- 09:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419961)', diff saved to https://phabricator.wikimedia.org/P91259 and previous config saved to /var/cache/conftool/dbconfig/20260421-091949-fceratto.json
- 09:17 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91258 and previous config saved to /var/cache/conftool/dbconfig/20260421-091344-fceratto.json
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T419961)', diff saved to https://phabricator.wikimedia.org/P91257 and previous config saved to /var/cache/conftool/dbconfig/20260421-091124-fceratto.json
- 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 09:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
- 09:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 09:05 jayme: kubectl delete node $(nodeset -e wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096-1112,1166-1168].eqiad.wmnet) - T423863
- 09:05 fabfur: restarting pybal on lvs1019-1020 to clear alerts
- 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T419961)', diff saved to https://phabricator.wikimedia.org/P91256 and previous config saved to /var/cache/conftool/dbconfig/20260421-090358-fceratto.json
- 09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91255 and previous config saved to /var/cache/conftool/dbconfig/20260421-090336-fceratto.json
- 09:01 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
- 09:00 jayme: homer 'asw2-a-eqiad.mgmt.eqiad.wmnet' commit - T423863
- 09:00 jayme: homer 'asw2-b-eqiad.mgmt.eqiad.wmnet' commit - T423863
- 08:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Security update
- 08:50 jayme: homer 'cr*eqiad*' commit - T423863
- 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
- 08:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1153.eqiad.wmnet with OS trixie
- 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Reimage to Trixie
- 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Reimage to Trixie
- 08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1153.eqiad.wmnet with reason: Reimage to Trixie
- 08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
- 08:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
- 08:40 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
- 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:39 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
- 08:39 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:39 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
- 08:39 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
- 08:38 jayme@cumin1003: START - Cookbook sre.dns.netbox
- 08:32 jayme@cumin1003: START - Cookbook sre.dns.netbox
- 08:32 moritzm: installing gst-plugins-base1.0 security updates
- 08:32 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
- 08:32 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:32 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
- 08:31 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
- 08:27 jayme@cumin1003: START - Cookbook sre.dns.netbox
- 08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Security update
- 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
- 08:18 musikanimal@deploy1003: Finished scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) (duration: 07m 01s)
- 08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91250 and previous config saved to /var/cache/conftool/dbconfig/20260421-081717-fceratto.json
- 08:14 elukey: bootstrapping pki intermediate discovery2026
- 08:14 musikanimal@deploy1003: musikanimal: Continuing with sync
- 08:12 musikanimal@deploy1003: musikanimal: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:10 musikanimal@deploy1003: Started scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756)
- 08:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91249 and previous config saved to /var/cache/conftool/dbconfig/20260421-080936-fceratto.json
- 08:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
- 08:06 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1006.eqiad.wmnet with OS bullseye
- 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91248 and previous config saved to /var/cache/conftool/dbconfig/20260421-080314-fceratto.json
- 08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 07:51 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s4
- 07:51 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s4
- 07:49 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s6
- 07:49 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s6
- 07:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1064.eqiad.wmnet
- 07:48 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1064.eqiad.wmnet
- 07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
- 07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
- 07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
- 07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
- 07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Reimage to Trixie
- 07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 07:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Reimage to Trixie
- 07:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
- 07:17 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1064.eqiad.wmnet with reason: vacuum overlarge container dbs
- 07:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1006.eqiad.wmnet with OS bullseye
- 07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2011.codfw.wmnet,pc1011.eqiad.wmnet with reason: Cloning pc2021 from pc2011
- 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Cloning pc2021
- 07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 07:05 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 07:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Cloning pc2021
- 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2144: After reimage
- 07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 07:04 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 07:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: After reimage
- 07:03 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2144: after reimage to trixie
- 07:03 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: after reimage to trixie
- 07:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS trixie
- 06:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
- 06:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
- 06:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS trixie
- 06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Reimage to Trixie
- 06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:12 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:12 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Reimage to Trixie
- 06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet with reason: Reimage to Trixie
- 06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
- 05:40 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1015.eqiad.wmnet with reason: Clone s6 to clouddb1025
- 05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1025.eqiad.wmnet with reason: Clone s6
- 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.22 (duration: 02m 30s)
- 02:53 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91242 and previous config saved to /var/cache/conftool/dbconfig/20260421-025311-ladsgroup.json
- 02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91241 and previous config saved to /var/cache/conftool/dbconfig/20260421-025245-ladsgroup.json
- 02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P91240 and previous config saved to /var/cache/conftool/dbconfig/20260421-024237-ladsgroup.json
- 02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P91239 and previous config saved to /var/cache/conftool/dbconfig/20260421-023228-ladsgroup.json
- 02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91238 and previous config saved to /var/cache/conftool/dbconfig/20260421-022219-ladsgroup.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 03s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2250.codfw.wmnet with OS bookworm
- 01:32 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2251.codfw.wmnet with OS bookworm
- 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2252.codfw.wmnet with OS bookworm
- 01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2253.codfw.wmnet with OS bookworm
- 01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 01:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
- 01:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
- 01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
- 01:02 zabe: marked 543 revisions as bad # T393237
- 00:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
- 00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
- 00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
- 00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
- 00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
- 00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2253.codfw.wmnet with OS bookworm
- 00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2252.codfw.wmnet with OS bookworm
- 00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2251.codfw.wmnet with OS bookworm
- 00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2250.codfw.wmnet with OS bookworm
- 00:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:29 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-20
- 23:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538) (duration: 07m 47s)
- 23:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
- 23:36 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Continuing with sync
- 23:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
- 23:34 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:32 jdlrobson@deploy1003: Started scap sync-world: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538)
- 23:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
- 23:28 jdlrobson@deploy1003: Finished scap sync-world: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959) (duration: 08m 13s)
- 23:24 jdlrobson@deploy1003: jdlrobson: Continuing with sync
- 23:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
- 23:21 jdlrobson@deploy1003: jdlrobson: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959)
- 23:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
- 23:16 jdlrobson@deploy1003: Finished scap sync-world: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676) (duration: 05m 56s)
- 23:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
- 23:12 jdlrobson@deploy1003: cscott, jdlrobson: Continuing with sync
- 23:12 jdlrobson@deploy1003: cscott, jdlrobson: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:10 jdlrobson@deploy1003: Started scap sync-world: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676)
- 23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
- 23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
- 23:06 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
- 22:53 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
- 22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
- 22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
- 21:59 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1275463 T423311 T423624 (duration: 03m 24s)
- 21:57 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1275463 T423311 T423624
- 21:42 maryum: Deployed security fix for T406954
- 21:33 maryum: Deployed security fix for T299359
- 20:16 aude@deploy1003: Finished scap sync-world: Backport for Do not show donate button on affiliate wikis (T423876) (duration: 10m 57s)
- 20:10 aude@deploy1003: aude: Continuing with sync
- 20:08 aude@deploy1003: aude: Backport for Do not show donate button on affiliate wikis (T423876) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:05 aude@deploy1003: Started scap sync-world: Backport for Do not show donate button on affiliate wikis (T423876)
- 19:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2190: Security update
- 19:28 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:00 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2190: Security update
- 18:58 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
- 18:56 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
- 18:55 jforrester@deploy1003: Finished scap sync-world: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for tre
- 18:44 jforrester@deploy1003: pmiazga, jforrester: Continuing with sync
- 18:42 jforrester@deploy1003: pmiazga, jforrester: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for trending
- 18:25 jforrester@deploy1003: Started scap sync-world: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for tren
- 18:11 Amir1: drop of langlinks table on testcommonswiki (T421914)
- 18:07 herron@dns1004: END - running authdns-update
- 18:05 herron@dns1004: START - running authdns-update
- 17:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1005.eqiad.wmnet with OS bullseye
- 17:47 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
- 17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
- 17:45 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:44 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
- 17:44 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
- 17:43 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
- 17:43 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
- 17:42 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
- 17:41 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
- 17:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:37 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:36 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
- 17:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
- 17:27 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:27 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
- 17:26 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:26 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
- 17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
- 17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
- 17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
- 17:23 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
- 17:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
- 17:22 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
- 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
- 17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 17:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
- 17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
- 17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
- 17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
- 17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
- 17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
- 17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
- 17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
- 17:16 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
- 17:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1005.eqiad.wmnet with OS bullseye
- 17:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 17:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
- 16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T419635)', diff saved to https://phabricator.wikimedia.org/P91231 and previous config saved to /var/cache/conftool/dbconfig/20260420-165459-fceratto.json
- 16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91230 and previous config saved to /var/cache/conftool/dbconfig/20260420-165423-fceratto.json
- 16:52 moritzm: installing imagemagick security updates
- 16:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
- 16:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:44 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91229 and previous config saved to /var/cache/conftool/dbconfig/20260420-164415-fceratto.json
- 16:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Moving to another rack
- 16:34 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91227 and previous config saved to /var/cache/conftool/dbconfig/20260420-163407-fceratto.json
- 16:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
- 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM backupmon1001.eqiad.wmnet
- 16:27 marostegui@dns1004: END - running authdns-update
- 16:26 marostegui: Switchover m3 proxy (phabricator)
- 16:26 marostegui@dns1004: START - running authdns-update
- 16:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
- 16:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91226 and previous config saved to /var/cache/conftool/dbconfig/20260420-162359-fceratto.json
- 16:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: Security update
- 16:19 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM backupmon1001.eqiad.wmnet
- 16:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
- 16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:06 bking@cumin2002: conftool action : set/pooled=no; selector: name=cloudelastic1012.eqiad.wmnet
- 15:57 moritzm: installing libvirt security updates
- 15:55 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1272869'"
- 15:51 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
- 15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2188.codfw.wmnet
- 15:50 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2188.codfw.wmnet
- 15:50 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es2036: Moving to another rack
- 15:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
- 15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2188.codfw.wmnet
- 15:50 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2188.codfw.wmnet
- 15:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
- 15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1012.eqiad.wmnet
- 15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es2036
- 15:36 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host es2036
- 15:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1010.eqiad.wmnet with OS bookworm
- 15:36 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:35 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
- 15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:33 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 15:25 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
- 15:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1166: Security update
- 15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91217 and previous config saved to /var/cache/conftool/dbconfig/20260420-152341-fceratto.json
- 15:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 15:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
- 15:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Moved to another rack
- 15:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Moving to another rack
- 15:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Moving to another rack
- 15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-worker2188']
- 15:11 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: repool after maintenance
- 15:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: repool after maintenance
- 15:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
- 15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2006.codfw.wmnet with OS bullseye
- 15:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
- 15:03 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 15:03 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 14:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:51 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-worker2188']
- 14:46 cwhite@deploy1003: Finished deploy [performance/arc-lamp@bd7b2ab]: T413127 (duration: 00m 08s)
- 14:45 cwhite@deploy1003: Started deploy [performance/arc-lamp@bd7b2ab]: T413127
- 14:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
- 14:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: after reimage to trixie
- 14:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: after reimage to trixie
- 14:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS trixie
- 14:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
- 14:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbstore1010.eqiad.wmnet with OS bookworm
- 14:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
- 14:36 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
- 14:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:26 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
- 14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T419961)', diff saved to https://phabricator.wikimedia.org/P91215 and previous config saved to /var/cache/conftool/dbconfig/20260420-142120-fceratto.json
- 14:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91214 and previous config saved to /var/cache/conftool/dbconfig/20260420-142050-fceratto.json
- 14:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
- 14:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
- 14:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 14:14 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
- 14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2006.codfw.wmnet with OS bullseye
- 14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P91213 and previous config saved to /var/cache/conftool/dbconfig/20260420-141042-fceratto.json
- 14:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91212 and previous config saved to /var/cache/conftool/dbconfig/20260420-140203-fceratto.json
- 14:02 urandom: upgrade envoyproxy, restbase — T419637 & T410975
- 14:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS trixie
- 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P91211 and previous config saved to /var/cache/conftool/dbconfig/20260420-140033-fceratto.json
- 14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Reimage to Trixie
- 14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:00 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:00 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Reimage to Trixie
- 14:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1151.eqiad.wmnet with reason: Reimage to Trixie
- 14:00 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
- 13:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91209 and previous config saved to /var/cache/conftool/dbconfig/20260420-135255-ladsgroup.json
- 13:52 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 13:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P91208 and previous config saved to /var/cache/conftool/dbconfig/20260420-135155-fceratto.json
- 13:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91207 and previous config saved to /var/cache/conftool/dbconfig/20260420-135025-fceratto.json
- 13:47 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 13:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
- 13:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
- 13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91206 and previous config saved to /var/cache/conftool/dbconfig/20260420-134158-fceratto.json
- 13:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P91205 and previous config saved to /var/cache/conftool/dbconfig/20260420-134148-fceratto.json
- 13:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 13:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 13:34 Lucas_WMDE: UTC afternoon backport+config window done
- 13:32 urandom: decommissioning Cassandra, aqs1014 [a,b] — T412830
- 13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91204 and previous config saved to /var/cache/conftool/dbconfig/20260420-133139-fceratto.json
- 13:30 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Decommissioning — T412830
- 13:29 phuedx@deploy1003: Finished scap sync-world: Backport for PHP SDK: Split measurement of unknown experiments (T422112) (duration: 07m 51s)
- 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91203 and previous config saved to /var/cache/conftool/dbconfig/20260420-132926-fceratto.json
- 13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1253.eqiad.wmnet with reason: Maintenance
- 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91202 and previous config saved to /var/cache/conftool/dbconfig/20260420-132901-fceratto.json
- 13:26 phuedx@deploy1003: phuedx: Continuing with sync
- 13:23 phuedx@deploy1003: phuedx: Backport for PHP SDK: Split measurement of unknown experiments (T422112) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:22 phuedx@deploy1003: Started scap sync-world: Backport for PHP SDK: Split measurement of unknown experiments (T422112)
- 13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881) (duration: 08m 21s)
- 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P91200 and previous config saved to /var/cache/conftool/dbconfig/20260420-131853-fceratto.json
- 13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Continuing with sync
- 13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:12 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881)
- 13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P91199 and previous config saved to /var/cache/conftool/dbconfig/20260420-130845-fceratto.json
- 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
- 12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91198 and previous config saved to /var/cache/conftool/dbconfig/20260420-125837-fceratto.json
- 12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91197 and previous config saved to /var/cache/conftool/dbconfig/20260420-125624-fceratto.json
- 12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91196 and previous config saved to /var/cache/conftool/dbconfig/20260420-125559-fceratto.json
- 12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P91195 and previous config saved to /var/cache/conftool/dbconfig/20260420-124550-fceratto.json
- 12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[1001-1002].eqiad.wmnet
- 12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
- 12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P91194 and previous config saved to /var/cache/conftool/dbconfig/20260420-123542-fceratto.json
- 12:31 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
- 12:28 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
- 12:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91193 and previous config saved to /var/cache/conftool/dbconfig/20260420-122534-fceratto.json
- 12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91192 and previous config saved to /var/cache/conftool/dbconfig/20260420-122321-fceratto.json
- 12:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91191 and previous config saved to /var/cache/conftool/dbconfig/20260420-122256-fceratto.json
- 12:17 zabe: Deployed patch for T423821
- 12:16 moritzm: remove ganeti5006 from eqsin01 Ganeti cluster (running classic Ganeti) T421863
- 12:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
- 12:15 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
- 12:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P91190 and previous config saved to /var/cache/conftool/dbconfig/20260420-121247-fceratto.json
- 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[1001-1002].eqiad.wmnet
- 12:10 moritzm: installing edk2 security updates
- 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P91189 and previous config saved to /var/cache/conftool/dbconfig/20260420-120239-fceratto.json
- 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91188 and previous config saved to /var/cache/conftool/dbconfig/20260420-115231-fceratto.json
- 11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
- 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet,service=x4
- 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91187 and previous config saved to /var/cache/conftool/dbconfig/20260420-105213-fceratto.json
- 10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91186 and previous config saved to /var/cache/conftool/dbconfig/20260420-105148-fceratto.json
- 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P91185 and previous config saved to /var/cache/conftool/dbconfig/20260420-104141-fceratto.json
- 10:32 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 10:32 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P91184 and previous config saved to /var/cache/conftool/dbconfig/20260420-103133-fceratto.json
- 10:26 kamila@deploy1003: Finished scap sync-world: ICU 72 upgrade (duration: 51m 35s)
- 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast1003.wikimedia.org
- 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91183 and previous config saved to /var/cache/conftool/dbconfig/20260420-102125-fceratto.json
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91182 and previous config saved to /var/cache/conftool/dbconfig/20260420-101913-fceratto.json
- 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91181 and previous config saved to /var/cache/conftool/dbconfig/20260420-101847-fceratto.json
- 10:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 10:14 kamila@deploy1003: kamila: Continuing with sync
- 10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P91180 and previous config saved to /var/cache/conftool/dbconfig/20260420-100839-fceratto.json
- 10:07 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91179 and previous config saved to /var/cache/conftool/dbconfig/20260420-100423-fceratto.json
- 10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91178 and previous config saved to /var/cache/conftool/dbconfig/20260420-100402-fceratto.json
- 10:02 Emperor: ceph orch host drain moss-be1002 T418901
- 10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: after reimage to trixie
- 09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P91176 and previous config saved to /var/cache/conftool/dbconfig/20260420-095831-fceratto.json
- 09:58 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast1003.wikimedia.org
- 09:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P91175 and previous config saved to /var/cache/conftool/dbconfig/20260420-095354-fceratto.json
- 09:52 kamila@deploy1003: kamila: ICU 72 upgrade synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91174 and previous config saved to /var/cache/conftool/dbconfig/20260420-094823-fceratto.json
- 09:48 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91172 and previous config saved to /var/cache/conftool/dbconfig/20260420-094612-fceratto.json
- 09:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91171 and previous config saved to /var/cache/conftool/dbconfig/20260420-094546-fceratto.json
- 09:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P91170 and previous config saved to /var/cache/conftool/dbconfig/20260420-094345-fceratto.json
- 09:43 Emperor: ceph orch host drain moss-be1001 T418901
- 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard1003.eqiad.wmnet
- 09:36 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard1003.eqiad.wmnet
- 09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P91169 and previous config saved to /var/cache/conftool/dbconfig/20260420-093538-fceratto.json
- 09:35 kamila@deploy1003: Started scap sync-world: ICU 72 upgrade
- 09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91168 and previous config saved to /var/cache/conftool/dbconfig/20260420-093337-fceratto.json
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard2003.codfw.wmnet
- 09:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard2003.codfw.wmnet
- 09:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
- 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
- 09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P91166 and previous config saved to /var/cache/conftool/dbconfig/20260420-092530-fceratto.json
- 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2008.wikimedia.org
- 09:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
- 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91165 and previous config saved to /var/cache/conftool/dbconfig/20260420-092448-fceratto.json
- 09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91164 and previous config saved to /var/cache/conftool/dbconfig/20260420-092417-fceratto.json
- 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
- 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 09:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:21 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2008.wikimedia.org
- 09:19 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:18 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
- 09:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1165: after reimage to trixie
- 09:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91162 and previous config saved to /var/cache/conftool/dbconfig/20260420-091522-fceratto.json
- 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P91161 and previous config saved to /var/cache/conftool/dbconfig/20260420-091409-fceratto.json
- 09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS trixie
- 09:13 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 09:13 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91160 and previous config saved to /var/cache/conftool/dbconfig/20260420-091310-fceratto.json
- 09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91159 and previous config saved to /var/cache/conftool/dbconfig/20260420-091233-fceratto.json
- 09:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2007.codfw.wmnet
- 09:11 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
- 09:10 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
- 09:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 09:07 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2007.codfw.wmnet
- 09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P91158 and previous config saved to /var/cache/conftool/dbconfig/20260420-090401-fceratto.json
- 09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P91157 and previous config saved to /var/cache/conftool/dbconfig/20260420-090225-fceratto.json
- 08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91156 and previous config saved to /var/cache/conftool/dbconfig/20260420-085349-fceratto.json
- 08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P91155 and previous config saved to /var/cache/conftool/dbconfig/20260420-085217-fceratto.json
- 08:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
- 08:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
- 08:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91154 and previous config saved to /var/cache/conftool/dbconfig/20260420-084512-fceratto.json
- 08:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91153 and previous config saved to /var/cache/conftool/dbconfig/20260420-084440-fceratto.json
- 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
- 08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91152 and previous config saved to /var/cache/conftool/dbconfig/20260420-084209-fceratto.json
- 08:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
- 08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts atlas5001.wikimedia.org
- 08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
- 08:41 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
- 08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91151 and previous config saved to /var/cache/conftool/dbconfig/20260420-083957-fceratto.json
- 08:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
- 08:39 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P91150 and previous config saved to /var/cache/conftool/dbconfig/20260420-083432-fceratto.json
- 08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
- 08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 08:30 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts atlas5001.wikimedia.org
- 08:30 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS trixie
- 08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1165: Reimage to Trixie
- 08:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1165: Reimage to Trixie
- 08:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1165.eqiad.wmnet with reason: Reimage to Trixie
- 08:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2229: after reimage to trixie
- 08:26 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
- 08:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91147 and previous config saved to /var/cache/conftool/dbconfig/20260420-082555-fceratto.json
- 08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P91146 and previous config saved to /var/cache/conftool/dbconfig/20260420-082424-fceratto.json
- 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: Reimage to Trixie
- 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast5004.wikimedia.org
- 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
- 08:19 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin1001.eqiad.wmnet
- 08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P91145 and previous config saved to /var/cache/conftool/dbconfig/20260420-081547-fceratto.json
- 08:15 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin1001.eqiad.wmnet
- 08:15 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on wikikube-worker2188.codfw.wmnet with reason: dcops intervention
- 08:14 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2188.codfw.wmnet
- 08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91144 and previous config saved to /var/cache/conftool/dbconfig/20260420-081416-fceratto.json
- 08:14 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2188.codfw.wmnet
- 08:13 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin2001.codfw.wmnet
- 08:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 08:07 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin2001.codfw.wmnet
- 08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P91142 and previous config saved to /var/cache/conftool/dbconfig/20260420-080539-fceratto.json
- 08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91141 and previous config saved to /var/cache/conftool/dbconfig/20260420-080529-fceratto.json
- 08:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 08:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast5004.wikimedia.org
- 08:01 marostegui: Removed categorylinks_icu72 from s3 with a sleep, this will take around 1.5 hours T422546
- 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12389
- 07:59 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12389
- 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91139 and previous config saved to /var/cache/conftool/dbconfig/20260420-075524-fceratto.json
- 07:51 marostegui: Removed categorylinks_icu72 from s5 T422546
- 07:41 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2229: after reimage to trixie
- 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91137 and previous config saved to /var/cache/conftool/dbconfig/20260420-074031-fceratto.json
- 07:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91136 and previous config saved to /var/cache/conftool/dbconfig/20260420-074005-fceratto.json
- 07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2229.codfw.wmnet with OS trixie
- 07:31 marostegui: Removed categorylinks_icu72 from s7 T422546
- 07:30 marostegui: Removed categorylinks_icu72 from s2 T422546
- 07:30 marostegui: Removed categorylinks_icu72 from s12 T422546
- 07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P91135 and previous config saved to /var/cache/conftool/dbconfig/20260420-072957-fceratto.json
- 07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P91134 and previous config saved to /var/cache/conftool/dbconfig/20260420-071949-fceratto.json
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
- 07:10 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
- 07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91133 and previous config saved to /var/cache/conftool/dbconfig/20260420-070941-fceratto.json
- 07:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91132 and previous config saved to /var/cache/conftool/dbconfig/20260420-070728-fceratto.json
- 07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2151: repool after maintenance
- 06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2229.codfw.wmnet with OS trixie
- 06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2229: Reimage to Trixie
- 06:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2229: Reimage to Trixie
- 06:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2229.codfw.wmnet with reason: Reimage to Trixie
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2229 T423837', diff saved to https://phabricator.wikimedia.org/P91129 and previous config saved to /var/cache/conftool/dbconfig/20260420-064042-marostegui.json
- 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2214 to s6 primary T423837', diff saved to https://phabricator.wikimedia.org/P91128 and previous config saved to /var/cache/conftool/dbconfig/20260420-064006-marostegui.json
- 06:39 marostegui: Starting s6 codfw failover from db2229 to db2214 - T423837
- 06:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 21 hosts with reason: Primary switchover s6 T423837
- 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2214 with weight 0 T423837', diff saved to https://phabricator.wikimedia.org/P91127 and previous config saved to /var/cache/conftool/dbconfig/20260420-063553-marostegui.json
- 06:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: repool after maintenance
- 06:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2151: after reimage to trixie
- 06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: after reimage to trixie
- 06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS trixie
- 06:06 marostegui: Removed categorylinks_icu72 from s1 and s6 T422546
- 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
- 05:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
- 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS trixie
- 05:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Reimage to Trixie
- 05:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Reimage to Trixie
- 05:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2151.codfw.wmnet with reason: Reimage to Trixie
- 03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
2026-04-19
- 18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 17:50 zabe@deploy1003: Finished scap sync-world: Backport for Temporarily switch back to file read old schema (T423065) (duration: 33m 41s)
- 17:36 zabe@deploy1003: zabe: Continuing with sync
- 17:34 zabe@deploy1003: zabe: Backport for Temporarily switch back to file read old schema (T423065) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:16 zabe@deploy1003: Started scap sync-world: Backport for Temporarily switch back to file read old schema (T423065)
- 16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 06:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1066.eqiad.wmnet with reason: vacuum overlarge container dbs
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 11s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-17
- 23:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
- 23:55 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:31 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
- 23:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
- 23:26 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:25 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:24 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
- 23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 23:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
- 23:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 23:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
- 23:00 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:56 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
- 22:56 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:56 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
- 22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
- 22:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:40 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
- 22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:38 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:35 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
- 22:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
- 22:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
- 22:23 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1070.eqiad.wmnet with OS bookworm
- 22:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
- 22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
- 22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 22:15 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1067.eqiad.wmnet with OS bookworm
- 22:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 22:13 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
- 22:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
- 22:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
- 22:09 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 22:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
- 22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
- 22:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
- 21:58 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 21:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2021.codfw.wmnet with OS trixie
- 21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 21:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 21:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
- 21:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
- 21:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
- 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
- 21:42 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
- 21:39 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:37 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
- 21:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
- 21:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
- 21:35 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:35 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
- 21:35 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
- 21:34 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:34 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
- 21:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 21:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
- 21:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 21:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1056.eqiad.wmnet with OS bookworm
- 21:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
- 21:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
- 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
- 21:16 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
- 21:16 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
- 21:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
- 21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2021.codfw.wmnet with OS trixie
- 21:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['pc2021']
- 21:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pc2021']
- 21:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
- 21:13 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:12 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
- 21:12 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
- 21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
- 21:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1065.eqiad.wmnet with OS bookworm
- 21:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
- 21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
- 21:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
- 21:02 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
- 21:02 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
- 21:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:59 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
- 20:56 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
- 20:55 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
- 20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
- 20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
- 20:54 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:53 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 20:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:48 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
- 20:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
- 20:47 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:46 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
- 20:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
- 20:43 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:43 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:42 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
- 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
- 20:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
- 20:39 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
- 20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
- 20:37 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
- 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:37 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:36 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
- 20:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2024
- 20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2024
- 20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2023
- 20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2023
- 20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2022
- 20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2022
- 20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2021
- 20:33 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2021
- 20:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
- 20:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
- 20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
- 20:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
- 20:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
- 20:28 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
- 20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
- 20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
- 20:28 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:28 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 20:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 20:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
- 20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2253
- 20:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2253
- 20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2252
- 20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2252
- 20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2251
- 20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2251
- 20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2250
- 20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2250
- 20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
- 20:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
- 20:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
- 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
- 20:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
- 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
- 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
- 20:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 20:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 20:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
- 20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
- 20:13 mutante: planet1003, planet2003 - rebooting on ganeti level for T422596
- 20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
- 20:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
- 20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
- 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
- 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
- 20:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
- 20:04 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
- 20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
- 20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
- 19:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
- 19:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:37 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:36 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:33 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:25 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:21 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:21 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 19:21 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 19:17 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 19:17 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 19:17 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 19:12 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91116 and previous config saved to /var/cache/conftool/dbconfig/20260417-172835-fceratto.json
- 17:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P91114 and previous config saved to /var/cache/conftool/dbconfig/20260417-171827-fceratto.json
- 17:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P91113 and previous config saved to /var/cache/conftool/dbconfig/20260417-170819-fceratto.json
- 16:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91112 and previous config saved to /var/cache/conftool/dbconfig/20260417-165811-fceratto.json
- 16:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91111 and previous config saved to /var/cache/conftool/dbconfig/20260417-165559-fceratto.json
- 16:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91110 and previous config saved to /var/cache/conftool/dbconfig/20260417-165544-fceratto.json
- 16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P91108 and previous config saved to /var/cache/conftool/dbconfig/20260417-164536-fceratto.json
- 16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P91107 and previous config saved to /var/cache/conftool/dbconfig/20260417-163528-fceratto.json
- 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91105 and previous config saved to /var/cache/conftool/dbconfig/20260417-162520-fceratto.json
- 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91104 and previous config saved to /var/cache/conftool/dbconfig/20260417-162307-fceratto.json
- 16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91103 and previous config saved to /var/cache/conftool/dbconfig/20260417-162253-fceratto.json
- 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P91102 and previous config saved to /var/cache/conftool/dbconfig/20260417-161245-fceratto.json
- 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91101 and previous config saved to /var/cache/conftool/dbconfig/20260417-160418-fceratto.json
- 16:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P91100 and previous config saved to /var/cache/conftool/dbconfig/20260417-160236-fceratto.json
- 16:02 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 16:01 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:59 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 15:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P91099 and previous config saved to /var/cache/conftool/dbconfig/20260417-155410-fceratto.json
- 15:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91098 and previous config saved to /var/cache/conftool/dbconfig/20260417-155228-fceratto.json
- 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91097 and previous config saved to /var/cache/conftool/dbconfig/20260417-155015-fceratto.json
- 15:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91096 and previous config saved to /var/cache/conftool/dbconfig/20260417-155001-fceratto.json
- 15:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P91095 and previous config saved to /var/cache/conftool/dbconfig/20260417-154402-fceratto.json
- 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P91094 and previous config saved to /var/cache/conftool/dbconfig/20260417-153953-fceratto.json
- 15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91093 and previous config saved to /var/cache/conftool/dbconfig/20260417-153354-fceratto.json
- 15:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P91092 and previous config saved to /var/cache/conftool/dbconfig/20260417-152944-fceratto.json
- 15:27 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) (duration: 06m 51s)
- 15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91091 and previous config saved to /var/cache/conftool/dbconfig/20260417-152620-fceratto.json
- 15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 15:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91090 and previous config saved to /var/cache/conftool/dbconfig/20260417-152549-fceratto.json
- 15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:23 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Continuing with sync
- 15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:22 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512)
- 15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91089 and previous config saved to /var/cache/conftool/dbconfig/20260417-151936-fceratto.json
- 15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91088 and previous config saved to /var/cache/conftool/dbconfig/20260417-151723-fceratto.json
- 15:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 15:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P91087 and previous config saved to /var/cache/conftool/dbconfig/20260417-151541-fceratto.json
- 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P91086 and previous config saved to /var/cache/conftool/dbconfig/20260417-150532-fceratto.json
- 15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91085 and previous config saved to /var/cache/conftool/dbconfig/20260417-150440-fceratto.json
- 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91084 and previous config saved to /var/cache/conftool/dbconfig/20260417-145524-fceratto.json
- 14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P91083 and previous config saved to /var/cache/conftool/dbconfig/20260417-145432-fceratto.json
- 14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91082 and previous config saved to /var/cache/conftool/dbconfig/20260417-144819-fceratto.json
- 14:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P91081 and previous config saved to /var/cache/conftool/dbconfig/20260417-144424-fceratto.json
- 14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91080 and previous config saved to /var/cache/conftool/dbconfig/20260417-144247-fceratto.json
- 14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91079 and previous config saved to /var/cache/conftool/dbconfig/20260417-143416-fceratto.json
- 14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P91078 and previous config saved to /var/cache/conftool/dbconfig/20260417-143238-fceratto.json
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91077 and previous config saved to /var/cache/conftool/dbconfig/20260417-143204-fceratto.json
- 14:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 14:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91076 and previous config saved to /var/cache/conftool/dbconfig/20260417-143139-fceratto.json
- 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P91075 and previous config saved to /var/cache/conftool/dbconfig/20260417-142230-fceratto.json
- 14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P91074 and previous config saved to /var/cache/conftool/dbconfig/20260417-142130-fceratto.json
- 14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91073 and previous config saved to /var/cache/conftool/dbconfig/20260417-141222-fceratto.json
- 14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P91072 and previous config saved to /var/cache/conftool/dbconfig/20260417-141123-fceratto.json
- 14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:09 urandom: decommissioning Cassandra, aqs1011 [a,b] — T412830
- 14:06 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
- 14:06 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: Bootstrapping — T412830
- 14:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
- 14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91071 and previous config saved to /var/cache/conftool/dbconfig/20260417-140454-fceratto.json
- 14:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 14:04 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
- 14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91070 and previous config saved to /var/cache/conftool/dbconfig/20260417-140424-fceratto.json
- 14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:03 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
- 14:02 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
- 14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91069 and previous config saved to /var/cache/conftool/dbconfig/20260417-140115-fceratto.json
- 14:01 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
- 14:00 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
- 14:00 fabfur: restart varnish on cp3069, cp3070, cp3072, cp3073 to clear alerts
- 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91068 and previous config saved to /var/cache/conftool/dbconfig/20260417-140003-fceratto.json
- 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 13:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91067 and previous config saved to /var/cache/conftool/dbconfig/20260417-135938-fceratto.json
- 13:58 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
- 13:57 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
- 13:54 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
- 13:54 fabfur: restarting varnish on cp3066 to clear alerts
- 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P91066 and previous config saved to /var/cache/conftool/dbconfig/20260417-135416-fceratto.json
- 13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P91065 and previous config saved to /var/cache/conftool/dbconfig/20260417-134930-fceratto.json
- 13:44 jmm@dns1004: END - running authdns-update
- 13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P91064 and previous config saved to /var/cache/conftool/dbconfig/20260417-134408-fceratto.json
- 13:43 jmm@dns1004: START - running authdns-update
- 13:42 inflatador: bking@apt1002 sudo -E reprepro -C component/opensearch2 include trixie-wikimedia /home/bking/wmf-opensearch-search-plugins-2.19.5+5-trixie/wmf-opensearch-search-plugins_2.19.5+5_amd64.changes
- 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P91063 and previous config saved to /var/cache/conftool/dbconfig/20260417-133923-fceratto.json
- 13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91062 and previous config saved to /var/cache/conftool/dbconfig/20260417-133359-fceratto.json
- 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91061 and previous config saved to /var/cache/conftool/dbconfig/20260417-132914-fceratto.json
- 13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91060 and previous config saved to /var/cache/conftool/dbconfig/20260417-132802-fceratto.json
- 13:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91059 and previous config saved to /var/cache/conftool/dbconfig/20260417-132738-fceratto.json
- 13:27 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 13:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91058 and previous config saved to /var/cache/conftool/dbconfig/20260417-132628-fceratto.json
- 13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 13:26 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp2001.codfw.wmnet
- 13:22 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp2001.codfw.wmnet
- 13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91057 and previous config saved to /var/cache/conftool/dbconfig/20260417-132034-fceratto.json
- 13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P91056 and previous config saved to /var/cache/conftool/dbconfig/20260417-131730-fceratto.json
- 13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P91055 and previous config saved to /var/cache/conftool/dbconfig/20260417-131026-fceratto.json
- 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P91054 and previous config saved to /var/cache/conftool/dbconfig/20260417-130722-fceratto.json
- 13:07 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp1001.eqiad.wmnet
- 13:00 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp1001.eqiad.wmnet
- 13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P91053 and previous config saved to /var/cache/conftool/dbconfig/20260417-130018-fceratto.json
- 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91052 and previous config saved to /var/cache/conftool/dbconfig/20260417-125714-fceratto.json
- 12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91051 and previous config saved to /var/cache/conftool/dbconfig/20260417-125501-fceratto.json
- 12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91050 and previous config saved to /var/cache/conftool/dbconfig/20260417-125009-fceratto.json
- 12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91049 and previous config saved to /var/cache/conftool/dbconfig/20260417-124149-fceratto.json
- 12:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 12:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91048 and previous config saved to /var/cache/conftool/dbconfig/20260417-124120-fceratto.json
- 12:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P91047 and previous config saved to /var/cache/conftool/dbconfig/20260417-123111-fceratto.json
- 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P91046 and previous config saved to /var/cache/conftool/dbconfig/20260417-122104-fceratto.json
- 12:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91045 and previous config saved to /var/cache/conftool/dbconfig/20260417-121056-fceratto.json
- 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91044 and previous config saved to /var/cache/conftool/dbconfig/20260417-120255-fceratto.json
- 12:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91043 and previous config saved to /var/cache/conftool/dbconfig/20260417-120226-fceratto.json
- 11:55 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 11:54 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 11:53 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 11:53 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P91042 and previous config saved to /var/cache/conftool/dbconfig/20260417-115218-fceratto.json
- 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P91041 and previous config saved to /var/cache/conftool/dbconfig/20260417-114210-fceratto.json
- 11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91040 and previous config saved to /var/cache/conftool/dbconfig/20260417-113201-fceratto.json
- 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91039 and previous config saved to /var/cache/conftool/dbconfig/20260417-112333-fceratto.json
- 11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419961)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260417-112259-fceratto.json
- 11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P91037 and previous config saved to /var/cache/conftool/dbconfig/20260417-111250-fceratto.json
- 11:11 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2002.codfw.wmnet
- 11:11 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:11 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 11:08 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 11:03 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P91036 and previous config saved to /var/cache/conftool/dbconfig/20260417-110242-fceratto.json
- 10:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2002.codfw.wmnet
- 10:54 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2001.codfw.wmnet
- 10:54 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:54 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:53 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419961)', diff saved to https://phabricator.wikimedia.org/P91035 and previous config saved to /var/cache/conftool/dbconfig/20260417-105234-fceratto.json
- 10:48 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T419961)', diff saved to https://phabricator.wikimedia.org/P91034 and previous config saved to /var/cache/conftool/dbconfig/20260417-104327-fceratto.json
- 10:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 10:43 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2001.codfw.wmnet
- 10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91033 and previous config saved to /var/cache/conftool/dbconfig/20260417-104257-fceratto.json
- 10:37 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1002.eqiad.wmnet
- 10:37 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:37 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P91032 and previous config saved to /var/cache/conftool/dbconfig/20260417-103249-fceratto.json
- 10:31 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P91031 and previous config saved to /var/cache/conftool/dbconfig/20260417-102241-fceratto.json
- 10:20 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 10:13 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1002.eqiad.wmnet
- 10:13 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1001.eqiad.wmnet
- 10:13 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:12 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91030 and previous config saved to /var/cache/conftool/dbconfig/20260417-101233-fceratto.json
- 10:11 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
- 10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91029 and previous config saved to /var/cache/conftool/dbconfig/20260417-100401-fceratto.json
- 10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 10:00 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 09:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1001.eqiad.wmnet
- 09:54 marostegui: pool esams
- 09:53 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
- 09:53 marostegui@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
- 09:44 moritzm: initialise eqsin02 Ganeti cluster T421863
- 09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f3-codfw
- 09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f3-codfw
- 09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-f1-codfw
- 09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-f1-codfw
- 09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-e1-codfw
- 09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-e1-codfw
- 09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e1-codfw
- 09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e1-codfw
- 09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e3-codfw
- 09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e3-codfw
- 08:51 topranks: depool esams due to connectivity issues
- 08:51 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
- 08:51 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
- 08:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 08:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 07:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1201: after reimage to trixie
- 07:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 07:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91025 and previous config saved to /var/cache/conftool/dbconfig/20260417-071048-fceratto.json
- 07:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P91023 and previous config saved to /var/cache/conftool/dbconfig/20260417-070039-fceratto.json
- 06:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P91022 and previous config saved to /var/cache/conftool/dbconfig/20260417-065031-fceratto.json
- 06:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: repool after maintenance
- 06:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1201: after reimage to trixie
- 06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS trixie
- 06:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91019 and previous config saved to /var/cache/conftool/dbconfig/20260417-064023-fceratto.json
- 06:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
- 06:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
- 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1201.eqiad.wmnet with OS trixie
- 06:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1201: Reimage to Trixie
- 06:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1201: Reimage to Trixie
- 06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1201.eqiad.wmnet with reason: Reimage to Trixie
- 06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2158: repool after maintenance
- 06:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS trixie
- 05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
- 05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
- 05:16 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS trixie
- 05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: Reimage to Trixie
- 05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2158: Reimage to Trixie
- 05:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2158.codfw.wmnet with reason: Reimage to Trixie
- 04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91013 and previous config saved to /var/cache/conftool/dbconfig/20260417-044543-fceratto.json
- 04:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1263.eqiad.wmnet with reason: Maintenance
- 04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91012 and previous config saved to /var/cache/conftool/dbconfig/20260417-044518-fceratto.json
- 04:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P91011 and previous config saved to /var/cache/conftool/dbconfig/20260417-043510-fceratto.json
- 04:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P91010 and previous config saved to /var/cache/conftool/dbconfig/20260417-042502-fceratto.json
- 04:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91009 and previous config saved to /var/cache/conftool/dbconfig/20260417-041454-fceratto.json
- 02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91008 and previous config saved to /var/cache/conftool/dbconfig/20260417-021624-fceratto.json
- 02:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1262.eqiad.wmnet with reason: Maintenance
- 02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91007 and previous config saved to /var/cache/conftool/dbconfig/20260417-021558-fceratto.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 25s)
- 02:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P91006 and previous config saved to /var/cache/conftool/dbconfig/20260417-020550-fceratto.json
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P91005 and previous config saved to /var/cache/conftool/dbconfig/20260417-015542-fceratto.json
- 01:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91004 and previous config saved to /var/cache/conftool/dbconfig/20260417-014534-fceratto.json
- 00:10 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 00:03 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 00:03 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
2026-04-16
- 23:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91003 and previous config saved to /var/cache/conftool/dbconfig/20260416-235123-fceratto.json
- 23:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1261.eqiad.wmnet with reason: Maintenance
- 23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P91002 and previous config saved to /var/cache/conftool/dbconfig/20260416-235059-fceratto.json
- 23:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P91001 and previous config saved to /var/cache/conftool/dbconfig/20260416-234052-fceratto.json
- 23:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P91000 and previous config saved to /var/cache/conftool/dbconfig/20260416-233044-fceratto.json
- 23:25 musikanimal@deploy1003: Finished scap sync-world: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673) (duration: 06m 35s)
- 23:21 musikanimal@deploy1003: musikanimal: Continuing with sync
- 23:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P90999 and previous config saved to /var/cache/conftool/dbconfig/20260416-232036-fceratto.json
- 23:20 musikanimal@deploy1003: musikanimal: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:18 musikanimal@deploy1003: Started scap sync-world: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673)
- 22:14 James_F: jforrester@deploy1003:/srv/mediawiki-staging$ foreachwikiindblist sul extensions/Wikibase/lib/maintenance/populateSitesTable.php # T423660
- 22:08 cscott@deploy1003: Finished scap sync-world: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534) (duration: 09m 41s)
- 22:04 cscott@deploy1003: cscott: Continuing with sync
- 22:00 cscott@deploy1003: cscott: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534) synced to the testservers (see https://wikitech
- 21:58 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 21:58 cscott@deploy1003: Started scap sync-world: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534)
- 21:57 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 21:33 cscott@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114 (duration: 17m 26s)
- 21:29 cscott@deploy1003: cscott, arlolra, bodhisattwa: Continuing with sync
- 21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P90997 and previous config saved to /var/cache/conftool/dbconfig/20260416-212348-fceratto.json
- 21:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1260.eqiad.wmnet with reason: Maintenance
- 21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90996 and previous config saved to /var/cache/conftool/dbconfig/20260416-212323-fceratto.json
- 21:17 cscott@deploy1003: cscott, arlolra, bodhisattwa: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:16 cscott@deploy1003: Started scap sync-world: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114
- 21:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P90995 and previous config saved to /var/cache/conftool/dbconfig/20260416-211315-fceratto.json
- 21:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P90994 and previous config saved to /var/cache/conftool/dbconfig/20260416-210307-fceratto.json
- 20:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90993 and previous config saved to /var/cache/conftool/dbconfig/20260416-205258-fceratto.json
- 20:51 stran@deploy1003: Finished scap sync-world: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545) (duration: 08m 36s)
- 20:48 stran@deploy1003: aaron, stran, jforrester: Continuing with sync
- 20:44 stran@deploy1003: aaron, stran, jforrester: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:43 stran@deploy1003: Started scap sync-world: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)
- 20:36 stran@deploy1003: Finished scap sync-world: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be overwritten (T423042) (duration: 09m 07s)
- 20:33 stran@deploy1003: stran: Continuing with sync
- 20:29 stran@deploy1003: stran: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be overwritten (T423042) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:27 stran@deploy1003: Started scap sync-world: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be overwritten (T423042)
- 20:17 maryum: Removed private mitigation for T419137
- 20:09 mstyles@deploy1003: Finished scap sync-world: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366) (duration: 06m 06s)
- 20:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90992 and previous config saved to /var/cache/conftool/dbconfig/20260416-200839-fceratto.json
- 20:05 mstyles@deploy1003: mmartorana, mstyles: Continuing with sync
- 20:05 mstyles@deploy1003: mmartorana, mstyles: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:03 mstyles@deploy1003: Started scap sync-world: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366)
- 19:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P90991 and previous config saved to /var/cache/conftool/dbconfig/20260416-195831-fceratto.json
- 19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P90990 and previous config saved to /var/cache/conftool/dbconfig/20260416-194823-fceratto.json
- 19:48 zabe@deploy1003: Finished scap sync-world: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914) (duration: 06m 48s)
- 19:44 zabe@deploy1003: zabe: Continuing with sync
- 19:43 zabe@deploy1003: zabe: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:41 zabe@deploy1003: Started scap sync-world: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914)
- 19:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90989 and previous config saved to /var/cache/conftool/dbconfig/20260416-193814-fceratto.json
- 19:36 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
- 19:34 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
- 19:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90988 and previous config saved to /var/cache/conftool/dbconfig/20260416-193100-fceratto.json
- 19:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
- 19:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90987 and previous config saved to /var/cache/conftool/dbconfig/20260416-193028-fceratto.json
- 19:21 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
- 19:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P90986 and previous config saved to /var/cache/conftool/dbconfig/20260416-192020-fceratto.json
- 19:19 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
- 19:16 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
- 19:15 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
- 19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P90985 and previous config saved to /var/cache/conftool/dbconfig/20260416-191012-fceratto.json
- 19:03 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
- 19:02 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
- 19:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90984 and previous config saved to /var/cache/conftool/dbconfig/20260416-190004-fceratto.json
- 18:59 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90983 and previous config saved to /var/cache/conftool/dbconfig/20260416-185757-fceratto.json
- 18:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1252.eqiad.wmnet with reason: Maintenance
- 18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90982 and previous config saved to /var/cache/conftool/dbconfig/20260416-185731-fceratto.json
- 18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90981 and previous config saved to /var/cache/conftool/dbconfig/20260416-185253-fceratto.json
- 18:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
- 18:52 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90980 and previous config saved to /var/cache/conftool/dbconfig/20260416-185222-fceratto.json
- 18:49 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P90979 and previous config saved to /var/cache/conftool/dbconfig/20260416-184723-fceratto.json
- 18:46 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P90978 and previous config saved to /var/cache/conftool/dbconfig/20260416-184213-fceratto.json
- 18:42 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:39 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P90977 and previous config saved to /var/cache/conftool/dbconfig/20260416-183715-fceratto.json
- 18:36 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P90976 and previous config saved to /var/cache/conftool/dbconfig/20260416-183205-fceratto.json
- 18:32 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.24 refs T420482
- 18:28 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 18:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90975 and previous config saved to /var/cache/conftool/dbconfig/20260416-182707-fceratto.json
- 18:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90974 and previous config saved to /var/cache/conftool/dbconfig/20260416-182157-fceratto.json
- 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90973 and previous config saved to /var/cache/conftool/dbconfig/20260416-181447-fceratto.json
- 18:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
- 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90972 and previous config saved to /var/cache/conftool/dbconfig/20260416-181415-fceratto.json
- 18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P90971 and previous config saved to /var/cache/conftool/dbconfig/20260416-180407-fceratto.json
- 17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P90970 and previous config saved to /var/cache/conftool/dbconfig/20260416-175358-fceratto.json
- 17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90969 and previous config saved to /var/cache/conftool/dbconfig/20260416-174350-fceratto.json
- 17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90968 and previous config saved to /var/cache/conftool/dbconfig/20260416-173640-fceratto.json
- 17:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
- 17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90967 and previous config saved to /var/cache/conftool/dbconfig/20260416-173058-fceratto.json
- 17:28 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:27 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:27 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:27 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:26 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:26 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P90966 and previous config saved to /var/cache/conftool/dbconfig/20260416-172050-fceratto.json
- 17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P90964 and previous config saved to /var/cache/conftool/dbconfig/20260416-171041-fceratto.json
- 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90963 and previous config saved to /var/cache/conftool/dbconfig/20260416-170033-fceratto.json
- 16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90962 and previous config saved to /var/cache/conftool/dbconfig/20260416-165326-fceratto.json
- 16:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
- 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90961 and previous config saved to /var/cache/conftool/dbconfig/20260416-165253-fceratto.json
- 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P90960 and previous config saved to /var/cache/conftool/dbconfig/20260416-164245-fceratto.json
- 16:38 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'enable-puppet "cdanis deploy 8ad070a466 T328872"'
- 16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90959 and previous config saved to /var/cache/conftool/dbconfig/20260416-163800-fceratto.json
- 16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1249.eqiad.wmnet with reason: Maintenance
- 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90958 and previous config saved to /var/cache/conftool/dbconfig/20260416-163736-fceratto.json
- 16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P90957 and previous config saved to /var/cache/conftool/dbconfig/20260416-163237-fceratto.json
- 16:30 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy 8ad070a466 T328872"'
- 16:27 urandom: upgrade envoyproxy, restbase[1031,2024] (canary) — T419637 & T410975
- 16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P90956 and previous config saved to /var/cache/conftool/dbconfig/20260416-162727-fceratto.json
- 16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90955 and previous config saved to /var/cache/conftool/dbconfig/20260416-162229-fceratto.json
- 16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P90953 and previous config saved to /var/cache/conftool/dbconfig/20260416-161719-fceratto.json
- 16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90952 and previous config saved to /var/cache/conftool/dbconfig/20260416-161504-fceratto.json
- 16:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
- 16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90951 and previous config saved to /var/cache/conftool/dbconfig/20260416-161432-fceratto.json
- 16:11 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: Bootstrapping — T412830
- 16:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90950 and previous config saved to /var/cache/conftool/dbconfig/20260416-160710-fceratto.json
- 16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P90949 and previous config saved to /var/cache/conftool/dbconfig/20260416-160424-fceratto.json
- 15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P90948 and previous config saved to /var/cache/conftool/dbconfig/20260416-155416-fceratto.json
- 15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90947 and previous config saved to /var/cache/conftool/dbconfig/20260416-154408-fceratto.json
- 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90946 and previous config saved to /var/cache/conftool/dbconfig/20260416-153547-fceratto.json
- 15:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
- 15:35 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason: T407726
- 15:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 15:35 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 15:34 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 15:34 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 15:31 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 15:30 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 15:29 cdanis: 💔cdanis@cumin1003.eqiad.wmnet ~ 🕦☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy I3aaec0ca T328872"'
- 15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:14 moritzm: installing sequoia-sqv security updates
- 15:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:10 daniel@deploy1003: Finished scap sync-world: Backport for API rate limits: add highlimits-user class (T419796) (duration: 10m 47s)
- 15:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 15:03 daniel@deploy1003: daniel: Continuing with sync
- 15:01 daniel@deploy1003: daniel: Backport for API rate limits: add highlimits-user class (T419796) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:00 root@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw,mr1-codfw IPv6,mr1-codfw.oob with reason: router upgrade
- 14:59 daniel@deploy1003: Started scap sync-world: Backport for API rate limits: add highlimits-user class (T419796)
- 14:58 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr-codfw with reason: router upgrade
- 14:58 papaul: ongoing maintenance on mr1-codfw
- 14:56 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr1-codfw.oob,mr-codfw with reason: router upgrade
- 14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:56 jelto@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host gerrit2002.wikimedia.org
- 14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
- 14:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:29 jforrester@deploy1003: Finished scap sync-world: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311) (duration: 09m 36s)
- 14:25 jforrester@deploy1003: jforrester: Continuing with sync
- 14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:21 jforrester@deploy1003: jforrester: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:20 jforrester@deploy1003: Started scap sync-world: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311)
- 14:18 mlitn@deploy1003: Finished scap sync-world: Backport for fix: add missing hook registration for create account stats (T422283) (duration: 06m 07s)
- 14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90945 and previous config saved to /var/cache/conftool/dbconfig/20260416-141515-fceratto.json
- 14:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1248.eqiad.wmnet with reason: Maintenance
- 14:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90944 and previous config saved to /var/cache/conftool/dbconfig/20260416-141450-fceratto.json
- 14:14 mlitn@deploy1003: mlitn, migr: Continuing with sync
- 14:14 mlitn@deploy1003: mlitn, migr: Backport for fix: add missing hook registration for create account stats (T422283) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:12 mlitn@deploy1003: Started scap sync-world: Backport for fix: add missing hook registration for create account stats (T422283)
- 14:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS trixie
- 14:05 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P90943 and previous config saved to /var/cache/conftool/dbconfig/20260416-140442-fceratto.json
- 14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:01 mlitn@deploy1003: Finished scap sync-world: Backport for siwikitionary: update logo to localised svg version. (T342173) (duration: 07m 11s)
- 14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 13:57 mlitn@deploy1003: mlitn, robertsky: Continuing with sync
- 13:56 mlitn@deploy1003: mlitn, robertsky: Backport for siwikitionary: update logo to localised svg version. (T342173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90942 and previous config saved to /var/cache/conftool/dbconfig/20260416-135549-fceratto.json
- 13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P90941 and previous config saved to /var/cache/conftool/dbconfig/20260416-135434-fceratto.json
- 13:54 mlitn@deploy1003: Started scap sync-world: Backport for siwikitionary: update logo to localised svg version. (T342173)
- 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
- 13:51 mlitn@deploy1003: Finished scap sync-world: Backport for Squashed diff to master (duration: 30m 21s)
- 13:51 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
- 13:49 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
- 13:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
- 13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P90940 and previous config saved to /var/cache/conftool/dbconfig/20260416-134541-fceratto.json
- 13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90939 and previous config saved to /var/cache/conftool/dbconfig/20260416-134426-fceratto.json
- 13:41 urandom: decommissioning Cassandra [a,b] on aqs1010 — T412830
- 13:39 mlitn@deploy1003: mlitn: Continuing with sync
- 13:38 mlitn@deploy1003: mlitn: Backport for Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:38 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90938 and previous config saved to /var/cache/conftool/dbconfig/20260416-133600-ladsgroup.json
- 13:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P90937 and previous config saved to /var/cache/conftool/dbconfig/20260416-133533-fceratto.json
- 13:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS trixie
- 13:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2004.codfw.wmnet with OS trixie
- 13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90936 and previous config saved to /var/cache/conftool/dbconfig/20260416-132551-ladsgroup.json
- 13:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90935 and previous config saved to /var/cache/conftool/dbconfig/20260416-132525-fceratto.json
- 13:23 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason: T407726
- 13:21 mlitn@deploy1003: Started scap sync-world: Backport for Squashed diff to master
- 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90934 and previous config saved to /var/cache/conftool/dbconfig/20260416-131836-fceratto.json
- 13:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
- 13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90933 and previous config saved to /var/cache/conftool/dbconfig/20260416-131806-fceratto.json
- 13:17 Lucas_WMDE: correction, namespaceDupes sahwikisource run was for T423374, my bad
- 13:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
- 13:17 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes sahwikisource --fix # T423273
- 13:16 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374) (duration: 10m 59s)
- 13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90932 and previous config saved to /var/cache/conftool/dbconfig/20260416-131543-ladsgroup.json
- 13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
- 13:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:10 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Continuing with sync
- 13:09 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P90930 and previous config saved to /var/cache/conftool/dbconfig/20260416-130758-fceratto.json
- 13:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90929 and previous config saved to /var/cache/conftool/dbconfig/20260416-130535-ladsgroup.json
- 13:05 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374)
- 13:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P90928 and previous config saved to /var/cache/conftool/dbconfig/20260416-125750-fceratto.json
- 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90927 and previous config saved to /var/cache/conftool/dbconfig/20260416-124742-fceratto.json
- 12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90926 and previous config saved to /var/cache/conftool/dbconfig/20260416-124032-fceratto.json
- 12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
- 12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90925 and previous config saved to /var/cache/conftool/dbconfig/20260416-124001-fceratto.json
- 12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[2001-2002].codfw.wmnet
- 12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
- 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P90924 and previous config saved to /var/cache/conftool/dbconfig/20260416-122953-fceratto.json
- 12:29 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:27 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
- 12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P90923 and previous config saved to /var/cache/conftool/dbconfig/20260416-121945-fceratto.json
- 12:19 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[2001-2002].codfw.wmnet
- 12:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90922 and previous config saved to /var/cache/conftool/dbconfig/20260416-120935-fceratto.json
- 12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1013.eqiad.wmnet
- 12:09 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1013.eqiad.wmnet
- 12:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1013.eqiad.wmnet
- 12:02 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2004.codfw.wmnet with OS trixie
- 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90921 and previous config saved to /var/cache/conftool/dbconfig/20260416-120104-fceratto.json
- 12:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
- 12:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90920 and previous config saved to /var/cache/conftool/dbconfig/20260416-120033-fceratto.json
- 11:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1013.eqiad.wmnet
- 11:53 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90919 and previous config saved to /var/cache/conftool/dbconfig/20260416-115055-fceratto.json
- 11:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1247.eqiad.wmnet with reason: Maintenance
- 11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P90918 and previous config saved to /var/cache/conftool/dbconfig/20260416-115024-fceratto.json
- 11:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 11:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P90916 and previous config saved to /var/cache/conftool/dbconfig/20260416-114014-fceratto.json
- 11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1012.eqiad.wmnet
- 11:38 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1012.eqiad.wmnet
- 11:33 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 11:33 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1012.eqiad.wmnet
- 11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
- 11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
- 11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
- 11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
- 11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
- 11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
- 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90915 and previous config saved to /var/cache/conftool/dbconfig/20260416-113005-fceratto.json
- 11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
- 11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
- 11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
- 11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
- 11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
- 11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
- 11:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1012.eqiad.wmnet
- 11:23 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
- 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90914 and previous config saved to /var/cache/conftool/dbconfig/20260416-112136-fceratto.json
- 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
- 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90913 and previous config saved to /var/cache/conftool/dbconfig/20260416-112105-fceratto.json
- 11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
- 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
- 11:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P90911 and previous config saved to /var/cache/conftool/dbconfig/20260416-111058-fceratto.json
- 11:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:07 moritzm: updating debdeploy on bookworm to 0.0.99.15
- 11:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P90910 and previous config saved to /var/cache/conftool/dbconfig/20260416-110049-fceratto.json
- 10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
- 10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 10:55 moritzm: imported debdeploy 0.0.99.15 for bookworm-wikimedia (compat release for Cumin 6)
- 10:52 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2006.codfw.wmnet
- 10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90909 and previous config saved to /var/cache/conftool/dbconfig/20260416-105040-fceratto.json
- 10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2005.codfw.wmnet
- 10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2004.codfw.wmnet
- 10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90908 and previous config saved to /var/cache/conftool/dbconfig/20260416-104240-fceratto.json
- 10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
- 10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90907 and previous config saved to /var/cache/conftool/dbconfig/20260416-104201-fceratto.json
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P90906 and previous config saved to /var/cache/conftool/dbconfig/20260416-103152-fceratto.json
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P90905 and previous config saved to /var/cache/conftool/dbconfig/20260416-102143-fceratto.json
- 10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
- 10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90904 and previous config saved to /var/cache/conftool/dbconfig/20260416-101514-fceratto.json
- 10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90903 and previous config saved to /var/cache/conftool/dbconfig/20260416-101135-fceratto.json
- 10:09 jynus: backup1014 returns from maintenance, backups and recovery can flow as usual T421719
- 10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P90902 and previous config saved to /var/cache/conftool/dbconfig/20260416-100505-fceratto.json
- 09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P90901 and previous config saved to /var/cache/conftool/dbconfig/20260416-095455-fceratto.json
- 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
- 09:52 moritzm: installing qemu security updates
- 09:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1014
- 09:47 jynus@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1014
- 09:45 jynus@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1014
- 09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:45 jynus@cumin1003: START - Cookbook sre.dns.wipe-cache backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:45 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
- 09:45 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
- 09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90900 and previous config saved to /var/cache/conftool/dbconfig/20260416-094436-fceratto.json
- 09:44 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
- 09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2042.codfw.wmnet
- 09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2041.codfw.wmnet
- 09:41 jynus@cumin1003: START - Cookbook sre.dns.netbox
- 09:40 jynus@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1014
- 09:37 moritzm: installing imagemagick security updates
- 09:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 09:29 jynus: setting backup1014 in maintenance, no backup or recovery will run while it is in maintenance T421719
- 09:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:24 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 09:20 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 09:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:18 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2169: repool after maintenance
- 09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1007
- 09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1007
- 09:15 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1007
- 09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:15 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
- 09:14 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
- 09:13 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 09:13 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90898 and previous config saved to /var/cache/conftool/dbconfig/20260416-091115-fceratto.json
- 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
- 09:11 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 09:10 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
- 09:03 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.move-vlan (exit_code=99) for host backup1007
- 09:03 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
- 08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
- 08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
- 08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
- 08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
- 08:51 jmm@dns1004: END - running authdns-update
- 08:50 jmm@dns1004: START - running authdns-update
- 08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90895 and previous config saved to /var/cache/conftool/dbconfig/20260416-084331-fceratto.json
- 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1007,1014].eqiad.wmnet with reason: maintenance
- 08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P90894 and previous config saved to /var/cache/conftool/dbconfig/20260416-083323-fceratto.json
- 08:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2169: repool after maintenance
- 08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS trixie
- 08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P90892 and previous config saved to /var/cache/conftool/dbconfig/20260416-082314-fceratto.json
- 08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90891 and previous config saved to /var/cache/conftool/dbconfig/20260416-081305-fceratto.json
- 08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
- 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90890 and previous config saved to /var/cache/conftool/dbconfig/20260416-080445-fceratto.json
- 08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
- 08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90889 and previous config saved to /var/cache/conftool/dbconfig/20260416-080420-fceratto.json
- 08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
- 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90888 and previous config saved to /var/cache/conftool/dbconfig/20260416-075522-fceratto.json
- 07:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1244.eqiad.wmnet with reason: Maintenance
- 07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90887 and previous config saved to /var/cache/conftool/dbconfig/20260416-075457-fceratto.json
- 07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P90886 and previous config saved to /var/cache/conftool/dbconfig/20260416-075410-fceratto.json
- 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
- 07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
- 07:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
- 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P90885 and previous config saved to /var/cache/conftool/dbconfig/20260416-074448-fceratto.json
- 07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P90884 and previous config saved to /var/cache/conftool/dbconfig/20260416-074402-fceratto.json
- 07:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS trixie
- 07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2169: Reimage to Trixie
- 07:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2169: Reimage to Trixie
- 07:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2169.codfw.wmnet with reason: Reimage to Trixie
- 07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
- 07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
- 07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
- 07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
- 07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P90882 and previous config saved to /var/cache/conftool/dbconfig/20260416-073440-fceratto.json
- 07:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90881 and previous config saved to /var/cache/conftool/dbconfig/20260416-073354-fceratto.json
- 07:33 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
- 07:33 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
- 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
- 07:32 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
- 07:27 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
- 07:27 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
- 07:26 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
- 07:26 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
- 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90880 and previous config saved to /var/cache/conftool/dbconfig/20260416-072650-fceratto.json
- 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
- 07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90879 and previous config saved to /var/cache/conftool/dbconfig/20260416-072432-fceratto.json
- 07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2193: after reimage to trixie
- 07:21 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
- 07:16 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
- 06:59 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
- 06:55 moritzm: imported opensearch-madvise 0.2+deb13u1 to component/opensearch2 of trixie-wikimedia T422860
- 06:40 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
- 06:40 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
- 06:40 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
- 06:40 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
- 06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2193: after reimage to trixie
- 06:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2193.codfw.wmnet with OS trixie
- 06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
- 06:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
- 05:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2193.codfw.wmnet with OS trixie
- 05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2193: Reimage to Trixie
- 05:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2193: Reimage to Trixie
- 05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2193.codfw.wmnet with reason: Reimage to Trixie
- 05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90873 and previous config saved to /var/cache/conftool/dbconfig/20260416-053659-fceratto.json
- 05:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1243.eqiad.wmnet with reason: Maintenance
- 05:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90872 and previous config saved to /var/cache/conftool/dbconfig/20260416-053635-fceratto.json
- 05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts clouddb1019.eqiad.wmnet
- 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:30 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
- 05:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
- 05:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P90871 and previous config saved to /var/cache/conftool/dbconfig/20260416-052626-fceratto.json
- 05:22 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts clouddb1019.eqiad.wmnet
- 05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P90870 and previous config saved to /var/cache/conftool/dbconfig/20260416-051618-fceratto.json
- 05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90869 and previous config saved to /var/cache/conftool/dbconfig/20260416-050609-fceratto.json
- 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90868 and previous config saved to /var/cache/conftool/dbconfig/20260416-031934-fceratto.json
- 03:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1242.eqiad.wmnet with reason: Maintenance
- 03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90867 and previous config saved to /var/cache/conftool/dbconfig/20260416-031910-fceratto.json
- 03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P90866 and previous config saved to /var/cache/conftool/dbconfig/20260416-030902-fceratto.json
- 02:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P90865 and previous config saved to /var/cache/conftool/dbconfig/20260416-025853-fceratto.json
- 02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
- 02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90864 and previous config saved to /var/cache/conftool/dbconfig/20260416-025247-ladsgroup.json
- 02:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90863 and previous config saved to /var/cache/conftool/dbconfig/20260416-024845-fceratto.json
- 02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P90862 and previous config saved to /var/cache/conftool/dbconfig/20260416-024239-ladsgroup.json
- 02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P90861 and previous config saved to /var/cache/conftool/dbconfig/20260416-023231-ladsgroup.json
- 02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90860 and previous config saved to /var/cache/conftool/dbconfig/20260416-022223-ladsgroup.json
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 16s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90859 and previous config saved to /var/cache/conftool/dbconfig/20260416-012755-ladsgroup.json
- 01:27 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90858 and previous config saved to /var/cache/conftool/dbconfig/20260416-012730-ladsgroup.json
- 01:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90857 and previous config saved to /var/cache/conftool/dbconfig/20260416-011722-ladsgroup.json
- 01:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90856 and previous config saved to /var/cache/conftool/dbconfig/20260416-010714-ladsgroup.json
- 01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90855 and previous config saved to /var/cache/conftool/dbconfig/20260416-010218-fceratto.json
- 01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1241.eqiad.wmnet with reason: Maintenance
- 01:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90854 and previous config saved to /var/cache/conftool/dbconfig/20260416-010154-fceratto.json
- 00:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90853 and previous config saved to /var/cache/conftool/dbconfig/20260416-005706-ladsgroup.json
- 00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P90852 and previous config saved to /var/cache/conftool/dbconfig/20260416-005146-fceratto.json
- 00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P90851 and previous config saved to /var/cache/conftool/dbconfig/20260416-004138-fceratto.json
- 00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90850 and previous config saved to /var/cache/conftool/dbconfig/20260416-003130-fceratto.json
2026-04-15
- 23:35 cscott@deploy1003: Finished scap sync-world: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102) (duration: 32m 47s)
- 23:23 cscott@deploy1003: cscott: Continuing with sync
- 23:20 cscott@deploy1003: cscott: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:05 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 23:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 23:03 cscott@deploy1003: Started scap sync-world: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102)
- 22:57 cscott@deploy1003: Finished scap sync-world: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435) (duration: 16m 00s)
- 22:53 cscott@deploy1003: cscott: Continuing with sync
- 22:43 cscott@deploy1003: cscott: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90849 and previous config saved to /var/cache/conftool/dbconfig/20260415-224305-fceratto.json
- 22:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
- 22:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90848 and previous config saved to /var/cache/conftool/dbconfig/20260415-224241-fceratto.json
- 22:41 cscott@deploy1003: Started scap sync-world: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435)
- 22:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P90847 and previous config saved to /var/cache/conftool/dbconfig/20260415-223233-fceratto.json
- 22:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P90846 and previous config saved to /var/cache/conftool/dbconfig/20260415-222225-fceratto.json
- 22:15 jforrester@deploy1003: Finished scap sync-world: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515) (duration: 08m 48s)
- 22:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90845 and previous config saved to /var/cache/conftool/dbconfig/20260415-221216-fceratto.json
- 22:11 jforrester@deploy1003: jforrester: Continuing with sync
- 22:08 jforrester@deploy1003: jforrester: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:06 jforrester@deploy1003: Started scap sync-world: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515)
- 21:29 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1027.eqiad.wmnet
- 21:29 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1027.eqiad.wmnet
- 21:14 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 21:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 21:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 21:13 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs T420482
- 21:13 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 21:12 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:12 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:07 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 21:06 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 21:06 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 21:06 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
- 21:06 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 21:05 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 21:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 21:05 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:05 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:04 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 21:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:04 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 21:03 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 21:03 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 21:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 20:58 jforrester@deploy1003: Finished scap sync-world: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515) (duration: 06m 08s)
- 20:54 jforrester@deploy1003: jforrester: Continuing with sync
- 20:54 jforrester@deploy1003: jforrester: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:52 jforrester@deploy1003: Started scap sync-world: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515)
- 20:46 jforrester@deploy1003: Finished scap sync-world: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707) (duration: 09m 15s)
- 20:42 jforrester@deploy1003: jforrester, bawolff, pppery: Continuing with sync
- 20:42 topranks: enable BGP over GRE between cr1-drmrs and cr2-eqiad
- 20:38 jforrester@deploy1003: jforrester, bawolff, pppery: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:37 jforrester@deploy1003: Started scap sync-world: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707)
- 20:36 cmooney@dns2005: END - running authdns-update
- 20:35 cmooney@dns2005: START - running authdns-update
- 20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
- 20:33 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
- 20:31 mstyles@deploy1003: Finished scap sync-world: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007) (duration: 07m 48s)
- 20:30 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 20:27 mstyles@deploy1003: mstyles: Continuing with sync
- 20:26 topranks: enable ospf on GRE cr1-drmrs <-> cr2-eqiad
- 20:25 mstyles@deploy1003: mstyles: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:23 mstyles@deploy1003: Started scap sync-world: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007)
- 20:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 20:19 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 20:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90844 and previous config saved to /var/cache/conftool/dbconfig/20260415-201700-fceratto.json
- 20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
- 20:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90843 and previous config saved to /var/cache/conftool/dbconfig/20260415-201613-fceratto.json
- 20:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P90842 and previous config saved to /var/cache/conftool/dbconfig/20260415-200605-fceratto.json
- 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
- 20:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
- 19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P90841 and previous config saved to /var/cache/conftool/dbconfig/20260415-195556-fceratto.json
- 19:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 19:55 topranks: add static routes on cr1-drmrs and cr2-eqiad for arelion GRE far-side IPv4 addresses
- 19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90840 and previous config saved to /var/cache/conftool/dbconfig/20260415-194548-fceratto.json
- 19:38 topranks: add GRE tunnel to cr2-eqiad towards cr1-drmrs
- 19:37 topranks: add GRE tunnel to cr1-drmrs towards cr2-eqiad
- 18:50 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
- 18:43 dduvall: rolling back due to steady `Term with languageCode "en" not found` errors (cc T420482)
- 18:27 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1027.eqiad.wmnet with reason: Bootstrapping — T412830
- 18:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90839 and previous config saved to /var/cache/conftool/dbconfig/20260415-181833-fceratto.json
- 18:15 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs T420482
- 18:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1115.*
- 18:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P90837 and previous config saved to /var/cache/conftool/dbconfig/20260415-180825-fceratto.json
- 18:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1115.eqiad.wmnet with OS trixie
- 18:01 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
- 17:58 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 17:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P90836 and previous config saved to /var/cache/conftool/dbconfig/20260415-175817-fceratto.json
- 17:58 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 17:57 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
- 17:57 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
- 17:57 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
- 17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
- 17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
- 17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
- 17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
- 17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 17:55 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90835 and previous config saved to /var/cache/conftool/dbconfig/20260415-174808-fceratto.json
- 17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
- 17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/page-analytics: apply
- 17:47 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
- 17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
- 17:46 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
- 17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
- 17:45 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
- 17:45 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
- 17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
- 17:44 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
- 17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
- 17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
- 17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
- 17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
- 17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90834 and previous config saved to /var/cache/conftool/dbconfig/20260415-174236-ladsgroup.json
- 17:42 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
- 17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90833 and previous config saved to /var/cache/conftool/dbconfig/20260415-174212-ladsgroup.json
- 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90832 and previous config saved to /var/cache/conftool/dbconfig/20260415-174107-fceratto.json
- 17:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
- 17:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 17:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90831 and previous config saved to /var/cache/conftool/dbconfig/20260415-174035-fceratto.json
- 17:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
- 17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
- 17:38 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 17:38 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 17:36 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 17:36 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90830 and previous config saved to /var/cache/conftool/dbconfig/20260415-173602-fceratto.json
- 17:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
- 17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90829 and previous config saved to /var/cache/conftool/dbconfig/20260415-173525-fceratto.json
- 17:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
- 17:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P90828 and previous config saved to /var/cache/conftool/dbconfig/20260415-173203-ladsgroup.json
- 17:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P90827 and previous config saved to /var/cache/conftool/dbconfig/20260415-173027-fceratto.json
- 17:29 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
- 17:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P90826 and previous config saved to /var/cache/conftool/dbconfig/20260415-172517-fceratto.json
- 17:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P90825 and previous config saved to /var/cache/conftool/dbconfig/20260415-172155-ladsgroup.json
- 17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P90824 and previous config saved to /var/cache/conftool/dbconfig/20260415-172019-fceratto.json
- 17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P90823 and previous config saved to /var/cache/conftool/dbconfig/20260415-171509-fceratto.json
- 17:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
- 17:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90822 and previous config saved to /var/cache/conftool/dbconfig/20260415-171147-ladsgroup.json
- 17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90821 and previous config saved to /var/cache/conftool/dbconfig/20260415-171011-fceratto.json
- 17:09 kamila@deploy1003: Finished scap sync-world: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546) (duration: 16m 10s)
- 17:05 kamila@deploy1003: kamila: Continuing with sync
- 17:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90820 and previous config saved to /var/cache/conftool/dbconfig/20260415-170501-fceratto.json
- 17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90819 and previous config saved to /var/cache/conftool/dbconfig/20260415-170310-fceratto.json
- 17:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
- 17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90818 and previous config saved to /var/cache/conftool/dbconfig/20260415-170239-fceratto.json
- 16:55 kamila@deploy1003: kamila: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:53 kamila@deploy1003: Started scap sync-world: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546)
- 16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P90817 and previous config saved to /var/cache/conftool/dbconfig/20260415-165231-fceratto.json
- 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P90816 and previous config saved to /var/cache/conftool/dbconfig/20260415-164223-fceratto.json
- 16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90815 and previous config saved to /var/cache/conftool/dbconfig/20260415-163215-fceratto.json
- 16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90814 and previous config saved to /var/cache/conftool/dbconfig/20260415-162513-fceratto.json
- 16:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
- 16:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
- 16:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90813 and previous config saved to /var/cache/conftool/dbconfig/20260415-161936-fceratto.json
- 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P90810 and previous config saved to /var/cache/conftool/dbconfig/20260415-160928-fceratto.json
- 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P90809 and previous config saved to /var/cache/conftool/dbconfig/20260415-155920-fceratto.json
- 15:56 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup1012.eqiad.wmnet
- 15:56 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup1012.eqiad.wmnet
- 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90807 and previous config saved to /var/cache/conftool/dbconfig/20260415-154911-fceratto.json
- 15:43 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171) (duration: 06m 09s)
- 15:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90806 and previous config saved to /var/cache/conftool/dbconfig/20260415-154210-fceratto.json
- 15:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
- 15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90805 and previous config saved to /var/cache/conftool/dbconfig/20260415-154138-fceratto.json
- 15:39 blake@deploy1003: blake: Continuing with sync
- 15:39 blake@deploy1003: blake: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:37 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171)
- 15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
- 15:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P90804 and previous config saved to /var/cache/conftool/dbconfig/20260415-153130-fceratto.json
- 15:31 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
- 15:31 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171) (duration: 06m 19s)
- 15:30 Emperor: update & restart envoy on ms swift frontends T410975 T419637
- 15:30 Emperor: update & restart envoy on thanos frontends T410975 T419637
- 15:27 blake@deploy1003: blake: Continuing with sync
- 15:26 blake@deploy1003: blake: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:26 Emperor: update & restart envoy on apus frontends T410975 T419637
- 15:24 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
- 15:24 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171)
- 15:24 Emperor: update & restart envoy on apus frontends T423065 T382824
- 15:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
- 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P90803 and previous config saved to /var/cache/conftool/dbconfig/20260415-152122-fceratto.json
- 15:19 moritzm: installing Dovecot security updates on mx-out*
- 15:18 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
- 15:18 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171) (duration: 06m 59s)
- 15:14 blake@deploy1003: blake: Continuing with sync
- 15:14 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
- 15:13 blake@deploy1003: blake: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:12 moritzm: installing inetutils security updates
- 15:11 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171)
- 15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90802 and previous config saved to /var/cache/conftool/dbconfig/20260415-151114-fceratto.json
- 15:08 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 15:06 samtar@deploy1003: Finished scap sync-world: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" (duration: 06m 54s)
- 15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90801 and previous config saved to /var/cache/conftool/dbconfig/20260415-150415-fceratto.json
- 15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
- 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90800 and previous config saved to /var/cache/conftool/dbconfig/20260415-150344-fceratto.json
- 15:02 samtar@deploy1003: samtar: Continuing with sync
- 15:02 samtar@deploy1003: samtar: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:59 samtar@deploy1003: Started scap sync-world: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
- 14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90799 and previous config saved to /var/cache/conftool/dbconfig/20260415-145918-fceratto.json
- 14:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
- 14:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
- 14:57 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:57 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 14:56 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 14:56 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 14:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup1012.eqiad.wmnet with reason: maintenance
- 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P90798 and previous config saved to /var/cache/conftool/dbconfig/20260415-145335-fceratto.json
- 14:53 samtar@deploy1003: Finished scap sync-world: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" (duration: 06m 12s)
- 14:52 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1273.eqiad.wmnet
- 14:51 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1273.eqiad.wmnet
- 14:49 samtar@deploy1003: samtar: Continuing with sync
- 14:49 samtar@deploy1003: samtar: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:47 samtar@deploy1003: Started scap sync-world: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
- 14:46 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2280.codfw.wmnet
- 14:45 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
- 14:43 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P90797 and previous config saved to /var/cache/conftool/dbconfig/20260415-144327-fceratto.json
- 14:42 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 14:42 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 14:42 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 14:41 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
- 14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:39 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
- 14:36 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 14:36 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90796 and previous config saved to /var/cache/conftool/dbconfig/20260415-143319-fceratto.json
- 14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90795 and previous config saved to /var/cache/conftool/dbconfig/20260415-142615-fceratto.json
- 14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
- 14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90794 and previous config saved to /var/cache/conftool/dbconfig/20260415-142543-fceratto.json
- 14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P90792 and previous config saved to /var/cache/conftool/dbconfig/20260415-141535-fceratto.json
- 14:06 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P90790 and previous config saved to /var/cache/conftool/dbconfig/20260415-140527-fceratto.json
- 14:04 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
- 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
- 13:56 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
- 13:55 samtar@deploy1003: samtar, codenamenoreste: Continuing with sync
- 13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90789 and previous config saved to /var/cache/conftool/dbconfig/20260415-135519-fceratto.json
- 13:53 samtar@deploy1003: samtar, codenamenoreste: Backport for lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:51 samtar@deploy1003: Started scap sync-world: Backport for lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102)
- 13:51 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
- 13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
- 13:50 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
- 13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
- 13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90788 and previous config saved to /var/cache/conftool/dbconfig/20260415-134704-fceratto.json
- 13:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
- 13:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
- 13:44 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
- 13:44 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
- 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
- 13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
- 13:34 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
- 13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
- 13:29 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
- 13:28 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
- 13:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 13:21 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1026.eqiad.wmnet
- 13:21 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1026.eqiad.wmnet
- 13:19 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 13:18 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 13:17 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
- 13:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90787 and previous config saved to /var/cache/conftool/dbconfig/20260415-131657-fceratto.json
- 13:16 kartik@deploy1003: Finished scap sync-world: Backport for Register ArticleGuidance extension and enable in labs (T423295) (duration: 12m 02s)
- 13:12 kartik@deploy1003: sbisson, kartik: Continuing with sync
- 13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90786 and previous config saved to /var/cache/conftool/dbconfig/20260415-130849-ladsgroup.json
- 13:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90785 and previous config saved to /var/cache/conftool/dbconfig/20260415-130836-ladsgroup.json
- 13:08 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
- 13:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 13:07 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 13:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P90784 and previous config saved to /var/cache/conftool/dbconfig/20260415-130649-fceratto.json
- 13:06 kartik@deploy1003: sbisson, kartik: Backport for Register ArticleGuidance extension and enable in labs (T423295) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:04 kartik@deploy1003: Started scap sync-world: Backport for Register ArticleGuidance extension and enable in labs (T423295)
- 13:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 12:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90783 and previous config saved to /var/cache/conftool/dbconfig/20260415-125828-ladsgroup.json
- 12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P90782 and previous config saved to /var/cache/conftool/dbconfig/20260415-125640-fceratto.json
- 12:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90781 and previous config saved to /var/cache/conftool/dbconfig/20260415-124819-ladsgroup.json
- 12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
- 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90780 and previous config saved to /var/cache/conftool/dbconfig/20260415-124633-fceratto.json
- 12:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:43 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294) (duration: 08m 11s)
- 12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:41 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90779 and previous config saved to /var/cache/conftool/dbconfig/20260415-123937-fceratto.json
- 12:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
- 12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90778 and previous config saved to /var/cache/conftool/dbconfig/20260415-123915-fceratto.json
- 12:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
- 12:38 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:38 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90777 and previous config saved to /var/cache/conftool/dbconfig/20260415-123811-ladsgroup.json
- 12:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90776 and previous config saved to /var/cache/conftool/dbconfig/20260415-123803-fceratto.json
- 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:36 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294)
- 12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:31 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P90775 and previous config saved to /var/cache/conftool/dbconfig/20260415-122907-fceratto.json
- 12:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 12:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P90774 and previous config saved to /var/cache/conftool/dbconfig/20260415-122756-fceratto.json
- 12:27 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:27 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:26 kart_: Updated cxserver to 2026-04-14-071531-production
- 12:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:25 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 12:25 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
- 12:25 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 12:23 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
- 12:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
- 12:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 12:22 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 12:21 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P90773 and previous config saved to /var/cache/conftool/dbconfig/20260415-121859-fceratto.json
- 12:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P90772 and previous config saved to /var/cache/conftool/dbconfig/20260415-121748-fceratto.json
- 12:11 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:11 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90771 and previous config saved to /var/cache/conftool/dbconfig/20260415-120851-fceratto.json
- 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90770 and previous config saved to /var/cache/conftool/dbconfig/20260415-120739-fceratto.json
- 12:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
- 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90769 and previous config saved to /var/cache/conftool/dbconfig/20260415-120331-fceratto.json
- 12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
- 12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90768 and previous config saved to /var/cache/conftool/dbconfig/20260415-120305-fceratto.json
- 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90767 and previous config saved to /var/cache/conftool/dbconfig/20260415-120138-fceratto.json
- 12:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
- 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90766 and previous config saved to /var/cache/conftool/dbconfig/20260415-120117-fceratto.json
- 12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS trixie
- 11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P90765 and previous config saved to /var/cache/conftool/dbconfig/20260415-115257-fceratto.json
- 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P90764 and previous config saved to /var/cache/conftool/dbconfig/20260415-115109-fceratto.json
- 11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P90762 and previous config saved to /var/cache/conftool/dbconfig/20260415-114249-fceratto.json
- 11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P90761 and previous config saved to /var/cache/conftool/dbconfig/20260415-114101-fceratto.json
- 11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90758 and previous config saved to /var/cache/conftool/dbconfig/20260415-113241-fceratto.json
- 11:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90757 and previous config saved to /var/cache/conftool/dbconfig/20260415-113053-fceratto.json
- 11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90756 and previous config saved to /var/cache/conftool/dbconfig/20260415-112937-fceratto.json
- 11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
- 11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90755 and previous config saved to /var/cache/conftool/dbconfig/20260415-112913-fceratto.json
- 11:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90754 and previous config saved to /var/cache/conftool/dbconfig/20260415-112445-fceratto.json
- 11:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
- 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90753 and previous config saved to /var/cache/conftool/dbconfig/20260415-112413-fceratto.json
- 11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P90752 and previous config saved to /var/cache/conftool/dbconfig/20260415-111905-fceratto.json
- 11:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P90751 and previous config saved to /var/cache/conftool/dbconfig/20260415-111405-fceratto.json
- 11:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P90750 and previous config saved to /var/cache/conftool/dbconfig/20260415-110856-fceratto.json
- 11:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 11:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P90749 and previous config saved to /var/cache/conftool/dbconfig/20260415-110357-fceratto.json
- 11:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90748 and previous config saved to /var/cache/conftool/dbconfig/20260415-105848-fceratto.json
- 10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90747 and previous config saved to /var/cache/conftool/dbconfig/20260415-105349-fceratto.json
- 10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90746 and previous config saved to /var/cache/conftool/dbconfig/20260415-105338-fceratto.json
- 10:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
- 10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90745 and previous config saved to /var/cache/conftool/dbconfig/20260415-105314-fceratto.json
- 10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90744 and previous config saved to /var/cache/conftool/dbconfig/20260415-104535-fceratto.json
- 10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
- 10:44 taavi@dns1004: END - running authdns-update
- 10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P90743 and previous config saved to /var/cache/conftool/dbconfig/20260415-104306-fceratto.json
- 10:42 taavi@dns1004: START - running authdns-update
- 10:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS trixie
- 10:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 365 days, 0:00:00 on dborch1001.wikimedia.org with reason: T416582
- 10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P90742 and previous config saved to /var/cache/conftool/dbconfig/20260415-103258-fceratto.json
- 10:29 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2069
- 10:29 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2069.codfw.wmnet with OS bullseye
- 10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90741 and previous config saved to /var/cache/conftool/dbconfig/20260415-102250-fceratto.json
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90740 and previous config saved to /var/cache/conftool/dbconfig/20260415-101942-fceratto.json
- 10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
- 10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90739 and previous config saved to /var/cache/conftool/dbconfig/20260415-101917-fceratto.json
- 10:10 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
- 10:10 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
- 10:10 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
- 10:10 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
- 10:10 elukey: upgrade spicerack on cumin nodes
- 10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P90738 and previous config saved to /var/cache/conftool/dbconfig/20260415-100908-fceratto.json
- 10:08 elukey: uploaded spicerack_12.4.0 to apt.wikimedia.org bookworm-wikimedia
- 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P90737 and previous config saved to /var/cache/conftool/dbconfig/20260415-095901-fceratto.json
- 09:58 jayme@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wikikube-worker2280.codfw.wmnet with reason: hardware issues
- 09:56 jayme@cumin2002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
- 09:53 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
- 09:53 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 09:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90736 and previous config saved to /var/cache/conftool/dbconfig/20260415-094902-ladsgroup.json
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90735 and previous config saved to /var/cache/conftool/dbconfig/20260415-094852-fceratto.json
- 09:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
- 09:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90734 and previous config saved to /var/cache/conftool/dbconfig/20260415-094831-ladsgroup.json
- 09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90733 and previous config saved to /var/cache/conftool/dbconfig/20260415-094544-fceratto.json
- 09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
- 09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90732 and previous config saved to /var/cache/conftool/dbconfig/20260415-094519-fceratto.json
- 09:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P90731 and previous config saved to /var/cache/conftool/dbconfig/20260415-093823-ladsgroup.json
- 09:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
- 09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P90730 and previous config saved to /var/cache/conftool/dbconfig/20260415-093511-fceratto.json
- 09:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P90729 and previous config saved to /var/cache/conftool/dbconfig/20260415-092815-ladsgroup.json
- 09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P90728 and previous config saved to /var/cache/conftool/dbconfig/20260415-092502-fceratto.json
- 09:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90727 and previous config saved to /var/cache/conftool/dbconfig/20260415-091807-ladsgroup.json
- 09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90726 and previous config saved to /var/cache/conftool/dbconfig/20260415-091454-fceratto.json
- 09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90725 and previous config saved to /var/cache/conftool/dbconfig/20260415-090945-fceratto.json
- 09:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
- 09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90724 and previous config saved to /var/cache/conftool/dbconfig/20260415-090920-fceratto.json
- 08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P90723 and previous config saved to /var/cache/conftool/dbconfig/20260415-085912-fceratto.json
- 08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P90722 and previous config saved to /var/cache/conftool/dbconfig/20260415-084904-fceratto.json
- 08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90721 and previous config saved to /var/cache/conftool/dbconfig/20260415-083857-fceratto.json
- 08:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90720 and previous config saved to /var/cache/conftool/dbconfig/20260415-083547-fceratto.json
- 08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
- 08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90719 and previous config saved to /var/cache/conftool/dbconfig/20260415-083522-fceratto.json
- 08:34 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2069
- 08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P90718 and previous config saved to /var/cache/conftool/dbconfig/20260415-082514-fceratto.json
- 08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P90717 and previous config saved to /var/cache/conftool/dbconfig/20260415-081506-fceratto.json
- 08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90716 and previous config saved to /var/cache/conftool/dbconfig/20260415-080458-fceratto.json
- 08:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90715 and previous config saved to /var/cache/conftool/dbconfig/20260415-080150-fceratto.json
- 08:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
- 08:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90714 and previous config saved to /var/cache/conftool/dbconfig/20260415-075959-fceratto.json
- 07:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P90713 and previous config saved to /var/cache/conftool/dbconfig/20260415-074951-fceratto.json
- 07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P90712 and previous config saved to /var/cache/conftool/dbconfig/20260415-073942-fceratto.json
- 07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
- 07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90711 and previous config saved to /var/cache/conftool/dbconfig/20260415-072935-fceratto.json
- 07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90710 and previous config saved to /var/cache/conftool/dbconfig/20260415-072626-fceratto.json
- 07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
- 07:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
- 07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
- 07:23 Emperor: discard /srv/log/swift/server.log.5.gz on thanos-be2006 to free disk space
- 07:17 Emperor: discard /srv/log/swift/server.log.1 on thanos-be2006 to free disk space
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 14s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90709 and previous config saved to /var/cache/conftool/dbconfig/20260415-015138-ladsgroup.json
- 01:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
- 01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90708 and previous config saved to /var/cache/conftool/dbconfig/20260415-015113-ladsgroup.json
- 01:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P90707 and previous config saved to /var/cache/conftool/dbconfig/20260415-014104-ladsgroup.json
- 01:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P90706 and previous config saved to /var/cache/conftool/dbconfig/20260415-013056-ladsgroup.json
- 01:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90705 and previous config saved to /var/cache/conftool/dbconfig/20260415-012048-ladsgroup.json
- 01:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90704 and previous config saved to /var/cache/conftool/dbconfig/20260415-010004-ladsgroup.json
- 00:59 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 00:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90703 and previous config saved to /var/cache/conftool/dbconfig/20260415-005940-ladsgroup.json
- 00:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90702 and previous config saved to /var/cache/conftool/dbconfig/20260415-004932-ladsgroup.json
- 00:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90701 and previous config saved to /var/cache/conftool/dbconfig/20260415-003923-ladsgroup.json
- 00:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90700 and previous config saved to /var/cache/conftool/dbconfig/20260415-002915-ladsgroup.json
- 00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) (duration: 06m 41s)
- 00:13 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 00:12 ladsgroup@deploy1003: ladsgroup: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:10 ladsgroup@deploy1003: Started scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637)
2026-04-14
- 23:11 Amir1: optimizing globalblocks table on s7 (T423349)
- 22:44 jasmine@dns1004: END - running authdns-update
- 22:43 jasmine@dns1004: START - running authdns-update
- 21:12 bvibber@deploy1003: Finished scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) (duration: 09m 48s)
- 21:08 bvibber@deploy1003: bvibber: Continuing with sync
- 21:04 bvibber@deploy1003: bvibber: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:02 bvibber@deploy1003: Started scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173)
- 20:57 catrope@deploy1003: Finished scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) (duration: 07m 28s)
- 20:53 catrope@deploy1003: catrope, robertsky: Continuing with sync
- 20:51 catrope@deploy1003: catrope, robertsky: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:49 catrope@deploy1003: Started scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118)
- 20:40 cscott@deploy1003: Finished scap sync-world: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) (duration: 10m 18s)
- 20:36 cscott@deploy1003: cscott: Continuing with sync
- 20:32 cscott@deploy1003: cscott: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:30 cscott@deploy1003: Started scap sync-world: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328)
- 20:16 mstyles@deploy1003: Finished scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) (duration: 09m 25s)
- 20:12 mstyles@deploy1003: mstyles: Continuing with sync
- 20:09 mstyles@deploy1003: mstyles: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:07 mstyles@deploy1003: Started scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007)
- 19:30 swfrench-wmf: applied external-services network policy updates for cassandra-analytics-query-service-storage-[ab]-eqiad (aqs1026) and dumps-wikimedia in wikikube clusters
- 19:27 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 19:27 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 19:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 19:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 19:22 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 19:21 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 19:20 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 19:19 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 19:16 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1026.eqiad.wmnet with reason: Bootstrapping — T412830
- 18:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90699 and previous config saved to /var/cache/conftool/dbconfig/20260414-184440-fceratto.json
- 18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P90698 and previous config saved to /var/cache/conftool/dbconfig/20260414-183432-fceratto.json
- 18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P90697 and previous config saved to /var/cache/conftool/dbconfig/20260414-182424-fceratto.json
- 18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90696 and previous config saved to /var/cache/conftool/dbconfig/20260414-181416-fceratto.json
- 18:11 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
- 18:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
- 18:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for src: Fix typos (duration: 07m 13s)
- 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90695 and previous config saved to /var/cache/conftool/dbconfig/20260414-175927-fceratto.json
- 17:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2216.codfw.wmnet with reason: Maintenance
- 17:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90694 and previous config saved to /var/cache/conftool/dbconfig/20260414-175902-fceratto.json
- 17:58 ladsgroup@deploy1003: ladsgroup: Backport for src: Fix typos synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:56 ladsgroup@deploy1003: Started scap sync-world: Backport for src: Fix typos
- 17:56 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
- 17:51 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2068.codfw.wmnet with OS bullseye
- 17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P90693 and previous config saved to /var/cache/conftool/dbconfig/20260414-174854-fceratto.json
- 17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P90692 and previous config saved to /var/cache/conftool/dbconfig/20260414-173846-fceratto.json
- 17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90691 and previous config saved to /var/cache/conftool/dbconfig/20260414-172838-fceratto.json
- 17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
- 17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
- 17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90690 and previous config saved to /var/cache/conftool/dbconfig/20260414-171246-fceratto.json
- 17:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2203.codfw.wmnet with reason: Maintenance
- 17:07 taavi: updating caprica hostlists on cloud-hosts-in cr firewall policies
- 17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
- 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90689 and previous config saved to /var/cache/conftool/dbconfig/20260414-170010-fceratto.json
- 16:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P90688 and previous config saved to /var/cache/conftool/dbconfig/20260414-165001-fceratto.json
- 16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2001.codfw.wmnet with reason: T421398
- 16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: T421398
- 16:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P90687 and previous config saved to /var/cache/conftool/dbconfig/20260414-163953-fceratto.json
- 16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
- 16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
- 16:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90686 and previous config saved to /var/cache/conftool/dbconfig/20260414-162945-fceratto.json
- 16:20 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info (duration: 08m 27s)
- 16:16 jforrester@deploy1003: jforrester: Continuing with sync
- 16:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 16:13 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90685 and previous config saved to /var/cache/conftool/dbconfig/20260414-161351-fceratto.json
- 16:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
- 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90684 and previous config saved to /var/cache/conftool/dbconfig/20260414-161326-fceratto.json
- 16:12 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info
- 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 16:08 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks (duration: 06m 32s)
- 16:04 jforrester@deploy1003: jforrester: Continuing with sync
- 16:03 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P90683 and previous config saved to /var/cache/conftool/dbconfig/20260414-160319-fceratto.json
- 16:01 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks
- 15:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P90682 and previous config saved to /var/cache/conftool/dbconfig/20260414-155310-fceratto.json
- 15:52 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 15:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 15:45 cdanis@deploy1003: Finished scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client (duration: 08m 24s)
- 15:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90681 and previous config saved to /var/cache/conftool/dbconfig/20260414-154302-fceratto.json
- 15:41 cdanis@deploy1003: cdanis: Continuing with sync
- 15:38 cdanis@deploy1003: cdanis: Backport for SwiftFileBackend: propagate tracing context to HTTP client synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:37 cdanis@deploy1003: Started scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client
- 15:33 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 15:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
- 15:26 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 15:24 jasmine@dns1004: END - running authdns-update
- 15:24 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 15:23 jasmine@dns1004: START - running authdns-update
- 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90680 and previous config saved to /var/cache/conftool/dbconfig/20260414-152156-fceratto.json
- 15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
- 15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90679 and previous config saved to /var/cache/conftool/dbconfig/20260414-152132-fceratto.json
- 15:18 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 15:18 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 15:17 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
- 15:15 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
- 15:13 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 15:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P90678 and previous config saved to /var/cache/conftool/dbconfig/20260414-151123-fceratto.json
- 15:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P90677 and previous config saved to /var/cache/conftool/dbconfig/20260414-150115-fceratto.json
- 14:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 14:56 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90676 and previous config saved to /var/cache/conftool/dbconfig/20260414-145107-fceratto.json
- 14:50 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:49 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:44 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 14:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2214: after reimage to trixie
- 14:36 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 14:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90673 and previous config saved to /var/cache/conftool/dbconfig/20260414-143301-fceratto.json
- 14:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90672 and previous config saved to /var/cache/conftool/dbconfig/20260414-143235-fceratto.json
- 14:26 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:25 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:24 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:22 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P90670 and previous config saved to /var/cache/conftool/dbconfig/20260414-142227-fceratto.json
- 14:18 sukhe@dns1004: END - running authdns-update
- 14:17 sukhe@dns1004: START - running authdns-update
- 14:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
- 14:16 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
- 14:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance finished, T416450]
- 14:16 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance finished, T416450]
- 14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 13 hosts
- 14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 13 hosts
- 14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 12 hosts
- 14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 12 hosts
- 14:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts
- 14:13 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 8 hosts
- 14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P90669 and previous config saved to /var/cache/conftool/dbconfig/20260414-141219-fceratto.json
- 14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90667 and previous config saved to /var/cache/conftool/dbconfig/20260414-140211-fceratto.json
- 13:57 XioNoX: asw1-by27-esams> request system reboot - T416450
- 13:56 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: router upgrade
- 13:55 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-by27-esams,asw1-by27-esams IPv6,asw1-by27-esams.mgmt with reason: router upgrade
- 13:55 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055 (third attempt)
- 13:54 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 12 hosts with reason: router upgrade
- 13:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2214: after reimage to trixie
- 13:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2214.codfw.wmnet with OS trixie
- 13:47 Lucas_WMDE: UTC afternoon backport+config window done
- 13:45 stran@deploy1003: Finished scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) (duration: 07m 05s)
- 13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90665 and previous config saved to /var/cache/conftool/dbconfig/20260414-134416-fceratto.json
- 13:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
- 13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90664 and previous config saved to /var/cache/conftool/dbconfig/20260414-134350-fceratto.json
- 13:41 stran@deploy1003: stran: Continuing with sync
- 13:40 stran@deploy1003: stran: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:38 stran@deploy1003: Started scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216)
- 13:36 XioNoX: asw1-bw27-esams> request system reboot - T416450
- 13:35 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-bw27-esams,asw1-bw27-esams IPv6,asw1-bw27-esams.mgmt with reason: router upgrade
- 13:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 13:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P90663 and previous config saved to /var/cache/conftool/dbconfig/20260414-133342-fceratto.json
- 13:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: router upgrade
- 13:31 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 13:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
- 13:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
- 13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P90662 and previous config saved to /var/cache/conftool/dbconfig/20260414-132334-fceratto.json
- 13:23 Amir1: on testcommonswiki drop table if exists categorylinks; drop table if exists externallinks; drop table if exists linktarget; drop table if exists collation; drop table if exists imagelinks; drop table if exists iwlinks; drop table if exists existencelinks; (T421914)
- 13:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) (duration: 09m 27s)
- 13:16 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Continuing with sync
- 13:15 XioNoX: cr2-esams - request vmhost reboot - T416450
- 13:14 elukey: disable cert-renewal on wikikube staging clusters as a test for the PKI discovery intermediate rollout - To rollback, revert: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1270873 - T420993
- 13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90661 and previous config saved to /var/cache/conftool/dbconfig/20260414-131326-fceratto.json
- 13:13 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
- 13:12 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
- 13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 13:12 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:12 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
- 13:12 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:12 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2068
- 13:12 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
- 13:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252)
- 13:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr2-esams,cr2-esams IPv6,cr2-esams.mgmt with reason: router upgrade
- 13:06 jmm@dns1004: END - running authdns-update
- 13:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2214.codfw.wmnet with OS trixie
- 13:05 jmm@dns1004: START - running authdns-update
- 13:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2214: Reimage to Trixie
- 13:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2214: Reimage to Trixie
- 13:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2214.codfw.wmnet with reason: Reimage to Trixie
- 13:01 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
- 12:59 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90659 and previous config saved to /var/cache/conftool/dbconfig/20260414-125642-fceratto.json
- 12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
- 12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90658 and previous config saved to /var/cache/conftool/dbconfig/20260414-125628-fceratto.json
- 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P90657 and previous config saved to /var/cache/conftool/dbconfig/20260414-124620-fceratto.json
- 12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P90656 and previous config saved to /var/cache/conftool/dbconfig/20260414-123611-fceratto.json
- 12:35 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 12:35 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
- 12:34 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
- 12:33 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
- 12:33 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
- 12:32 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
- 12:28 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
- 12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
- 12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 12:28 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 12:28 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
- 12:27 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
- 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90654 and previous config saved to /var/cache/conftool/dbconfig/20260414-122603-fceratto.json
- 12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance continue, T416450]
- 12:22 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance continue, T416450]
- 12:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance paused, T416450]
- 12:17 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance paused, T416450]
- 12:14 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90653 and previous config saved to /var/cache/conftool/dbconfig/20260414-120812-fceratto.json
- 12:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
- 12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90652 and previous config saved to /var/cache/conftool/dbconfig/20260414-120747-fceratto.json
- 12:03 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
- 12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 12:02 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 12:02 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
- 12:02 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 12:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90650 and previous config saved to /var/cache/conftool/dbconfig/20260414-120200-ladsgroup.json
- 12:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 12:01 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 11:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90649 and previous config saved to /var/cache/conftool/dbconfig/20260414-115752-ladsgroup.json
- 11:57 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90648 and previous config saved to /var/cache/conftool/dbconfig/20260414-115739-fceratto.json
- 11:57 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 11:55 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 11:54 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90647 and previous config saved to /var/cache/conftool/dbconfig/20260414-114732-fceratto.json
- 11:47 ayounsi@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
- 11:46 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance, T416450]
- 11:46 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance, T416450]
- 11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90646 and previous config saved to /var/cache/conftool/dbconfig/20260414-113721-fceratto.json
- 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90645 and previous config saved to /var/cache/conftool/dbconfig/20260414-113510-fceratto.json
- 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90644 and previous config saved to /var/cache/conftool/dbconfig/20260414-113456-fceratto.json
- 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90643 and previous config saved to /var/cache/conftool/dbconfig/20260414-112448-fceratto.json
- 11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: Security updates
- 11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 11:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:24 root@cumin1003: START - Cookbook sre.mysql.pool pool db1153: Security updates
- 11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90641 and previous config saved to /var/cache/conftool/dbconfig/20260414-111440-fceratto.json
- 11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90640 and previous config saved to /var/cache/conftool/dbconfig/20260414-110432-fceratto.json
- 11:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90639 and previous config saved to /var/cache/conftool/dbconfig/20260414-105920-fceratto.json
- 10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Security updates
- 10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:56 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:56 root@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Security updates
- 10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security update
- 10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:54 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security update
- 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin1003.eqiad.wmnet
- 10:24 volans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1005.eqiad.wmnet with reason: Testing cumin v6.0.0
- 10:23 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:23 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:21 volans: install cumin v6.0.0 on cumin1003 (last host remaining to upgrade)
- 10:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1003.eqiad.wmnet
- 10:16 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:14 fceratto@cumin2002: dbctl commit (dc=all): 'Pool in', diff saved to https://phabricator.wikimedia.org/P90636 and previous config saved to /var/cache/conftool/dbconfig/20260414-101428-fceratto.json
- 10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
- 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Fully repool db1168', diff saved to https://phabricator.wikimedia.org/P90635 and previous config saved to /var/cache/conftool/dbconfig/20260414-101119-marostegui.json
- 10:10 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:10 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1168: after reimage to trixie
- 10:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90634 and previous config saved to /var/cache/conftool/dbconfig/20260414-100942-fceratto.json
- 10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1168: after reimage to trixie
- 10:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS trixie
- 10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:04 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:03 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90632 and previous config saved to /var/cache/conftool/dbconfig/20260414-095934-fceratto.json
- 09:56 elukey: rotated debmonitor client and server certs fleetwide for intermediate certs rotation - T420993
- 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90631 and previous config saved to /var/cache/conftool/dbconfig/20260414-094926-fceratto.json
- 09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
- 09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
- 09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
- 09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
- 09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
- 09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
- 09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2144.codfw.wmnet with reason: T419961
- 09:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1151.eqiad.wmnet with reason: T419961
- 09:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
- 09:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 09:32 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
- 09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90627 and previous config saved to /var/cache/conftool/dbconfig/20260414-093204-fceratto.json
- 09:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
- 09:31 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
- 09:31 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
- 09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90625 and previous config saved to /var/cache/conftool/dbconfig/20260414-093138-fceratto.json
- 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Test depool
- 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:29 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:29 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Test depool
- 09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
- 09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:27 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:27 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1006
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1006
- 09:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1006
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
- 09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
- 09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90623 and previous config saved to /var/cache/conftool/dbconfig/20260414-092130-fceratto.json
- 09:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 09:17 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1006
- 09:12 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1006-1007,1014].eqiad.wmnet with reason: maintenance
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90621 and previous config saved to /var/cache/conftool/dbconfig/20260414-091122-fceratto.json
- 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90620 and previous config saved to /var/cache/conftool/dbconfig/20260414-090112-fceratto.json
- 08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2180: repool after maintenance
- 08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90618 and previous config saved to /var/cache/conftool/dbconfig/20260414-084353-fceratto.json
- 08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
- 08:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
- 08:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
- 08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
- 08:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
- 08:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
- 08:25 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2068
- 08:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
- 08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
- 08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
- 08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:20 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
- 08:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS trixie
- 08:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1168: Reimage to Trixie
- 08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1168: Reimage to Trixie
- 08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1168.eqiad.wmnet with reason: Reimage to Trixie
- 08:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 08:04 moritzm: installing libnginx-mod-http-lua security updates
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: T419961
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: T419961
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: T419961
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: T419961
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
- 08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
- 08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2180: repool after maintenance
- 08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2180.codfw.wmnet with OS trixie
- 07:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
- 07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: after upgrade
- 07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc1012.eqiad.wmnet with reason: T419961
- 07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc2012.codfw.wmnet with reason: T419961
- 07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2068
- 07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2068
- 07:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2068
- 07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
- 07:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
- 07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
- 07:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 07:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2068
- 07:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
- 07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
- 07:22 mszwarc@deploy1003: Finished scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) (duration: 12m 36s)
- 07:16 mszwarc@deploy1003: mszwarc: Continuing with sync
- 07:15 mszwarc@deploy1003: mszwarc: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2180.codfw.wmnet with OS trixie
- 07:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2180: Reimage to Trixie
- 07:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2180: Reimage to Trixie
- 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
- 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
- 07:10 mszwarc@deploy1003: Started scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118)
- 07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
- 07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
- 07:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1180: after upgrade
- 06:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2217: repool after reimage to trixie
- 06:57 jmm@dns1004: END - running authdns-update
- 06:56 jmm@dns1004: START - running authdns-update
- 06:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
- 06:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
- 06:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
- 06:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
- 06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 06:30 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS trixie
- 06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 06:27 jmm@dns1004: END - running authdns-update
- 06:25 jmm@dns1004: START - running authdns-update
- 06:22 jmm@dns1004: END - running authdns-update
- 06:20 jmm@dns1004: START - running authdns-update
- 06:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2217: repool after reimage to trixie
- 06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2217.codfw.wmnet with OS trixie
- 06:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
- 06:02 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
- 06:02 jmm@dns1004: END - running authdns-update
- 06:00 jmm@dns1004: START - running authdns-update
- 05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
- 05:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
- 05:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS trixie
- 05:46 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1180: Upgrade package
- 05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1180.eqiad.wmnet with reason: Reimage to Trixie
- 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1180: Upgrade package
- 05:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2217.codfw.wmnet with OS trixie
- 05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2217: Reimage
- 05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2217: Reimage
- 05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2217.codfw.wmnet with reason: Reimage to Trixie
- 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.21 (duration: 02m 34s)
- 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482 (duration: 35m 44s)
- 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482
- 00:57 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Work done
- 00:51 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1025.eqiad.wmnet
- 00:51 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1025.eqiad.wmnet
- 00:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: Work done
- 00:09 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Work done
- 00:08 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
- 00:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync
2026-04-13
- 23:54 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: sync
- 23:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: sync
- 23:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
- 23:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync
- 23:49 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db2208: Work done
- 23:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
- 23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
- 22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
- 22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.*
- 22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
- 22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
- 22:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
- 22:15 sbassett@deploy1003: Finished scap sync-world: Deployed security fix for T422085 (duration: 30m 14s)
- 22:08 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
- 22:08 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
- 22:04 brett@dns1006: END - running authdns-update
- 22:04 swfrench-wmf: applied pending external-services network policy diffs for aqs1025 in wikikube clusters
- 22:03 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 22:02 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 22:02 brett@dns1006: START - running authdns-update
- 21:56 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 21:55 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 21:55 brett@cumin2002: END (FAIL) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=1) Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
- 21:55 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 21:54 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 21:53 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 21:52 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 21:44 sbassett@deploy1003: Started scap sync-world: Deployed security fix for T422085
- 21:41 sbassett: Deployed security patch for T418533
- 21:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90589 and previous config saved to /var/cache/conftool/dbconfig/20260413-211606-ladsgroup.json
- 21:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
- 21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
- 21:08 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1025.eqiad.wmnet with reason: Bootstrapping — T412830
- 20:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 20:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90588 and previous config saved to /var/cache/conftool/dbconfig/20260413-205531-fceratto.json
- 20:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P90587 and previous config saved to /var/cache/conftool/dbconfig/20260413-204523-fceratto.json
- 20:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P90586 and previous config saved to /var/cache/conftool/dbconfig/20260413-203514-fceratto.json
- 20:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
- 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
- 20:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90585 and previous config saved to /var/cache/conftool/dbconfig/20260413-202506-fceratto.json
- 20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90584 and previous config saved to /var/cache/conftool/dbconfig/20260413-202201-fceratto.json
- 20:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2195.codfw.wmnet with reason: Maintenance
- 20:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90583 and previous config saved to /var/cache/conftool/dbconfig/20260413-202137-fceratto.json
- 20:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P90582 and previous config saved to /var/cache/conftool/dbconfig/20260413-201130-fceratto.json
- 20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P90581 and previous config saved to /var/cache/conftool/dbconfig/20260413-200122-fceratto.json
- 20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
- 20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
- 19:56 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet
- 19:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90580 and previous config saved to /var/cache/conftool/dbconfig/20260413-195113-fceratto.json
- 19:49 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1003.eqiad.wmnet
- 19:49 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet
- 19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90579 and previous config saved to /var/cache/conftool/dbconfig/20260413-194759-fceratto.json
- 19:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
- 19:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90578 and previous config saved to /var/cache/conftool/dbconfig/20260413-194734-fceratto.json
- 19:46 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
- 19:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
- 19:42 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1002.eqiad.wmnet
- 19:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet
- 19:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
- 19:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P90577 and previous config saved to /var/cache/conftool/dbconfig/20260413-193726-fceratto.json
- 19:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
- 19:35 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1001.eqiad.wmnet
- 19:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P90576 and previous config saved to /var/cache/conftool/dbconfig/20260413-192715-fceratto.json
- 19:25 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
- 19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90575 and previous config saved to /var/cache/conftool/dbconfig/20260413-191707-fceratto.json
- 19:14 swfrench-wmf: applied aqs cassandra host list changes from https://gerrit.wikimedia.org/r/1270496 - T423168
- 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90574 and previous config saved to /var/cache/conftool/dbconfig/20260413-191355-fceratto.json
- 19:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
- 19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90573 and previous config saved to /var/cache/conftool/dbconfig/20260413-191330-fceratto.json
- 19:12 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
- 19:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
- 19:11 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
- 19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
- 19:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
- 19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
- 19:09 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
- 19:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
- 19:08 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
- 19:08 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
- 19:07 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
- 19:07 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
- 19:06 zabe@deploy1003: Finished scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" (duration: 05m 51s)
- 19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P90572 and previous config saved to /var/cache/conftool/dbconfig/20260413-190322-fceratto.json
- 19:02 zabe@deploy1003: zabe: Continuing with sync
- 19:02 zabe@deploy1003: zabe: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 19:00 zabe@deploy1003: Started scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file"
- 18:55 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
- 18:54 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 18:53 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
- 18:53 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P90571 and previous config saved to /var/cache/conftool/dbconfig/20260413-185314-fceratto.json
- 18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
- 18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
- 18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
- 18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
- 18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
- 18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
- 18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
- 18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
- 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
- 18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
- 18:44 zabe@deploy1003: Sync cancelled.
- 18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90570 and previous config saved to /var/cache/conftool/dbconfig/20260413-184305-fceratto.json
- 18:41 zabe@deploy1003: zabe: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:40 zabe@deploy1003: Started scap sync-world: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946)
- 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90569 and previous config saved to /var/cache/conftool/dbconfig/20260413-183953-fceratto.json
- 18:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
- 18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90568 and previous config saved to /var/cache/conftool/dbconfig/20260413-183927-fceratto.json
- 18:37 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
- 18:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
- 18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1018: Security updates
- 18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 18:30 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 18:30 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1018: Security updates
- 18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P90566 and previous config saved to /var/cache/conftool/dbconfig/20260413-182919-fceratto.json
- 18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P90565 and previous config saved to /var/cache/conftool/dbconfig/20260413-181911-fceratto.json
- 18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90564 and previous config saved to /var/cache/conftool/dbconfig/20260413-180902-fceratto.json
- 18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90563 and previous config saved to /var/cache/conftool/dbconfig/20260413-180551-fceratto.json
- 18:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
- 18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90562 and previous config saved to /var/cache/conftool/dbconfig/20260413-180525-fceratto.json
- 18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1018: Security updates
- 18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 18:04 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 18:04 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1018: Security updates
- 17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P90560 and previous config saved to /var/cache/conftool/dbconfig/20260413-175517-fceratto.json
- 17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
- 17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
- 17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
- 17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
- 17:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P90559 and previous config saved to /var/cache/conftool/dbconfig/20260413-174509-fceratto.json
- 17:40 swfrench-wmf: applied latent external-services network policy changes for aqs{1023,1024} - T423168
- 17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90558 and previous config saved to /var/cache/conftool/dbconfig/20260413-173501-fceratto.json
- 17:34 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1017: Security updates
- 17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 17:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 17:33 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1017: Security updates
- 17:33 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:33 Amir1: dropping templatelinks and pagelinks on testcommonswiki core db (T421914)
- 17:32 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90556 and previous config saved to /var/cache/conftool/dbconfig/20260413-173148-fceratto.json
- 17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
- 17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90555 and previous config saved to /var/cache/conftool/dbconfig/20260413-173123-fceratto.json
- 17:31 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 17:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^6 "Use envoy for swift inside mediawiki" (duration: 07m 31s)
- 17:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:26 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 17:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:24 ladsgroup@deploy1003: ladsgroup: Backport for Revert^6 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:23 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^6 "Use envoy for swift inside mediawiki"
- 17:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P90554 and previous config saved to /var/cache/conftool/dbconfig/20260413-172115-fceratto.json
- 17:20 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 17:19 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 17:19 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 17:18 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 17:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 17:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P90553 and previous config saved to /var/cache/conftool/dbconfig/20260413-171107-fceratto.json
- 17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1017: Security updates
- 17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 17:06 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 17:06 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1017: Security updates
- 17:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729) (duration: 06m 43s)
- 17:03 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 17:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90551 and previous config saved to /var/cache/conftool/dbconfig/20260413-170059-fceratto.json
- 16:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 16:58 ladsgroup@deploy1003: ladsgroup: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90550 and previous config saved to /var/cache/conftool/dbconfig/20260413-165747-fceratto.json
- 16:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
- 16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90549 and previous config saved to /var/cache/conftool/dbconfig/20260413-165721-fceratto.json
- 16:56 ladsgroup@deploy1003: Started scap sync-world: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729)
- 16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P90548 and previous config saved to /var/cache/conftool/dbconfig/20260413-164713-fceratto.json
- 16:46 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate (T423152 T420993)
- 16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 16:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: After Reimage
- 16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 16:44 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate
- 16:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
- 16:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
- 16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P90546 and previous config saved to /var/cache/conftool/dbconfig/20260413-163706-fceratto.json
- 16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Security updates
- 16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 16:35 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 16:35 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Security updates
- 16:35 Amir1: banning non-standard thumbs with external referrer regardless of cache status (T414805)
- 16:28 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1273.eqiad.wmnet
- 16:28 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1273.eqiad.wmnet
- 16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90543 and previous config saved to /var/cache/conftool/dbconfig/20260413-162657-fceratto.json
- 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90542 and previous config saved to /var/cache/conftool/dbconfig/20260413-162344-fceratto.json
- 16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
- 16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90541 and previous config saved to /var/cache/conftool/dbconfig/20260413-162318-fceratto.json
- 16:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
- 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P90539 and previous config saved to /var/cache/conftool/dbconfig/20260413-161310-fceratto.json
- 16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014: Security updates
- 16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 16:07 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 16:07 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1014: Security updates
- 16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P90537 and previous config saved to /var/cache/conftool/dbconfig/20260413-160301-fceratto.json
- 16:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
- 15:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1187: After Reimage
- 15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90535 and previous config saved to /var/cache/conftool/dbconfig/20260413-155253-fceratto.json
- 15:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS trixie
- 15:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2224: After Reimage
- 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90533 and previous config saved to /var/cache/conftool/dbconfig/20260413-154937-fceratto.json
- 15:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
- 15:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: repool after maintenance
- 15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
- 15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1013: Security updates
- 15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 15:37 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 15:37 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1013: Security updates
- 15:36 moritzm: installing postgresql-15 security updates
- 15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90529 and previous config saved to /var/cache/conftool/dbconfig/20260413-153107-ladsgroup.json
- 15:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
- 15:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90527 and previous config saved to /var/cache/conftool/dbconfig/20260413-153042-ladsgroup.json
- 15:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
- 15:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
- 15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
- 15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
- 15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
- 15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
- 15:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P90526 and previous config saved to /var/cache/conftool/dbconfig/20260413-152034-ladsgroup.json
- 15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:10 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS trixie
- 15:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P90522 and previous config saved to /var/cache/conftool/dbconfig/20260413-151027-ladsgroup.json
- 15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013: Security updates
- 15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 15:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 15:09 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1013: Security updates
- 15:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1187: Upgrade package
- 15:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1187.eqiad.wmnet with reason: Reimage to Trixie
- 15:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1187: Upgrade package
- 15:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
- 15:04 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2224: After Reimage
- 15:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
- 15:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 15:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2224.codfw.wmnet with OS trixie
- 15:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90518 and previous config saved to /var/cache/conftool/dbconfig/20260413-150116-fceratto.json
- 15:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1166: repool after maintenance
- 15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90516 and previous config saved to /var/cache/conftool/dbconfig/20260413-150019-ladsgroup.json
- 14:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
- 14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
- 14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
- 14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
- 14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
- 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2069
- 14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2069
- 14:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90514 and previous config saved to /var/cache/conftool/dbconfig/20260413-145028-fceratto.json
- 14:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2069
- 14:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:48 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
- 14:48 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
- 14:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 14:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2069
- 14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
- 14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
- 14:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90513 and previous config saved to /var/cache/conftool/dbconfig/20260413-143939-fceratto.json
- 14:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
- 14:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2070.codfw.wmnet with OS bullseye
- 14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90512 and previous config saved to /var/cache/conftool/dbconfig/20260413-142851-fceratto.json
- 14:22 Lucas_WMDE: UTC afternoon backport+config window done
- 14:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835) (duration: 10m 22s)
- 14:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
- 14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:19 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 14:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2224.codfw.wmnet with OS trixie
- 14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 14:18 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Continuing with sync
- 14:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2224: Reimage
- 14:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2224: Reimage
- 14:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2224.codfw.wmnet with reason: Reimage to Trixie
- 14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: Security updates
- 14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
- 14:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:14 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: Security updates
- 14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: T419961
- 14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90509 and previous config saved to /var/cache/conftool/dbconfig/20260413-141414-fceratto.json
- 14:14 inflatador: bking@apt1002 sudo -E reprepro --ignore=wrongdistribution -C component/opensearch2 include trixie-wikimedia ~/opensearch-madvise-0.2/opensearch-madvise_0.2_amd64.changes T422860
- 14:13 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:13 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: T419961
- 14:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
- 14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: T419961
- 14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 14:13 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be
- 14:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90507 and previous config saved to /var/cache/conftool/dbconfig/20260413-141306-fceratto.json
- 14:12 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 14:12 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: T419961
- 14:11 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835)
- 14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
- 14:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90506 and previous config saved to /var/cache/conftool/dbconfig/20260413-140218-fceratto.json
- 14:01 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes urwikisource --fix # T422824
- 14:00 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824) (duration: 08m 30s)
- 13:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
- 13:56 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Continuing with sync
- 13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:53 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824) synced to the testservers (see https://wikitec
- 13:53 moritzm: installing postgresql-common bugfix updates
- 13:52 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
- 13:52 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824)
- 13:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90505 and previous config saved to /var/cache/conftool/dbconfig/20260413-135129-fceratto.json
- 13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2070
- 13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2070
- 13:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2070
- 13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
- 13:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
- 13:49 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Re-add p-personal id to the user menu (T422885) (duration: 10m 41s)
- 13:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 13:44 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2070
- 13:43 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2070.codfw.wmnet with OS bullseye
- 13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Continuing with sync
- 13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Backport for Re-add p-personal id to the user menu (T422885) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:41 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1006.eqiad.wmnet with OS trixie
- 13:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 13:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90504 and previous config saved to /var/cache/conftool/dbconfig/20260413-134041-fceratto.json
- 13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Re-add p-personal id to the user menu (T422885)
- 13:37 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833) (duration: 34m 09s)
- 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-redacteddb1001
- 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-redacteddb1001
- 13:35 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host an-redacteddb1001
- 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:35 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
- 13:35 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
- 13:26 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90503 and previous config saved to /var/cache/conftool/dbconfig/20260413-132604-fceratto.json
- 13:25 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
- 13:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90502 and previous config saved to /var/cache/conftool/dbconfig/20260413-132457-fceratto.json
- 13:24 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
- 13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
- 13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 13:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 13:24 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
- 13:24 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Continuing with sync
- 13:24 btullis@cumin1003: START - Cookbook sre.dns.netbox
- 13:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
- 13:21 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:20 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-redacteddb1001
- 13:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
- 13:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
- 13:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90501 and previous config saved to /var/cache/conftool/dbconfig/20260413-131408-fceratto.json
- 13:13 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
- 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
- 13:03 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833)
- 13:03 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90500 and previous config saved to /var/cache/conftool/dbconfig/20260413-130320-fceratto.json
- 13:01 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
- 13:00 moritzm: installing libnginx-mod-http-lua security updates
- 12:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
- 12:52 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90499 and previous config saved to /var/cache/conftool/dbconfig/20260413-125231-fceratto.json
- 12:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:38 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
- 12:38 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90498 and previous config saved to /var/cache/conftool/dbconfig/20260413-123801-fceratto.json
- 12:37 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
- 12:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90497 and previous config saved to /var/cache/conftool/dbconfig/20260413-123653-fceratto.json
- 12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
- 12:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90496 and previous config saved to /var/cache/conftool/dbconfig/20260413-122604-fceratto.json
- 12:21 jmm@dns1004: END - running authdns-update
- 12:20 jmm@dns1004: START - running authdns-update
- 12:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90495 and previous config saved to /var/cache/conftool/dbconfig/20260413-121516-fceratto.json
- 12:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90494 and previous config saved to /var/cache/conftool/dbconfig/20260413-120428-fceratto.json
- 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1003.eqiad.wmnet
- 11:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1003.eqiad.wmnet
- 11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2004.codfw.wmnet
- 11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90493 and previous config saved to /var/cache/conftool/dbconfig/20260413-114953-fceratto.json
- 11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
- 11:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2004.codfw.wmnet
- 11:38 jmm@dns1004: END - running authdns-update
- 11:36 jmm@dns1004: START - running authdns-update
- 11:36 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 11:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90492 and previous config saved to /var/cache/conftool/dbconfig/20260413-113630-fceratto.json
- 11:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90491 and previous config saved to /var/cache/conftool/dbconfig/20260413-112541-fceratto.json
- 11:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90490 and previous config saved to /var/cache/conftool/dbconfig/20260413-111452-fceratto.json
- 11:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90489 and previous config saved to /var/cache/conftool/dbconfig/20260413-110405-fceratto.json
- 10:48 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90488 and previous config saved to /var/cache/conftool/dbconfig/20260413-104852-fceratto.json
- 10:48 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 10:47 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90487 and previous config saved to /var/cache/conftool/dbconfig/20260413-104756-fceratto.json
- 10:38 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:38 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:37 vgutierrez@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (T422328)
- 10:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90486 and previous config saved to /var/cache/conftool/dbconfig/20260413-103707-fceratto.json
- 10:34 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:33 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:26 vgutierrez@cumin1003: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (T422328)
- 10:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90485 and previous config saved to /var/cache/conftool/dbconfig/20260413-102619-fceratto.json
- 10:19 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
- 10:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:15 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 10:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90484 and previous config saved to /var/cache/conftool/dbconfig/20260413-101530-fceratto.json
- 10:15 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 10:14 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
- 10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:09 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:09 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:07 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 10:06 blake@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 10:05 blake@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 10:05 blake@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 10:00 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90483 and previous config saved to /var/cache/conftool/dbconfig/20260413-100003-fceratto.json
- 09:59 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 09:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90482 and previous config saved to /var/cache/conftool/dbconfig/20260413-095906-fceratto.json
- 09:49 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90481 and previous config saved to /var/cache/conftool/dbconfig/20260413-094818-fceratto.json
- 09:47 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
- 09:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90480 and previous config saved to /var/cache/conftool/dbconfig/20260413-093729-fceratto.json
- 09:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90479 and previous config saved to /var/cache/conftool/dbconfig/20260413-092640-fceratto.json
- 09:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
- 09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:19 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:19 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
- 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
- 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:17 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
- 09:15 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
- 09:15 root@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
- 09:15 root@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:15 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
- 09:11 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90477 and previous config saved to /var/cache/conftool/dbconfig/20260413-091122-fceratto.json
- 09:10 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 09:10 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90476 and previous config saved to /var/cache/conftool/dbconfig/20260413-091027-fceratto.json
- 08:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90474 and previous config saved to /var/cache/conftool/dbconfig/20260413-085938-fceratto.json
- 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
- 08:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90473 and previous config saved to /var/cache/conftool/dbconfig/20260413-084850-fceratto.json
- 08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
- 08:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90471 and previous config saved to /var/cache/conftool/dbconfig/20260413-083801-fceratto.json
- 08:22 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90470 and previous config saved to /var/cache/conftool/dbconfig/20260413-082233-fceratto.json
- 08:21 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 08:10 taavi@dns1004: END - running authdns-update
- 08:09 taavi@dns1004: START - running authdns-update
- 08:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
- 07:40 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 07:35 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
- 07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 07:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
- 07:09 moritzm: installing openssh security updates
- 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90469 and previous config saved to /var/cache/conftool/dbconfig/20260413-055130-ladsgroup.json
- 05:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
- 05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90468 and previous config saved to /var/cache/conftool/dbconfig/20260413-055106-ladsgroup.json
- 05:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P90467 and previous config saved to /var/cache/conftool/dbconfig/20260413-054100-ladsgroup.json
- 05:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P90466 and previous config saved to /var/cache/conftool/dbconfig/20260413-053050-ladsgroup.json
- 05:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90465 and previous config saved to /var/cache/conftool/dbconfig/20260413-052042-ladsgroup.json
- 03:34 TimStarling: on gerrit2003 restarted gerrit T423027
2026-04-12
- 21:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
- 21:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90464 and previous config saved to /var/cache/conftool/dbconfig/20260412-212043-ladsgroup.json
- 21:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90463 and previous config saved to /var/cache/conftool/dbconfig/20260412-211036-ladsgroup.json
- 21:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90462 and previous config saved to /var/cache/conftool/dbconfig/20260412-210028-ladsgroup.json
- 20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90461 and previous config saved to /var/cache/conftool/dbconfig/20260412-205525-ladsgroup.json
- 20:55 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
- 20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90460 and previous config saved to /var/cache/conftool/dbconfig/20260412-205500-ladsgroup.json
- 20:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90459 and previous config saved to /var/cache/conftool/dbconfig/20260412-205020-ladsgroup.json
- 20:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P90458 and previous config saved to /var/cache/conftool/dbconfig/20260412-204451-ladsgroup.json
- 20:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P90457 and previous config saved to /var/cache/conftool/dbconfig/20260412-203443-ladsgroup.json
- 20:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90456 and previous config saved to /var/cache/conftool/dbconfig/20260412-202435-ladsgroup.json
- 14:32 cgoubert@dns2004: START - running authdns-update
- 11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90452 and previous config saved to /var/cache/conftool/dbconfig/20260412-115148-ladsgroup.json
- 11:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
- 11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90451 and previous config saved to /var/cache/conftool/dbconfig/20260412-115124-ladsgroup.json
- 11:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P90450 and previous config saved to /var/cache/conftool/dbconfig/20260412-114116-ladsgroup.json
- 11:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P90449 and previous config saved to /var/cache/conftool/dbconfig/20260412-113108-ladsgroup.json
- 11:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90448 and previous config saved to /var/cache/conftool/dbconfig/20260412-112100-ladsgroup.json
- 07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90447 and previous config saved to /var/cache/conftool/dbconfig/20260412-070649-ladsgroup.json
- 07:06 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
- 07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90446 and previous config saved to /var/cache/conftool/dbconfig/20260412-070624-ladsgroup.json
- 06:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90445 and previous config saved to /var/cache/conftool/dbconfig/20260412-065616-ladsgroup.json
- 06:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90444 and previous config saved to /var/cache/conftool/dbconfig/20260412-064608-ladsgroup.json
- 06:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90443 and previous config saved to /var/cache/conftool/dbconfig/20260412-063600-ladsgroup.json
- 02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90442 and previous config saved to /var/cache/conftool/dbconfig/20260412-024415-ladsgroup.json
- 02:44 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 19s)
- 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-11
- 22:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16735
- 22:38 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
- 22:38 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 16735
- 22:37 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
- 18:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90441 and previous config saved to /var/cache/conftool/dbconfig/20260411-185048-fceratto.json
- 18:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P90440 and previous config saved to /var/cache/conftool/dbconfig/20260411-184000-fceratto.json
- 18:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P90439 and previous config saved to /var/cache/conftool/dbconfig/20260411-182912-fceratto.json
- 18:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90438 and previous config saved to /var/cache/conftool/dbconfig/20260411-181823-fceratto.json
- 17:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90437 and previous config saved to /var/cache/conftool/dbconfig/20260411-172321-ladsgroup.json
- 17:23 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
- 17:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90436 and previous config saved to /var/cache/conftool/dbconfig/20260411-172257-ladsgroup.json
- 17:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90435 and previous config saved to /var/cache/conftool/dbconfig/20260411-171248-ladsgroup.json
- 17:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90434 and previous config saved to /var/cache/conftool/dbconfig/20260411-170240-ladsgroup.json
- 17:02 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90433 and previous config saved to /var/cache/conftool/dbconfig/20260411-170233-fceratto.json
- 17:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2248.codfw.wmnet with reason: Maintenance
- 17:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90432 and previous config saved to /var/cache/conftool/dbconfig/20260411-170138-fceratto.json
- 16:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90431 and previous config saved to /var/cache/conftool/dbconfig/20260411-165232-ladsgroup.json
- 16:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P90430 and previous config saved to /var/cache/conftool/dbconfig/20260411-165049-fceratto.json
- 16:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P90429 and previous config saved to /var/cache/conftool/dbconfig/20260411-164000-fceratto.json
- 16:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90428 and previous config saved to /var/cache/conftool/dbconfig/20260411-162912-fceratto.json
- 14:40 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90427 and previous config saved to /var/cache/conftool/dbconfig/20260411-144002-fceratto.json
- 14:39 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2247.codfw.wmnet with reason: Maintenance
- 14:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90426 and previous config saved to /var/cache/conftool/dbconfig/20260411-143854-fceratto.json
- 14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P90425 and previous config saved to /var/cache/conftool/dbconfig/20260411-142805-fceratto.json
- 14:17 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P90424 and previous config saved to /var/cache/conftool/dbconfig/20260411-141717-fceratto.json
- 14:06 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90423 and previous config saved to /var/cache/conftool/dbconfig/20260411-140628-fceratto.json
- 12:43 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
- 12:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90422 and previous config saved to /var/cache/conftool/dbconfig/20260411-124244-ladsgroup.json
- 12:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P90421 and previous config saved to /var/cache/conftool/dbconfig/20260411-123235-ladsgroup.json
- 12:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P90420 and previous config saved to /var/cache/conftool/dbconfig/20260411-122226-ladsgroup.json
- 12:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90419 and previous config saved to /var/cache/conftool/dbconfig/20260411-121410-fceratto.json
- 12:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2246.codfw.wmnet with reason: Maintenance
- 12:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90418 and previous config saved to /var/cache/conftool/dbconfig/20260411-121302-fceratto.json
- 12:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90417 and previous config saved to /var/cache/conftool/dbconfig/20260411-121218-ladsgroup.json
- 12:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P90416 and previous config saved to /var/cache/conftool/dbconfig/20260411-120214-fceratto.json
- 11:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P90415 and previous config saved to /var/cache/conftool/dbconfig/20260411-115126-fceratto.json
- 11:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90414 and previous config saved to /var/cache/conftool/dbconfig/20260411-114037-fceratto.json
- 09:52 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90413 and previous config saved to /var/cache/conftool/dbconfig/20260411-095220-fceratto.json
- 09:51 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2245.codfw.wmnet with reason: Maintenance
- 09:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90412 and previous config saved to /var/cache/conftool/dbconfig/20260411-095113-fceratto.json
- 09:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P90411 and previous config saved to /var/cache/conftool/dbconfig/20260411-094024-fceratto.json
- 09:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P90410 and previous config saved to /var/cache/conftool/dbconfig/20260411-092936-fceratto.json
- 09:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90409 and previous config saved to /var/cache/conftool/dbconfig/20260411-091847-fceratto.json
- 07:36 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90408 and previous config saved to /var/cache/conftool/dbconfig/20260411-073627-fceratto.json
- 07:35 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2240.codfw.wmnet with reason: Maintenance
- 06:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 06:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90407 and previous config saved to /var/cache/conftool/dbconfig/20260411-060126-fceratto.json
- 05:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P90406 and previous config saved to /var/cache/conftool/dbconfig/20260411-055038-fceratto.json
- 05:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P90405 and previous config saved to /var/cache/conftool/dbconfig/20260411-053950-fceratto.json
- 05:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90404 and previous config saved to /var/cache/conftool/dbconfig/20260411-052901-fceratto.json
- 03:45 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90403 and previous config saved to /var/cache/conftool/dbconfig/20260411-034549-fceratto.json
- 03:45 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2237.codfw.wmnet with reason: Maintenance
- 03:44 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T419635)', diff saved to https://phabricator.wikimedia.org/P90402 and previous config saved to /var/cache/conftool/dbconfig/20260411-034441-fceratto.json
- 03:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90401 and previous config saved to /var/cache/conftool/dbconfig/20260411-033701-ladsgroup.json
- 03:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
- 03:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90400 and previous config saved to /var/cache/conftool/dbconfig/20260411-033636-ladsgroup.json
- 03:33 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P90399 and previous config saved to /var/cache/conftool/dbconfig/20260411-033352-fceratto.json
- 03:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90398 and previous config saved to /var/cache/conftool/dbconfig/20260411-032628-ladsgroup.json
- 03:23 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P90397 and previous config saved to /var/cache/conftool/dbconfig/20260411-032304-fceratto.json
- 03:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90396 and previous config saved to /var/cache/conftool/dbconfig/20260411-031620-ladsgroup.json
- 03:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T419635)', diff saved to https://phabricator.wikimedia.org/P90395 and previous config saved to /var/cache/conftool/dbconfig/20260411-031216-fceratto.json
- 03:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90394 and previous config saved to /var/cache/conftool/dbconfig/20260411-030611-ladsgroup.json
- 01:31 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2236 (T419635)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-013151-fceratto.json
- 01:31 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2236.codfw.wmnet with reason: Maintenance
- 01:30 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90393 and previous config saved to /var/cache/conftool/dbconfig/20260411-013040-fceratto.json
- 01:19 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-011948-fceratto.json
- 01:09 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260411-010859-fceratto.json
- 00:58 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90392 and previous config saved to /var/cache/conftool/dbconfig/20260411-005811-fceratto.json
- 00:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:03 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 00:01 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
2026-04-10
- 23:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1006.eqiad.wmnet with OS bookworm
- 23:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
- 23:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
- 23:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1005.eqiad.wmnet with OS bookworm
- 23:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
- 23:13 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90391 and previous config saved to /var/cache/conftool/dbconfig/20260410-231337-fceratto.json
- 23:12 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2219.codfw.wmnet with reason: Maintenance
- 23:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90390 and previous config saved to /var/cache/conftool/dbconfig/20260410-231231-fceratto.json
- 23:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1006.eqiad.wmnet with OS bookworm
- 23:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P90389 and previous config saved to /var/cache/conftool/dbconfig/20260410-230143-fceratto.json
- 22:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
- 22:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
- 22:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P90388 and previous config saved to /var/cache/conftool/dbconfig/20260410-225055-fceratto.json
- 22:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90387 and previous config saved to /var/cache/conftool/dbconfig/20260410-224008-fceratto.json
- 22:33 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
- 22:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
- 22:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
- 22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
- 22:30 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
- 22:28 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1006.eqiad.wmnet with OS trixie
- 22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90386 and previous config saved to /var/cache/conftool/dbconfig/20260410-222445-ladsgroup.json
- 22:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
- 22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90385 and previous config saved to /var/cache/conftool/dbconfig/20260410-222421-ladsgroup.json
- 22:17 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
- 22:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P90384 and previous config saved to /var/cache/conftool/dbconfig/20260410-221414-ladsgroup.json
- 22:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P90383 and previous config saved to /var/cache/conftool/dbconfig/20260410-220406-ladsgroup.json
- 22:02 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:58 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:57 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90382 and previous config saved to /var/cache/conftool/dbconfig/20260410-215358-ladsgroup.json
- 21:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
- 20:59 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
- 20:54 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90381 and previous config saved to /var/cache/conftool/dbconfig/20260410-205420-fceratto.json
- 20:53 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2210.codfw.wmnet with reason: Maintenance
- 20:53 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90380 and previous config saved to /var/cache/conftool/dbconfig/20260410-205324-fceratto.json
- 20:42 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P90378 and previous config saved to /var/cache/conftool/dbconfig/20260410-204236-fceratto.json
- 20:31 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P90377 and previous config saved to /var/cache/conftool/dbconfig/20260410-203147-fceratto.json
- 20:21 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90376 and previous config saved to /var/cache/conftool/dbconfig/20260410-202059-fceratto.json
- 20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:58 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:57 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:52 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:48 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 18:34 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90373 and previous config saved to /var/cache/conftool/dbconfig/20260410-183455-fceratto.json
- 18:34 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2206.codfw.wmnet with reason: Maintenance
- 18:27 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided) (duration: 00m 56s)
- 18:26 dancy@deploy1003: Started deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided)
- 17:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:41 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 17:28 dancy@deploy1003: Installation of scap version "4.248.0" completed for 2 hosts
- 17:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
- 17:26 dancy@deploy1003: Installing scap version "4.248.0" for 2 host(s)
- 17:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 17:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 17:00 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2199.codfw.wmnet with reason: Maintenance
- 16:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90372 and previous config saved to /var/cache/conftool/dbconfig/20260410-165951-fceratto.json
- 16:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
- 16:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P90371 and previous config saved to /var/cache/conftool/dbconfig/20260410-164902-fceratto.json
- 16:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: T421398
- 16:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P90370 and previous config saved to /var/cache/conftool/dbconfig/20260410-163814-fceratto.json
- 16:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90368 and previous config saved to /var/cache/conftool/dbconfig/20260410-162726-fceratto.json
- 16:05 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox
- 15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
- 15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
- 15:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
- 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 15:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
- 15:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
- 15:17 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
- 15:16 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
- 14:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:30 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 14:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:23 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90367 and previous config saved to /var/cache/conftool/dbconfig/20260410-142308-fceratto.json
- 14:22 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
- 14:22 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90366 and previous config saved to /var/cache/conftool/dbconfig/20260410-142200-fceratto.json
- 14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:11 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P90365 and previous config saved to /var/cache/conftool/dbconfig/20260410-141112-fceratto.json
- 14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:00 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P90363 and previous config saved to /var/cache/conftool/dbconfig/20260410-140023-fceratto.json
- 13:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90362 and previous config saved to /var/cache/conftool/dbconfig/20260410-134935-fceratto.json
- 13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90358 and previous config saved to /var/cache/conftool/dbconfig/20260410-132215-ladsgroup.json
- 13:22 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
- 13:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90357 and previous config saved to /var/cache/conftool/dbconfig/20260410-132119-ladsgroup.json
- 13:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
- 13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 13:08 cmooney@dns2005: END - running authdns-update
- 13:07 cmooney@dns2005: START - running authdns-update
- 13:06 cmooney@dns2005: START - running authdns-update
- 13:05 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox
- 13:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
- 13:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
- 12:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:52 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 12:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 12:47 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:50 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90351 and previous config saved to /var/cache/conftool/dbconfig/20260410-115015-fceratto.json
- 11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
- 11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90350 and previous config saved to /var/cache/conftool/dbconfig/20260410-114919-fceratto.json
- 11:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P90349 and previous config saved to /var/cache/conftool/dbconfig/20260410-113830-fceratto.json
- 11:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P90348 and previous config saved to /var/cache/conftool/dbconfig/20260410-112742-fceratto.json
- 11:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:16 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90347 and previous config saved to /var/cache/conftool/dbconfig/20260410-111654-fceratto.json
- 11:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T422668
- 11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:19 vgutierrez: upload haproxy 2.8.20 to thirdparty/haproxy28 for bookworm-wikimedia (apt.wm.o) - T422926
- 10:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 10:07 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 10:03 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T422668
- 09:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:35 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: sync
- 09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: sync
- 09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/jaeger: sync
- 09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/jaeger: sync
- 09:22 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 09:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 09:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 09:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 09:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 09:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 09:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 09:17 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90346 and previous config saved to /var/cache/conftool/dbconfig/20260410-091713-fceratto.json
- 09:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:16 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
- 09:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 09:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 08:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
- 07:30 jelto@dns1004: END - running authdns-update
- 07:29 jelto@dns1004: START - running authdns-update
- 07:09 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
- 06:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
- 05:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
- 01:26 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 01:25 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 01:23 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 01:23 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 00:57 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 00:57 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 00:54 zabe@deploy1003: Finished scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914) (duration: 05m 51s)
- 00:50 zabe@deploy1003: zabe: Continuing with sync
- 00:50 zabe@deploy1003: zabe: Backport for Stop setting specific virtual domain for link tables (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:48 zabe@deploy1003: Started scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914)
- 00:46 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file tables on enwiki (T416548) (duration: 06m 11s)
- 00:43 zabe@deploy1003: zabe: Continuing with sync
- 00:42 zabe@deploy1003: zabe: Backport for Start reading from new file tables on enwiki (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:40 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables on enwiki (T416548)
- 00:29 zabe: marked 425 content rows as bad # T393237
- 00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2005.codfw.wmnet with OS bookworm
- 00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 00:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 00:08 zabe@deploy1003: Finished scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) (duration: 07m 17s)
- 00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2006.codfw.wmnet with OS bookworm
- 00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
- 00:04 zabe@deploy1003: zabe: Continuing with sync
- 00:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
- 00:02 zabe@deploy1003: zabe: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:00 zabe@deploy1003: Started scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914)
2026-04-09
- 23:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
- 23:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
- 23:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
- 23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2006.codfw.wmnet with OS bookworm
- 23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2005.codfw.wmnet with OS bookworm
- 23:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['apus-be2005']
- 23:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['apus-be2005']
- 23:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 23:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:25 cscott@deploy1003: Finished scap sync-world: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879) (duration: 06m 52s)
- 21:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:22 cscott@deploy1003: cscott: Continuing with sync
- 21:21 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
- 21:21 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
- 21:20 cscott@deploy1003: cscott: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:19 cscott@deploy1003: Started scap sync-world: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879)
- 21:11 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:50 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab2003
- 20:50 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host phab2003
- 20:50 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-be2006
- 20:45 inflatador: reprepro --noskipold --component thirdparty/opensearch2 update trixie-wikimedia T422860
- 20:45 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-be2006
- 20:45 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-be2005
- 20:39 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 20:37 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
- 20:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-be2005
- 20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be2005-6 and phab2003 to codfw - jhancock@cumin2002"
- 20:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be2005-6 and phab2003 to codfw - jhancock@cumin2002"
- 20:28 aude@deploy1003: Finished scap sync-world: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942) (duration: 10m 05s)
- 20:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 20:25 aude@deploy1003: aude: Continuing with sync
- 20:20 aude@deploy1003: aude: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:18 aude@deploy1003: Started scap sync-world: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942)
- 20:17 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0) rebalance blocks across compactor instances (patch id: 1265429)
- 20:17 aude@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524) (duration: 09m 04s)
- 20:15 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart rebalance blocks across compactor instances (patch id: 1265429)
- 20:13 aude@deploy1003: cscott, jhsoby, aude: Continuing with sync
- 20:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon2006-dev.codfw.wmnet
- 20:09 aude@deploy1003: cscott, jhsoby, aude: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:07 aude@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524)
- 20:04 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon2006-dev.codfw.wmnet
- 20:03 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0) cookbook test (patch id: 1260650)
- 20:00 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart cookbook test (patch id: 1260650)
- 19:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:18 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:16 inflatador: bking@apt1002 sudo -E reprepro -C thirdparty/opensearch2 copy trixie-wikimedia bookworm-wikimedia opensearch T422860
- 19:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 19:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
- 19:09 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
- 19:09 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 19:04 inflatador: bking@apt1002 delete old haproxy pkgs P90343
- 19:02 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:58 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:58 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^5 "Use envoy for swift inside mediawiki" (duration: 05m 53s)
- 18:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 18:52 ladsgroup@deploy1003: ladsgroup: Backport for Revert^5 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:50 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^5 "Use envoy for swift inside mediawiki"
- 18:49 dancy@deploy1003: Installation of scap version "4.247.0" completed for 2 hosts
- 18:49 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:49 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:47 dancy@deploy1003: Installing scap version "4.247.0" for 2 host(s)
- 18:46 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:46 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:40 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:40 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:37 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:37 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:29 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon2005-dev.codfw.wmnet
- 18:17 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon2005-dev.codfw.wmnet
- 18:13 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.23 refs T420481
- 18:04 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:04 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:03 dancy@deploy1003: Installation of scap version "4.246.0" completed for 2 hosts
- 18:02 dancy@deploy1003: Installing scap version "4.246.0" for 2 host(s)
- 17:52 dzahn@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
- 17:52 dzahn@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
- 17:52 dzahn@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
- 17:51 dzahn@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
- 17:51 dzahn@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
- 17:51 dzahn@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
- 17:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^4 "Use envoy for swift inside mediawiki" (duration: 06m 49s)
- 17:39 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 17:38 ladsgroup@deploy1003: ladsgroup: Backport for Revert^4 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:36 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^4 "Use envoy for swift inside mediawiki"
- 17:35 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
- 17:35 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
- 17:25 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
- 17:24 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
- 17:24 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
- 17:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
- 17:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T422668
- 17:09 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
- 17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
- 17:09 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
- 17:08 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
- 17:08 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
- 17:07 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
- 17:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^3 "Use envoy for swift inside mediawiki" (duration: 06m 11s)
- 17:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T422668
- 16:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 16:59 ladsgroup@deploy1003: ladsgroup: Backport for Revert^3 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:57 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^3 "Use envoy for swift inside mediawiki"
- 16:56 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T422668
- 16:51 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872)
- 16:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872) (duration: 07m 02s)
- 16:46 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T422668
- 16:44 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 16:43 ladsgroup@deploy1003: ladsgroup: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:41 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872)
- 16:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host moss-be1002.eqiad.wmnet with OS bookworm
- 15:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
- 15:46 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 15:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
- 15:44 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 15:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 15:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 15:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 15:36 cgoubert@deploy1003: Finished scap sync-world: swift service proxy configuration changes (duration: 05m 45s)
- 15:31 cgoubert@deploy1003: Started scap sync-world: swift service proxy configuration changes
- 15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host moss-be1002
- 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host moss-be1002
- 15:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host moss-be1002
- 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) moss-be1002.eqiad.wmnet 79.32.64.10.in-addr.arpa 9.7.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache moss-be1002.eqiad.wmnet 79.32.64.10.in-addr.arpa 9.7.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host moss-be1002 - mvernon@cumin2002"
- 15:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host moss-be1002 - mvernon@cumin2002"
- 15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:20 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 15:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host moss-be1002
- 15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host moss-be1002.eqiad.wmnet with OS bookworm
- 15:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 15:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 15:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 14:59 dancy@deploy1003: Installation of scap version "4.245.0" completed for 2 hosts
- 14:58 dancy@deploy1003: Installing scap version "4.245.0" for 2 host(s)
- 14:53 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
- 14:47 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
- 14:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 14:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 14:39 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
- 14:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 14:37 Emperor: ceph orch host drain moss-be1002 --zap-osd-devices T421719
- 14:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-fe1003.eqiad.wmnet with OS bookworm
- 14:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 14:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:23 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:23 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 14:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 14:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-fe1003.eqiad.wmnet with reason: host reimage
- 14:06 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055 (second attempt)
- 14:04 aude@deploy1003: Finished scap sync-world: Backport for Enable reading list beta feature for pilot wikis (T420878) (duration: 08m 40s)
- 14:01 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-fe1003.eqiad.wmnet with reason: host reimage
- 14:00 aude@deploy1003: bwang, aude: Continuing with sync
- 13:57 aude@deploy1003: bwang, aude: Backport for Enable reading list beta feature for pilot wikis (T420878) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:55 aude@deploy1003: Started scap sync-world: Backport for Enable reading list beta feature for pilot wikis (T420878)
- 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 13:52 hashar@deploy1003: Finished scap sync-world: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), [[gerrit:1269334|Fix BackfillInterwikiRightsLog wrt. cyclic renames (T605
- 13:48 hashar@deploy1003: mszwarc, hashar: Continuing with sync
- 13:46 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 13:45 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host apus-fe1003
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-fe1003
- 13:44 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-fe1003
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) apus-fe1003.eqiad.wmnet 102.32.64.10.in-addr.arpa 2.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:44 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache apus-fe1003.eqiad.wmnet 102.32.64.10.in-addr.arpa 2.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host apus-fe1003 - mvernon@cumin2002"
- 13:44 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host apus-fe1003 - mvernon@cumin2002"
- 13:44 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:44 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:40 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 13:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host apus-fe1003
- 13:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host apus-fe1003.eqiad.wmnet with OS bookworm
- 13:38 hashar@deploy1003: mszwarc, hashar: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055) sync
- 13:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 13:36 hashar@deploy1003: Started scap sync-world: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), [[gerrit:1269334|Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055
- 13:33 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 13:33 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
- 13:31 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw
- 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on P{thanos-fe[1004-1006].eqiad.wmnet} and (A:thanos-fe or A:thanos-fe-codfw or A:thanos-fe-eqiad)
- 13:29 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on P{thanos-fe[1004-1006].eqiad.wmnet} and (A:thanos-fe or A:thanos-fe-codfw or A:thanos-fe-eqiad)
- 13:28 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
- 13:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-fe1007.eqiad.wmnet with OS bullseye
- 13:19 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:15 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 13:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/aux-eqiad: maintenance
- 13:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/aux-eqiad: maintenance
- 13:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 13:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1007.eqiad.wmnet with reason: host reimage
- 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.wipe-cluster (exit_code=0) Wipe the K8s cluster aux-eqiad: Kubernetes upgrade
- 13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 13:07 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:07 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: sync
- 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:07 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: sync
- 13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
- 13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/sophroid: sync
- 13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/sophroid: sync
- 13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: sync
- 13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: sync
- 13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
- 13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
- 13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/jaeger: sync
- 13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/jaeger: sync
- 13:04 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:03 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 13:02 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1007.eqiad.wmnet with reason: host reimage
- 13:01 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:01 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
- 12:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 12:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 12:59 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
- 12:53 hashar: Directly pushed GrowthExperiments wmf/1.46.0-wmf.22 patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/1269351 due to a chicken-and-egg issue on that branch
- 12:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host thanos-fe1007
- 12:47 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host thanos-fe1007
- 12:46 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host thanos-fe1007
- 12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1007.eqiad.wmnet 186.48.64.10.in-addr.arpa 6.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:46 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1007.eqiad.wmnet 186.48.64.10.in-addr.arpa 6.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host thanos-fe1007 - mvernon@cumin2002"
- 12:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host thanos-fe1007 - mvernon@cumin2002"
- 12:46 elukey@cumin1003: START - Cookbook sre.k8s.wipe-cluster Wipe the K8s cluster aux-eqiad: Kubernetes upgrade
- 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:42 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 12:42 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/aux-eqiad: maintenance
- 12:42 mvernon@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
- 12:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:42 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 12:42 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/aux-eqiad: maintenance
- 12:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
- 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:38 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:35 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host thanos-fe1007
- 12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1007.eqiad.wmnet with OS bullseye
- 12:33 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1019,1021-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 12:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:27 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:24 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1019,1021-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 12:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1020.eqiad.wmnet with OS bullseye
- 12:18 moritzm: restarting Postfix on mx-in to pick up OpenSSL updates
- 12:13 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
- 12:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
- 12:08 moritzm: restarting Postfix on mx-out to pick up OpenSSL updates
- 12:07 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
- 12:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host apus-be1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host apus-be1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 12:05 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:05 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be1005 to eqiad - jclark@cumin1003"
- 12:05 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be1005 to eqiad - jclark@cumin1003"
- 12:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1020.eqiad.wmnet with reason: host reimage
- 12:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 12:00 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1020.eqiad.wmnet with reason: host reimage
- 11:51 moritzm: installing nginx security updates
- 11:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1020
- 11:44 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1020
- 11:31 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1020
- 11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1020.eqiad.wmnet 113.48.64.10.in-addr.arpa 3.1.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:31 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1020.eqiad.wmnet 113.48.64.10.in-addr.arpa 3.1.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1020 - mvernon@cumin2002"
- 11:30 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1020 - mvernon@cumin2002"
- 11:29 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
- 11:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:27 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
- 11:27 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 11:26 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1020
- 11:25 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1020.eqiad.wmnet with OS bullseye
- 11:16 moritzm: installing tiff security updates
- 11:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1018,1020-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 10:55 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1018,1020-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 10:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1019.eqiad.wmnet with OS bullseye
- 10:47 moritzm: installing openssl security updates
- 10:45 moritzm: upgrade debdeploy-server on cumin2002 to 0.0.99.14-1+deb12u1+exp2 (temporary build with Cumin 6 compat before we have Cumin 6 universally)
- 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1019.eqiad.wmnet with reason: host reimage
- 10:29 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1019.eqiad.wmnet with reason: host reimage
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1019
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1019
- 10:13 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1019
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1019.eqiad.wmnet 92.32.64.10.in-addr.arpa 2.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:13 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1019.eqiad.wmnet 92.32.64.10.in-addr.arpa 2.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1019 - mvernon@cumin2002"
- 10:13 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1019 - mvernon@cumin2002"
- 10:08 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 10:08 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1019
- 10:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams - 3.2 upgrade (T421402)
- 10:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1019.eqiad.wmnet with OS bullseye
- 10:04 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams - 3.2 upgrade (T421402)
- 09:58 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: Pooling in
- 09:33 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1012,1014-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 09:25 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1012,1014-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 09:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1013.eqiad.wmnet with OS bullseye
- 09:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams - 3.2 upgrade (T421402)
- 09:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams - 3.2 upgrade (T421402)
- 09:15 elukey: remove /var/run/confd-template/_var_lib_gdnsd_discovery-k8s-ingress-aux-rw.state.err on affected dns servers and restart confd
- 09:12 elukey: remove /var/run/confd-template/_var_lib_gdnsd_discovery-k8s-ingress-aux-rw.state.err on dns5004 and restart confd
- 09:11 fabfur: upgrading esams to haproxy 3.2 (T421402)
- 09:10 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 09:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 09:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1013.eqiad.wmnet with reason: host reimage
- 08:59 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad - 3.2 upgrade (T421402)
- 08:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1013.eqiad.wmnet with reason: host reimage
- 08:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2157: Pooling in
- 08:56 oblivian@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-aux-rw,name=codfw
- 08:51 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad - 3.2 upgrade (T421402)
- 08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1013
- 08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1013
- 08:41 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1013
- 08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1013.eqiad.wmnet 149.48.64.10.in-addr.arpa 9.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:41 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1013.eqiad.wmnet 149.48.64.10.in-addr.arpa 9.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1013 - mvernon@cumin2002"
- 08:41 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1013 - mvernon@cumin2002"
- 08:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90334 and previous config saved to /var/cache/conftool/dbconfig/20260409-082633-fceratto.json
- 08:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
- 08:23 elukey@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-cluster (exit_code=93) pool all services in codfw/aux-codfw: maintenance
- 08:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 08:21 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1013
- 08:21 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/aux-codfw: maintenance
- 08:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1013.eqiad.wmnet with OS bullseye
- 08:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad - 3.2 upgrade (T421402)
- 08:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad - 3.2 upgrade (T421402)
- 08:11 fabfur: upgrading eqiad to haproxy 3.2 (T421402)
- 07:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1016: After reimage
- 07:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 07:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 07:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1016: After reimage
- 07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1016.eqiad.wmnet with OS trixie
- 07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2016.codfw.wmnet with reason: Maintenance
- 06:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1016.eqiad.wmnet with reason: host reimage
- 06:55 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1016.eqiad.wmnet with reason: host reimage
- 06:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1016.eqiad.wmnet with OS trixie
- 06:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1016.eqiad.wmnet with OS trixie
- 06:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1016.eqiad.wmnet with OS trixie
- 05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2016.codfw.wmnet with OS trixie
- 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2016.codfw.wmnet with reason: host reimage
- 05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2016.codfw.wmnet with reason: host reimage
- 05:13 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2016.codfw.wmnet with OS trixie
- 05:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc2016.codfw.wmnet,pc1016.eqiad.wmnet with reason: Reimage to Debian Trixie
- 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1016: Reimage
- 05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1016: Reimage
- 02:31 kevinbazira@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 02:09 kevinbazira@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 11s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:57 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548) (duration: 07m 40s)
- 00:53 zabe@deploy1003: zabe: Continuing with sync
- 00:51 zabe@deploy1003: zabe: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:49 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548)
- 00:22 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1024.eqiad.wmnet
- 00:22 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1024.eqiad.wmnet
2026-04-08
- 22:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert "Use envoy for swift inside mediawiki" (duration: 06m 54s)
- 22:00 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 21:59 ladsgroup@deploy1003: ladsgroup: Backport for Revert "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:57 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert "Use envoy for swift inside mediawiki"
- 21:46 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:45 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:45 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:27 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 21:17 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 21:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for Use envoy for swift inside mediawiki (T328872) (duration: 06m 27s)
- 21:00 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 21:00 ladsgroup@deploy1003: ladsgroup: Backport for Use envoy for swift inside mediawiki (T328872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:58 ladsgroup@deploy1003: Started scap sync-world: Backport for Use envoy for swift inside mediawiki (T328872)
- 20:40 jdrewniak@deploy1003: Finished scap sync-world: Backport for Bumping portals to master (T128546) (duration: 06m 14s)
- 20:36 jdrewniak@deploy1003: jdrewniak: Continuing with sync
- 20:35 jdrewniak@deploy1003: jdrewniak: Backport for Bumping portals to master (T128546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:34 jdrewniak@deploy1003: Started scap sync-world: Backport for Bumping portals to master (T128546)
- 20:24 jdrewniak@deploy1003: jdrewniak: Continuing with sync
- 20:23 jdrewniak@deploy1003: jdrewniak: Backport for Bumping portals to master (T128546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:21 jdrewniak@deploy1003: Started scap sync-world: Backport for Bumping portals to master (T128546)
- 20:17 toyofuku@deploy1003: Finished scap sync-world: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548) (duration: 09m 27s)
- 20:13 toyofuku@deploy1003: jdrewniak, toyofuku: Continuing with sync
- 20:09 toyofuku@deploy1003: jdrewniak, toyofuku: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:07 toyofuku@deploy1003: Started scap sync-world: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548)
- 19:35 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
- 19:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
- 19:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1103.eqiad.wmnet with OS bullseye
- 19:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
- 19:01 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1024.eqiad.wmnet with reason: Bootstrapping — T412830
- 18:57 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
- 18:56 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
- 18:55 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
- 18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
- 18:49 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
- 18:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1103
- 18:33 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1103
- 18:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1103.eqiad.wmnet with OS bullseye
- 18:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1103.eqiad.wmnet with OS bullseye
- 18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.23 refs T420481
- 18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
- 18:00 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
- 17:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1088.eqiad.wmnet with OS bullseye
- 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1103
- 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1103
- 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1103
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1103.eqiad.wmnet 43.48.64.10.in-addr.arpa 3.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1103.eqiad.wmnet 43.48.64.10.in-addr.arpa 3.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1103 - bking@cumin2002"
- 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1103 - bking@cumin2002"
- 17:39 bking@cumin2002: START - Cookbook sre.dns.netbox
- 17:37 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1103
- 17:36 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1103.eqiad.wmnet with OS bullseye
- 17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cirrussearch1089.eqiad.wmnet with OS bullseye
- 17:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1089.eqiad.wmnet with OS bullseye
- 17:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1088.eqiad.wmnet with reason: host reimage
- 17:23 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1088.eqiad.wmnet with reason: host reimage
- 17:08 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2002.codfw.wmnet with OS trixie
- 17:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1088
- 17:07 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1088
- 17:06 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1088
- 17:06 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1088.eqiad.wmnet 176.32.64.10.in-addr.arpa 6.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:06 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1088.eqiad.wmnet 176.32.64.10.in-addr.arpa 6.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:06 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:06 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1088 - bking@cumin2002"
- 17:06 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1088 - bking@cumin2002"
- 17:02 bking@cumin2002: START - Cookbook sre.dns.netbox
- 17:01 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1088
- 17:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1088.eqiad.wmnet with OS bullseye
- 16:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw - 3.2 upgrade (T421402)
- 16:19 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw - 3.2 upgrade (T421402)
- 16:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1087.eqiad.wmnet with OS bullseye
- 16:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1081.eqiad.wmnet with OS bullseye
- 15:52 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1023.eqiad.wmnet
- 15:52 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1023.eqiad.wmnet
- 15:52 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS trixie
- 15:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage
- 15:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage
- 15:42 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw - 3.2 upgrade (T421402)
- 15:42 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw - 3.2 upgrade (T421402)
- 15:41 fabfur: upgrading codfw to haproxy 3.2 (T421402)
- 15:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin - 3.2 upgrade (T421402)
- 15:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage
- 15:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin - 3.2 upgrade (T421402)
- 15:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon2004-dev.codfw.wmnet
- 15:28 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1087
- 15:28 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1087
- 15:27 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1087
- 15:27 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1087.eqiad.wmnet 174.32.64.10.in-addr.arpa 4.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:27 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1087.eqiad.wmnet 174.32.64.10.in-addr.arpa 4.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:27 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:27 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1087 - bking@cumin2002"
- 15:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage
- 15:26 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1087 - bking@cumin2002"
- 15:26 andrew@cumin2002: START - Cookbook sre.dns.netbox
- 15:20 bking@cumin2002: START - Cookbook sre.dns.netbox
- 15:20 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1087
- 15:19 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1087.eqiad.wmnet with OS bullseye
- 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:17 sukhe: sukhe@lvs1020:~$ sudo systemctl restart pybal.service
- 15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:16 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon2004-dev.codfw.wmnet
- 15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 15:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1081
- 15:11 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1081
- 15:10 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1081
- 15:10 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1081.eqiad.wmnet 166.32.64.10.in-addr.arpa 6.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:10 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1081.eqiad.wmnet 166.32.64.10.in-addr.arpa 6.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 15:10 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:10 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1081 - bking@cumin2002"
- 15:10 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1081 - bking@cumin2002"
- 15:06 bking@cumin2002: START - Cookbook sre.dns.netbox
- 15:05 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1081
- 15:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1081.eqiad.wmnet with OS bullseye
- 15:00 derick@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=zhwiki --logwiki=metawiki 'Mr Kazi Tuhin' KaziHasanTuhin # T422677
- 14:58 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 14:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:57 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 14:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 14:48 taavi: serve dumps rsync traffic via new LVS service T422040
- 14:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 14:42 taavi@dns1004: END - running authdns-update
- 14:41 taavi@dns1004: START - running authdns-update
- 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
- 14:37 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin - 3.2 upgrade (T421402)
- 14:37 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin - 3.2 upgrade (T421402)
- 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
- 14:32 fabfur: upgrading eqsin to haproxy 3.2 (T421402)
- 14:19 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:18 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:18 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:17 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:17 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:16 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:11 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:11 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:11 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:10 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:09 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:08 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:07 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:07 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 14:04 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 14:03 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 13:45 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 13:42 Lucas_WMDE: UTC afternoon backport+config window done
- 13:41 phuedx@deploy1003: Finished scap sync-world: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112) (duration: 07m 58s)
- 13:37 phuedx@deploy1003: phuedx: Continuing with sync
- 13:35 phuedx@deploy1003: phuedx: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:33 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=eqiad
- 13:33 phuedx@deploy1003: Started scap sync-world: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112)
- 13:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
- 13:28 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for cswiki: lift IP cap for workshop (T422520) (duration: 06m 22s)
- 13:26 moritzm: upgrade debdeploy-server on cumin2002 to 0.0.99.14-1+deb12u1+exp1 (temporary build with Cumin 6 compat before we have Cumin 6 universally)
- 13:25 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Continuing with sync
- 13:24 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Backport for cswiki: lift IP cap for workshop (T422520) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:22 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for cswiki: lift IP cap for workshop (T422520)
- 13:15 cscott@deploy1003: Finished scap sync-world: Backport for Turn on Parsoid Read Views for eswiki (T422524) (duration: 07m 06s)
- 13:11 cscott@deploy1003: cscott: Continuing with sync
- 13:10 cscott@deploy1003: cscott: Backport for Turn on Parsoid Read Views for eswiki (T422524) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 13:08 cscott@deploy1003: Started scap sync-world: Backport for Turn on Parsoid Read Views for eswiki (T422524)
- 13:04 taavi: restarting pybal on lvs1018
- 12:49 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:43 taavi: restarting pybal on lvs1020
- 12:40 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
- 12:32 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 12:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 12:29 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
- 12:28 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
- 12:28 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
- 12:27 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp2001.codfw.wmnet
- 12:27 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
- 12:27 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp2001.codfw.wmnet
- 12:15 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055
- 12:13 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp1001.eqiad.wmnet
- 12:07 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp1001.eqiad.wmnet
- 11:53 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:44 kart_: machinetranslation: Remove networkpolicies for people* (T335491)
- 11:43 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
- 11:43 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
- 11:42 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
- 11:42 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
- 11:42 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-drmrs - 3.2.15 upgrade (T421402)
- 11:41 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
- 11:41 kartik@deploy1003: helmfile [staging] START helmfile.d/services/machinetranslation: apply
- 11:38 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-magru - 3.2.15 upgrade (T421402)
- 11:35 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 11:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-ulsfo - 3.2.15 upgrade (T421402)
- 11:15 moritzm: installing dpkg security updates
- 11:11 moritzm: installing Tomcat security updates
- 11:11 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:01 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1010,1012-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 10:52 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1010,1012-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
- 10:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1011.eqiad.wmnet with OS bullseye
- 10:48 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
- 10:42 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
- 10:42 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-web: apply
- 10:42 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
- 10:41 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-web: apply
- 10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1011.eqiad.wmnet with reason: host reimage
- 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1011.eqiad.wmnet with reason: host reimage
- 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:14 hnowlan@deploy1003: Finished deploy [restbase/deploy@dcc15be]: Add urwikisource T415975 (duration: 01m 31s)
- 10:12 hnowlan@deploy1003: Started deploy [restbase/deploy@dcc15be]: Add urwikisource T415975
- 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
- 10:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1011
- 10:11 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1011
- 10:08 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1011
- 10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1011.eqiad.wmnet 182.32.64.10.in-addr.arpa 2.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:08 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1011.eqiad.wmnet 182.32.64.10.in-addr.arpa 2.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1011 - mvernon@cumin2002"
- 10:08 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1011 - mvernon@cumin2002"
- 10:03 mvernon@cumin2002: START - Cookbook sre.dns.netbox
- 10:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1011
- 10:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1011.eqiad.wmnet with OS bullseye
- 09:58 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
- 09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-ulsfo - 3.2.15 upgrade (T421402)
- 09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-drmrs - 3.2.15 upgrade (T421402)
- 09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-magru - 3.2.15 upgrade (T421402)
- 09:54 fabfur: upgrading haproxy to version 3.2.15 on magru,drmrs,ulsfo (T421402)
- 09:41 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
- 09:00 taavi: remove unused cloud-vrf clouddumps cr firewall rule https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1268516
- 08:53 taavi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddumps1001.wikimedia.org
- 08:53 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
- 08:52 ayounsi@dns1004: END - running authdns-update
- 08:51 ayounsi@dns1004: START - running authdns-update
- 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/aux-codfw: maintenance
- 08:47 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/aux-codfw: maintenance
- 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.wipe-cluster (exit_code=0) Wipe the K8s cluster aux-codfw: Kubernetes upgrade
- 08:40 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: sync
- 08:39 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: sync
- 08:33 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/sophroid: sync
- 08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/sophroid: sync
- 08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
- 08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
- 08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
- 08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
- 08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: sync
- 08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: sync
- 08:24 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
- 08:22 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
- 08:20 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
- 08:19 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
- 08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
- 08:04 elukey@cumin1003: START - Cookbook sre.k8s.wipe-cluster Wipe the K8s cluster aux-codfw: Kubernetes upgrade
- 08:03 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
- 08:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
- 07:48 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
- 07:48 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
- 07:46 krinkle@deploy1003: Finished scap sync-world: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338) (duration: 09m 34s)
- 07:41 krinkle@deploy1003: krinkle: Continuing with sync
- 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin routed ganeti IPs - ayounsi@cumin1003"
- 07:40 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin routed ganeti IPs - ayounsi@cumin1003"
- 07:38 krinkle@deploy1003: krinkle: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:36 krinkle@deploy1003: Started scap sync-world: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338)
- 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 07:33 wmde-fisch@deploy1003: Finished scap sync-world: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770) (duration: 06m 54s)
- 07:29 wmde-fisch@deploy1003: wmde-fisch, anzx: Continuing with sync
- 07:28 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
- 07:28 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
- 07:28 wmde-fisch@deploy1003: wmde-fisch, anzx: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:26 wmde-fisch@deploy1003: Started scap sync-world: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770)
- 07:19 moritzm: installing openssl security updates
- 07:15 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Enable sub-references on Czech and Italian wiki (T420938) (duration: 08m 44s)
- 07:11 wmde-fisch@deploy1003: wmde-fisch: Continuing with sync
- 07:08 wmde-fisch@deploy1003: wmde-fisch: Backport for Enable sub-references on Czech and Italian wiki (T420938) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:06 wmde-fisch@deploy1003: Started scap sync-world: Backport for Enable sub-references on Czech and Italian wiki (T420938)
- 05:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: After reimage
- 05:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1152: After reimage
- 05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1152.eqiad.wmnet with OS trixie
- 05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1152.eqiad.wmnet with reason: host reimage
- 05:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1152.eqiad.wmnet with reason: host reimage
- 05:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1152.eqiad.wmnet with OS trixie
- 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Reimage
- 05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Reimage
- 05:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Maintenance
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-07
- 22:01 cscott@deploy1003: Finished scap sync-world: Backport for Actually enable parsoid postproc for all wikis (except enwiki) (duration: 08m 05s)
- 21:57 cscott@deploy1003: cscott, ihurbain: Continuing with sync
- 21:55 cscott@deploy1003: cscott, ihurbain: Backport for Actually enable parsoid postproc for all wikis (except enwiki) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:53 cscott@deploy1003: Started scap sync-world: Backport for Actually enable parsoid postproc for all wikis (except enwiki)
- 21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1083.eqiad.wmnet with OS bullseye
- 21:50 cscott@deploy1003: Finished scap sync-world: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183) (duration: 07m 40s)
- 21:46 cscott@deploy1003: ihurbain, cscott: Continuing with sync
- 21:45 cscott@deploy1003: ihurbain, cscott: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:43 cscott@deploy1003: Started scap sync-world: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183)
- 21:39 cscott@deploy1003: Finished scap sync-world: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[gerrit:1268
- 21:35 cscott@deploy1003: matmarex, sfaci, cscott, kgraessle: Continuing with sync
- 21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1083.eqiad.wmnet with reason: host reimage
- 21:33 cscott@deploy1003: matmarex, sfaci, cscott, kgraessle: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[g
- 21:31 cscott@deploy1003: Started scap sync-world: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[gerrit:12686
- 21:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1083.eqiad.wmnet with reason: host reimage
- 21:17 cscott@deploy1003: Finished scap sync-world: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config missing or
- 21:13 cscott@deploy1003: cscott, kgraessle, sfaci, matmarex: Continuing with sync
- 21:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1083
- 21:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1083
- 21:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1083
- 21:11 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1083.eqiad.wmnet 168.32.64.10.in-addr.arpa 8.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:11 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1083.eqiad.wmnet 168.32.64.10.in-addr.arpa 8.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 21:11 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 21:11 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1083 - bking@cumin2002"
- 21:11 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1083 - bking@cumin2002"
- 21:10 cscott@deploy1003: cscott, kgraessle, sfaci, matmarex: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config
- 21:09 cscott@deploy1003: Started scap sync-world: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config missing or
- 21:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:05 ryankemper: [WDQS] codfw is getting slammed hard enough that hosts are falling immediately back into deadlock post-restart and largely failing to report metrics. not much we can do atm, there will be some noise
- 21:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 21:01 bking@cumin2002: START - Cookbook sre.dns.netbox
- 21:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1083
- 21:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1083.eqiad.wmnet with OS bullseye
- 20:57 cscott@deploy1003: Finished scap sync-world: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294) (duration: 13m 27s)
- 20:51 cscott@deploy1003: cscott, pppery, kineticpelagic: Continuing with sync
- 20:48 cscott@deploy1003: cscott, pppery, kineticpelagic: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:47 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:44 cscott@deploy1003: Started scap sync-world: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294)
- 20:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:40 reedy@deploy1003: Finished scap sync-world: Backport for Undeploy Extension:StopForumSpam (T422185) (duration: 31m 17s)
- 20:40 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 20:33 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1023.eqiad.wmnet with reason: Bootstrapping — T412830
- 20:30 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:30 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs hosts - eevans@cumin1003"
- 20:30 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs hosts - eevans@cumin1003"
- 20:28 reedy@deploy1003: reedy: Continuing with sync
- 20:28 reedy@deploy1003: reedy: Backport for Undeploy Extension:StopForumSpam (T422185) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:26 eevans@cumin1003: START - Cookbook sre.dns.netbox
- 20:19 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 20:19 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1024 - eevans@cumin1003"
- 20:19 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1024 - eevans@cumin1003"
- 20:14 eevans@cumin1003: START - Cookbook sre.dns.netbox
- 20:09 reedy@deploy1003: Started scap sync-world: Backport for Undeploy Extension:StopForumSpam (T422185)
- 20:07 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aqs1023.eqiad.wmnet
- 20:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1082.eqiad.wmnet with OS bullseye
- 20:02 eevans@cumin1003: START - Cookbook sre.hosts.reboot-single for host aqs1023.eqiad.wmnet
- 19:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1082.eqiad.wmnet with reason: host reimage
- 19:38 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1082.eqiad.wmnet with reason: host reimage
- 19:32 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:32 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1023 - eevans@cumin1003"
- 19:32 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1023 - eevans@cumin1003"
- 19:27 eevans@cumin1003: START - Cookbook sre.dns.netbox
- 19:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1080.eqiad.wmnet with OS bullseye
- 19:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1082
- 19:22 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1082
- 19:17 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1082
- 19:17 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1082.eqiad.wmnet 167.32.64.10.in-addr.arpa 7.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 19:17 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1082.eqiad.wmnet 167.32.64.10.in-addr.arpa 7.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 19:16 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:16 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1082 - bking@cumin2002"
- 19:16 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1082 - bking@cumin2002"
- 19:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage
- 19:08 bking@cumin2002: START - Cookbook sre.dns.netbox
- 19:08 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1082
- 19:07 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1082.eqiad.wmnet with OS bullseye
- 19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage
- 18:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1080
- 18:49 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1080
- 18:47 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1080
- 18:47 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1080.eqiad.wmnet 29.32.64.10.in-addr.arpa 9.2.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:47 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1080.eqiad.wmnet 29.32.64.10.in-addr.arpa 9.2.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:47 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:47 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1080 - bking@cumin2002"
- 18:47 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1080 - bking@cumin2002"
- 18:44 bking@cumin2002: START - Cookbook sre.dns.netbox
- 18:44 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1080
- 18:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1080.eqiad.wmnet with OS bullseye
- 18:24 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.23 refs T420481
- 18:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for ClientHints: Don't collect header only on null edit (T418989) (duration: 12m 14s)
- 18:07 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
- 18:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:05 dreamyjazz@deploy1003: dreamyjazz: Backport for ClientHints: Don't collect header only on null edit (T418989) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 18:01 dreamyjazz@deploy1003: Started scap sync-world: Backport for ClientHints: Don't collect header only on null edit (T418989)
- 16:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 16:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 16:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:22 Lucas_WMDE: UTC afternoon backport+config window (belatedly) done
- 16:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), [[gerrit:1268585|GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T4222
- 16:07 dreamyjazz@deploy1003: stran, dreamyjazz: Continuing with sync
- 16:03 dreamyjazz@deploy1003: stran, dreamyjazz: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T422220),
- 15:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:45 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker1273.eqiad.wmnet
- 15:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1273.eqiad.wmnet
- 15:45 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1273.eqiad.wmnet
- 15:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), [[gerrit:1268585|GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T42222
- 15:44 sukhe@dns1004: END - running authdns-update
- 15:42 sukhe@dns1004: START - running authdns-update
- 15:31 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
- 15:31 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
- 15:30 moritzm: installing postgresql-15 security updates
- 15:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance over, T416450]
- 15:28 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance over, T416450]
- 15:25 claime: homer lsw1-d1-eqiad* commit
- 15:24 claime: homer cr*eqiad* commit
- 15:24 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1273.eqiad.wmnet with OS bookworm
- 15:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 15:20 Emperor: restart swift object/container replication services on ms-be1069
- 15:20 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 15:14 XioNoX: cr1-esams - re-enabling external peers
- 15:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 15:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 15:04 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
- 14:57 cgoubert@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
- 14:57 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 14:55 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1347.eqiad.wmnet
- 14:54 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1347.eqiad.wmnet
- 14:36 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
- 14:34 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1273
- 14:34 cgoubert@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1273
- 14:30 XioNoX: re0.cr1-esams> request chassis routing-engine master switch
- 14:30 cgoubert@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1273
- 14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1273.eqiad.wmnet 128.48.64.10.in-addr.arpa 8.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:30 cgoubert@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1273.eqiad.wmnet 128.48.64.10.in-addr.arpa 8.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1273 - cgoubert@cumin1003"
- 14:30 cgoubert@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1273 - cgoubert@cumin1003"
- 14:25 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
- 14:25 cgoubert@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1273
- 14:24 cgoubert@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1273.eqiad.wmnet with OS bookworm
- 14:24 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1273.eqiad.wmnet
- 14:23 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1273.eqiad.wmnet
- 14:23 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
- 14:16 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 14:03 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1273.eqiad.wmnet
- 14:03 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
- 14:01 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1273.eqiad.wmnet
- 14:01 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
- 13:58 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 13:58 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 13:58 jmm@dns1004: END - running authdns-update
- 13:57 jmm@dns1004: START - running authdns-update
- 13:56 jmm@dns1004: END - running authdns-update
- 13:54 jmm@dns1004: START - running authdns-update
- 13:53 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 13:53 jmm@dns1004: END - running authdns-update
- 13:51 jmm@dns1004: START - running authdns-update
- 13:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
- 13:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:41 volans: installed cumin v6.0.0 on cumin2002
- 13:40 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 13:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:31 XioNoX: reboot cr1-esams
- 13:30 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5007.eqsin.wmnet with OS bookworm
- 13:19 taavi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddumps1002.wikimedia.org
- 13:19 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddumps1001.wikimedia.org
- 13:19 taavi@cumin1003: conftool action : set/weight=100; selector: cluster=dumps
- 13:11 XioNoX: re0.cr1-esams> request chassis routing-engine master switch
- 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5007.eqsin.wmnet with reason: host reimage
- 12:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5007.eqsin.wmnet with reason: host reimage
- 12:39 XioNoX: re1.cr1-esams> request chassis routing-engine master switch - that will cause router's short unavailability - T416450
- 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5007.eqsin.wmnet with OS bookworm
- 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti5007.eqsin.wmnet
- 12:15 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti5007.eqsin.wmnet
- 12:11 XioNoX: re0.cr1-esams> request chassis routing-engine master switch - that will cause router's short unavailability - T416450
- 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti5007.eqsin.wmnet
- 12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
- 12:04 XioNoX: reboot re1.cr1-esams (backup RE) for upgrade - T416450
- 12:03 ayounsi@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=esams [reason: esams network maintenance]
- 12:01 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
- 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
- 11:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance, T416450]
- 11:36 XioNoX: depool esams for network maintenance - T416450
- 11:36 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance, T416450]
- 11:31 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti5007.eqsin.wmnet
- 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
- 10:11 topranks: shift inter-site traffic from existing 10G to new 100G transport circuit between eqiad<->codfw T395878
- 08:52 Amir1: tightening the rate limit for non-standard thumbnails (T402792 T414805)
- 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
- 08:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
- 08:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
- 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti5007.eqsin.wmnet
- 08:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
- 08:25 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 42
- 08:22 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 42
- 08:18 XioNoX: update pfw1-eqiad NAT - T422380
- 08:05 hashar: Moved Debian Glue jobs to Jenkins agents running Bookworm (integration-agent-pkgbuilder-1005 and integration-agent-pkgbuilder-1006) | T421114
- 08:00 marostegui: Upgrade clouddb1017 to mariadb 10.11.16 (v3) T420177
- 07:59 XioNoX: push pfw policies - T422204
- 07:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Maintenance
- 07:54 hashar: Moved `operations-puppet-tests-bullseye` job from a Jenkins agent running Bullseye to one running Bookworm. The image is still on Bullseye! | T421114
- 07:44 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: after upgrade
- 07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1159: after upgrade
- 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2142: Upgrade package
- 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:32 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2142: Upgrade package
- 06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2248: Upgrade package
- 06:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2157: after upgrade
- 06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1159: after upgrade
- 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2248: Upgrade package
- 06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2249: Upgrade package
- 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: after upgrade
- 06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 06:02 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 06:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: after upgrade
- 06:01 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2249: Upgrade package
- 06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1169: Upgrade package
- 05:45 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1169: Upgrade package
- 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: Upgrade package
- 05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: Upgrade package
- 05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2142: Upgrade package
- 05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:36 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 05:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2142: Upgrade package
- 05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2248: Upgrade package
- 05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2248: Upgrade package
- 05:33 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2248: Upgrade package
- 05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2248: Upgrade package
- 05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249.codfw.wmnet: Upgrade package
- 05:31 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249.codfw.wmnet: Upgrade package
- 05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249: Upgrade package
- 05:31 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249: Upgrade package
- 05:30 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249: Upgrade package
- 05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249: Upgrade package
- 05:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2142,2248-2249].codfw.wmnet,db1169.eqiad.wmnet with reason: Upgrade to 10.11.16.v3
- 05:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2142,2248-2249].codfw.wmnet with reason: Upgrade to 10.11.16.v3
- 05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2011.codfw.wmnet,pc1011.eqiad.wmnet with reason: Upgrade to 10.11.16.v3
- 05:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 05:20 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.20 (duration: 02m 27s)
- 03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.46.0-wmf.23 refs T420481 (duration: 35m 55s)
- 03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.46.0-wmf.23 refs T420481
- 00:10 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from the new file tables on more large wikis (T416548) (duration: 06m 22s)
- 00:05 zabe@deploy1003: zabe: Continuing with sync
- 00:05 zabe@deploy1003: zabe: Backport for Start reading from the new file tables on more large wikis (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:03 zabe@deploy1003: Started scap sync-world: Backport for Start reading from the new file tables on more large wikis (T416548)
- 00:02 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner1003.eqiad.wmnet with OS bookworm
2026-04-06
- 23:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage
- 23:40 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage
- 23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner1003
- 23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner1003
- 23:25 mutante: gitlab: reimaging trusted runners with --move-vlan parameter, which changed their IPs - verified they were showing up as online after the change and using the new IPs (T421717)
- 23:25 dzahn@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner1003
- 23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner1003.eqiad.wmnet 184.32.64.10.in-addr.arpa 4.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 23:25 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache gitlab-runner1003.eqiad.wmnet 184.32.64.10.in-addr.arpa 4.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1003 - dzahn@cumin2002"
- 23:24 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1003 - dzahn@cumin2002"
- 23:18 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 23:12 dzahn@cumin2002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner1003
- 23:12 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab-runner1003.eqiad.wmnet with OS bookworm
- 22:56 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 22:18 sbassett@deploy1003: Finished scap sync-world: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320) (duration: 06m 18s)
- 22:14 sbassett@deploy1003: sbassett: Continuing with sync
- 22:13 sbassett@deploy1003: sbassett: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:12 sbassett@deploy1003: Started scap sync-world: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320)
- 21:26 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 21:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 21:15 dancy@deploy1003: Installation of scap version "4.244.0" completed for 2 hosts
- 21:13 dancy@deploy1003: Installing scap version "4.244.0" for 2 host(s)
- 21:06 urbanecm: Unlocking mw-experimental@eqiad
- 21:00 urbanecm: Locking mw-experimental@eqiad
- 20:55 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 20:54 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 20:54 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 20:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 20:50 urbanecm@deploy1003: Finished scap sync-world: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297) (duration: 06m 30s)
- 20:46 urbanecm@deploy1003: urbanecm: Continuing with sync
- 20:45 urbanecm@deploy1003: urbanecm: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:44 urbanecm@deploy1003: Started scap sync-world: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297)
- 20:28 kemayo@deploy1003: Finished scap sync-world: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123) (duration: 07m 07s)
- 20:24 kemayo@deploy1003: kemayo: Continuing with sync
- 20:23 kemayo@deploy1003: kemayo: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:21 kemayo@deploy1003: Started scap sync-world: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123)
- 20:18 kemayo@deploy1003: Finished scap sync-world: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275) (duration: 10m 56s)
- 20:11 kemayo@deploy1003: kemayo, aude: Continuing with sync
- 20:08 kemayo@deploy1003: kemayo, aude: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:07 kemayo@deploy1003: Started scap sync-world: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275)
- 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 19:52 ryankemper: [wdqs] Restarted `wmf_auto_restart_prometheus-blazegraph-exporter-wdqs-blazegraph.service` on `wdqs1012` to clear systemdunitfailed alert
- 19:32 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5004.eqsin.wmnet} and A:liberica
- 19:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner1004.eqiad.wmnet with OS bookworm
- 19:28 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5004.eqsin.wmnet} and A:liberica
- 19:20 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
- 19:19 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
- 19:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
- 19:16 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
- 19:10 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
- 19:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner1004.eqiad.wmnet with reason: host reimage
- 19:09 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
- 19:05 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner1004.eqiad.wmnet with reason: host reimage
- 18:58 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5005.eqsin.wmnet} and A:liberica
- 18:55 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5005.eqsin.wmnet} and A:liberica
- 18:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner1004
- 18:50 dzahn@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner1004
- 18:49 dzahn@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner1004
- 18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner1004.eqiad.wmnet 141.48.64.10.in-addr.arpa 1.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:49 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache gitlab-runner1004.eqiad.wmnet 141.48.64.10.in-addr.arpa 1.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1004 - dzahn@cumin2002"
- 18:49 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1004 - dzahn@cumin2002"
- 18:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
- 18:40 dzahn@cumin2002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner1004
- 18:40 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab-runner1004.eqiad.wmnet with OS bookworm
- 18:39 mutante: gitlab-runner1004 - reimaging with --move-vlan T421717
- 18:37 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 18:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
- 18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90287 and previous config saved to /var/cache/conftool/dbconfig/20260406-180118-fceratto.json
- 17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P90286 and previous config saved to /var/cache/conftool/dbconfig/20260406-175111-fceratto.json
- 17:42 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp7009.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P90285 and previous config saved to /var/cache/conftool/dbconfig/20260406-174104-fceratto.json
- 17:37 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp7009.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:34 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp7001.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90284 and previous config saved to /var/cache/conftool/dbconfig/20260406-173056-fceratto.json
- 17:29 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp7001.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
- 17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90283 and previous config saved to /var/cache/conftool/dbconfig/20260406-172055-fceratto.json
- 17:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
- 17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90282 and previous config saved to /var/cache/conftool/dbconfig/20260406-172030-fceratto.json
- 17:16 brett: import trafficserver 9.2.13-1wm1 into trixie-wikimedia - T422328
- 17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P90281 and previous config saved to /var/cache/conftool/dbconfig/20260406-171021-fceratto.json
- 17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P90280 and previous config saved to /var/cache/conftool/dbconfig/20260406-170013-fceratto.json
- 16:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90279 and previous config saved to /var/cache/conftool/dbconfig/20260406-165005-fceratto.json
- 16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90278 and previous config saved to /var/cache/conftool/dbconfig/20260406-164323-fceratto.json
- 16:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Maintenance
- 16:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
- 16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90277 and previous config saved to /var/cache/conftool/dbconfig/20260406-164257-fceratto.json
- 16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P90276 and previous config saved to /var/cache/conftool/dbconfig/20260406-163249-fceratto.json
- 16:32 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 16:31 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P90275 and previous config saved to /var/cache/conftool/dbconfig/20260406-162241-fceratto.json
- 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90274 and previous config saved to /var/cache/conftool/dbconfig/20260406-161232-fceratto.json
- 16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90273 and previous config saved to /var/cache/conftool/dbconfig/20260406-160615-fceratto.json
- 16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
- 16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90272 and previous config saved to /var/cache/conftool/dbconfig/20260406-160551-fceratto.json
- 15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P90271 and previous config saved to /var/cache/conftool/dbconfig/20260406-155542-fceratto.json
- 15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P90270 and previous config saved to /var/cache/conftool/dbconfig/20260406-154534-fceratto.json
- 15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90269 and previous config saved to /var/cache/conftool/dbconfig/20260406-153526-fceratto.json
- 15:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90268 and previous config saved to /var/cache/conftool/dbconfig/20260406-152908-fceratto.json
- 15:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
- 15:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
- 15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90267 and previous config saved to /var/cache/conftool/dbconfig/20260406-152409-fceratto.json
- 15:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P90266 and previous config saved to /var/cache/conftool/dbconfig/20260406-151401-fceratto.json
- 15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P90265 and previous config saved to /var/cache/conftool/dbconfig/20260406-150353-fceratto.json
- 14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90264 and previous config saved to /var/cache/conftool/dbconfig/20260406-145344-fceratto.json
- 14:53 taavi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:53 taavi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate lvs vip for dumps-lb.eqiad - taavi@cumin1003"
- 14:53 taavi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate lvs vip for dumps-lb.eqiad - taavi@cumin1003"
- 14:49 taavi@cumin1003: START - Cookbook sre.dns.netbox
- 14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90263 and previous config saved to /var/cache/conftool/dbconfig/20260406-144734-fceratto.json
- 14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
- 14:40 vgutierrez@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp[6001,6009].*} and A:cp - 3.2.15 upgrade (T421402)
- 14:28 vgutierrez@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp[6001,6009].*} and A:cp - 3.2.15 upgrade (T421402)
- 14:27 vgutierrez: fetch haproxy 3.2.15 on thirdparty/haproxy32 (trixie-wikimedia) - T421402
- 14:26 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
- 14:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
- 13:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Maintenance
- 13:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
- 12:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 12:13 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2015.codfw.wmnet with OS trixie
- 12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1015.eqiad.wmnet with OS trixie
- 11:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
- 11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
- 11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
- 11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
- 11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
- 11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
- 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
- 11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
- 11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
- 10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
- 09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
- 09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
- 09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
- 09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154) (duration: 31m 47s)
- 09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
- 09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
- 09:15 urbanecm@deploy1003: urbanecm: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154) synced to the testservers (see https://wikitech.wikimedia
- 09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
- 09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
- 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
- 09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
- 09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
- 09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
- 08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
- 08:55 urbanecm@deploy1003: Started scap sync-world: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154)
- 08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
- 08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599) (duration: 10m 50s)
- 08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
- 08:37 urbanecm@deploy1003: urbanecm: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599)
- 08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415) (duration: 31m 54s)
- 08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
- 07:59 kgraessle@deploy1003: kgraessle: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:43 kgraessle@deploy1003: Started scap sync-world: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)
- 05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
- 02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-05
- 02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-04
- 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
2026-04-03
- 23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: T421398
- 23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: T421398
- 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
- 18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
- 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
- 18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
- 18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
- 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
- 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
- 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
- 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
- 15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
- 15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
- 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
- 15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
- 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
- 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
- 15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
- 14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
- 14:52 sbassett: Deployed security mitigation for T422244
- 14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
- 13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
- 13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
- 13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
- 13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
- 13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
- 13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
- 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
- 12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
- 12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
- 10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
- 10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
- 10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
- 10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
- 10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
- 09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
- 09:54 brouberol@dns1004: END - running authdns-update
- 09:52 brouberol@dns1004: START - running authdns-update
- 09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
- 09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
- 09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
- 09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
- 09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
- 08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
- 08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
- 08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
- 08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
- 08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # T422062
- 00:58 zabe@deploy1003: Finished scap sync-world: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062) (duration: 06m 50s)
- 00:53 zabe@deploy1003: zabe: Continuing with sync
- 00:53 zabe@deploy1003: zabe: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:51 zabe@deploy1003: Started scap sync-world: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)
2026-04-02
- 23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 23:41 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file table in dewiki and fawiki (T416548) (duration: 06m 10s)
- 23:37 zabe@deploy1003: zabe: Continuing with sync
- 23:37 zabe@deploy1003: zabe: Backport for Start reading from new file table in dewiki and fawiki (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 23:35 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file table in dewiki and fawiki (T416548)
- 23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for Fix section heading spacing on mobile (T414882) (duration: 07m 33s)
- 22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
- 22:00 jdlrobson@deploy1003: jdlrobson: Backport for Fix section heading spacing on mobile (T414882) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for Fix section heading spacing on mobile (T414882)
- 21:32 kemayo@deploy1003: Finished scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise (duration: 06m 18s)
- 21:28 kemayo@deploy1003: kemayo: Continuing with sync
- 21:28 kemayo@deploy1003: kemayo: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:26 kemayo@deploy1003: Started scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise
- 21:18 kemayo@deploy1003: kemayo: Continuing with sync
- 21:17 kemayo@deploy1003: kemayo: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:15 kemayo@deploy1003: Started scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise
- 21:03 kemayo@deploy1003: Finished scap sync-world: Backport for Add logged-in reader retention instrument (T420490) (duration: 11m 40s)
- 20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
- 20:53 kemayo@deploy1003: annet, kemayo: Backport for Add logged-in reader retention instrument (T420490) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:52 kemayo@deploy1003: Started scap sync-world: Backport for Add logged-in reader retention instrument (T420490)
- 20:37 kemayo@deploy1003: Finished scap sync-world: Backport for zhwikinews: 20th anniversary logo change (T420165) (duration: 11m 46s)
- 20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
- 20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for zhwikinews: 20th anniversary logo change (T420165) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:25 kemayo@deploy1003: Started scap sync-world: Backport for zhwikinews: 20th anniversary logo change (T420165)
- 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
- 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: T418109
- 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 18:56 cmooney@dns2005: END - running authdns-update
- 18:55 cmooney@dns2005: START - running authdns-update
- 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
- 18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
- 18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
- 18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5006.eqsin.wmnet} and A:liberica
- 18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5006.eqsin.wmnet} and A:liberica
- 18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 17:18 swfrench@dns1004: END - running authdns-update
- 17:16 swfrench@dns1004: START - running authdns-update
- 17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3008.esams.wmnet} and A:liberica
- 17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3008.esams.wmnet} and A:liberica
- 16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3009.esams.wmnet} and A:liberica
- 16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3009.esams.wmnet} and A:liberica
- 16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
- 16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
- 16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
- 16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143 (duration: 29m 56s)
- 15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
- 15:51 swfrench@deploy1003: swfrench: Continuing with sync
- 15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
- 15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
- 15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143
- 15:32 moritzm: installing freetype security updates
- 15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - T422166
- 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up 1267062, 1266985 - T422143 (duration: 26m 48s)
- 15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - T422166
- 15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 15:23 papaul: maintenance complete on mr1-eqiad
- 15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:11 moritzm: installing apache2 security updates
- 15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
- 15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
- 15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
- 15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 - T422143
- 14:59 papaul: ongoing maintenance on mr1-eqiad
- 14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
- 14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
- 14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up 1267062, 1266985 - T422143
- 14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
- 14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
- 14:42 moritzm: installing libxml-parser-perl security updates
- 14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
- 14:28 moritzm: installing pyasn1 security updates
- 14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
- 14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for Bump maxConnCount
- 14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
- 14:09 esanders@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
- 14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # T421114
- 14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
- 14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
- 13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
- 13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
- 13:58 esanders@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
- 13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
- 13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
- 13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
- 13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
- 13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
- 13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
- 13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - T414486
- 13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
- 13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
- 13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
- 13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - T414486
- 13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, T414486]
- 13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, T414486]
- 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
- 12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
- 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
- 12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
- 12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
- 12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
- 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
- 12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
- 12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
- 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
- 12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
- 12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
- 12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
- 12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
- 12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
- 12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
- 12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
- 12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
- 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
- 12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
- 12:13 volans@dns1004: END - running authdns-update
- 12:11 volans@dns1004: START - running authdns-update
- 12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
- 12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
- 12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
- 12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
- 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
- 12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
- 11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
- 11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
- 11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
- 11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
- 11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
- 11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
- 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
- 10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
- 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
- 10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
- 10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
- 10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
- 10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
- 10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
- 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
- 10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
- 10:19 moritzm: installing freetype security updates
- 10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
- 10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
- 10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
- 10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
- 10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
- 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
- 10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
- 10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
- 10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
- 10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
- 10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
- 10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
- 10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
- 10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
- 10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
- 10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
- 10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
- 10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
- 09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
- 09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
- 09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
- 09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
- 09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
- 09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
- 09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
- 09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
- 09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
- 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
- 09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
- 09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
- 09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
- 09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
- 09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 T419637 T410975
- 09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
- 08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 08:49 moritzm: added Atsuko to the cn=ops LDAP group T421860
- 08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
- 08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
- 08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
- 08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
- 08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
- 08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
- 08:42 XioNoX: reboot mr1-esams - T416450
- 08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
- 08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs T420480
- 08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for Disable external link analysis (T419837) (duration: 10m 13s)
- 07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
- 07:55 jmm@dns1004: END - running authdns-update
- 07:54 jmm@dns1004: START - running authdns-update
- 07:52 mszwarc@deploy1003: mszwarc: Backport for Disable external link analysis (T419837) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:50 mszwarc@deploy1003: Started scap sync-world: Backport for Disable external link analysis (T419837)
- 07:47 jnuche@deploy1003: Finished scap sync-world: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027) (duration: 06m 39s)
- 07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, T421714) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
- 07:43 jnuche@deploy1003: jnuche: Continuing with sync
- 07:43 jnuche@deploy1003: jnuche: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:41 jnuche@deploy1003: Started scap sync-world: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027)
- 07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
- 07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
- 07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
- 07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
- 07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
- 07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
- 07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892) (duration: 07m 00s)
- 07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
- 07:07 gkyziridis@deploy1003: gkyziridis: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
- 00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
2026-04-01
- 23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
- 22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - T368096
- 22:48 swfrench-wmf: removed unused image-suggestion service in codfw - T368096
- 22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for Legal Footer Link Deploys (T420348) (duration: 08m 25s)
- 22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
- 22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for Legal Footer Link Deploys (T420348) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for Legal Footer Link Deploys (T420348)
- 22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
- 22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709) (duration: 06m 37s)
- 22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
- 22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 22:29 ladsgroup@deploy1003: ladsgroup: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709)
- 22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
- 22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
- 22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
- 22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
- 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
- 22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
- 21:42 swfrench@deploy1003: Finished scap sync-world: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074) (duration: 07m 15s)
- 21:38 swfrench@deploy1003: swfrench: Continuing with sync
- 21:36 swfrench@deploy1003: swfrench: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 21:35 swfrench@deploy1003: Started scap sync-world: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074)
- 21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
- 21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
- 21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
- 21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
- 21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
- 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010.esams.wmnet} and A:liberica
- 20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010.esams.wmnet} and A:liberica
- 20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
- 20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
- 20:13 cjming@deploy1003: Finished scap sync-world: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366) (duration: 08m 47s)
- 20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
- 20:06 cjming@deploy1003: mmartorana, cjming: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 20:04 cjming@deploy1003: Started scap sync-world: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)
- 20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
- 20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
- 19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
- 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
- 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
- 19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
- 19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
- 19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
- 18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
- 18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
- 18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
- 18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
- 18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
- 18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
- 18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
- 18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
- 18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
- 18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
- 17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805) (duration: 08m 18s)
- 17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 17:50 ladsgroup@deploy1003: ladsgroup: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805)
- 17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
- 17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
- 17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
- 17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
- 17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83] (duration: 01m 53s)
- 17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83]
- 17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83] (duration: 04m 15s)
- 17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83]
- 17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
- 17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
- 17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
- 17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
- 17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 TEST [analytics/refinery@fa28ad83]
- 17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
- 17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
- 17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
- 17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
- 17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - T368096 (duration: 07m 25s)
- 17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - T368096
- 17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test (T421402)
- 16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678) (duration: 11m 30s)
- 16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
- 16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)
- 16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test (T421402)
- 16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147) (duration: 09m 31s)
- 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
- 16:21 urbanecm@deploy1003: urbanecm: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 16:19 urbanecm@deploy1003: Started scap sync-world: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)
- 16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
- 16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
- 16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
- 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
- 16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
- 16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
- 15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade (T421402)
- 15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade (T421402)
- 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
- 15:13 jforrester@deploy1003: Finished scap sync-world: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807) (duration: 12m 53s)
- 15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
- 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
- 15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
- 15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
- 15:09 jforrester@deploy1003: jforrester: Continuing with sync
- 15:03 jforrester@deploy1003: jforrester: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 15:01 jforrester@deploy1003: Started scap sync-world: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)
- 15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
- 14:59 taavi@dns1004: END - running authdns-update
- 14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
- 14:57 taavi@dns1004: START - running authdns-update
- 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
- 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
- 14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
- 14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
- 14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
- 14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
- 14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
- 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
- 14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
- 14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade (T421402)
- 14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade (T421402)
- 14:44 fabfur: upgrading ulsfo to haproxy 3.2 (T421402)
- 14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
- 14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
- 14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
- 14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
- 14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
- 14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:16 jforrester@deploy1003: Finished scap sync-world: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581) (duration: 08m 14s)
- 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
- 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
- 14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
- 14:12 jforrester@deploy1003: jforrester: Continuing with sync
- 14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
- 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade (T421402)
- 14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade (T421402)
- 14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:10 jforrester@deploy1003: jforrester: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
- 14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:08 jforrester@deploy1003: Started scap sync-world: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581)
- 14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
- 14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
- 14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
- 14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
- 14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
- 14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
- 14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
- 14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
- 14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
- 14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
- 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
- 14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
- 14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
- 14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
- 14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
- 14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
- 14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
- 13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
- 13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
- 13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
- 13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
- 13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
- 13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
- 13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
- 13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
- 13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
- 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
- 13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
- 13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
- 13:21 fabfur: upgrading magru to haproxy 3.2 (T421402)
- 13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade (T421402)
- 13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade (T421402)
- 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
- 13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
- 13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
- 13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs T420480
- 13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
- 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
- 12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
- 12:56 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678) (duration: 09m 21s)
- 12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
- 12:52 kharlan@deploy1003: kharlan: Continuing with sync
- 12:49 kharlan@deploy1003: kharlan: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
- 12:47 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)
- 12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
- 12:33 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062) (duration: 07m 34s)
- 12:29 kharlan@deploy1003: kharlan: Continuing with sync
- 12:28 kharlan@deploy1003: kharlan: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 12:26 kharlan@deploy1003: Started scap sync-world: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)
- 12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
- 12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
- 12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
- 12:17 kart_: Updated cxserver to 2026-03-25-072715-production
- 12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
- 12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
- 12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
- 12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
- 12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
- 12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
- 12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
- 12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
- 12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
- 12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
- 11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
- 11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 T419637 T410975
- 11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
- 11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
- 11:33 moritzm: installing tomcat10 security updates
- 11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
- 11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
- 11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
- 11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
- 11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
- 11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
- 11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
- 11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
- 11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
- 11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
- 10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
- 10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
- 10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
- 10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
- 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
- 10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
- 10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
- 10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
- 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
- 10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
- 10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
- 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
- 10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
- 10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
- 10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
- 10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
- 10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
- 10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
- 10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
- 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
- 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
- 10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
- 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
- 10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
- 10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
- 10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
- 10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
- 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
- 10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
- 10:13 jmm@dns1004: END - running authdns-update
- 10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
- 10:11 jmm@dns1004: START - running authdns-update
- 10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
- 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
- 10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
- 10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
- 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
- 09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
- 09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
- 09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
- 09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
- 09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
- 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
- 09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
- 09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
- 09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
- 09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
- 09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
- 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
- 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
- 09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
- 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
- 09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
- 09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
- 09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
- 09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
- 09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
- 09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
- 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
- 09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
- 09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
- 09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
- 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
- 09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
- 09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
- 09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
- 08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin (T406724)
- 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
- 08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
- 08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
- 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
- 08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
- 08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
- 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:44 moritzm: installing Apache security updates
- 08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
- 08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
- 08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
- 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
- 08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
- 08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
- 08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
- 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
- 08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
- 08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
- 08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
- 08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs T420480
- 08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
- 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
- 08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
- 08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
- 08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
- 08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 T419637 T410975
- 08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
- 08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
- 07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
- 07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
- 07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
- 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
- 07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
- 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
- 07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
- 07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
- 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
- 07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
- 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
- 07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
- 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
- 07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
- 07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:26 moritzm: installing postgresql security updates
- 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
- 07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
- 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
- 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
- 06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
- 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
- 06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
- 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
- 06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
- 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
- 06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
- 06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
- 06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
- 06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
- 06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
- 05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist T421353
- 05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis T420093
- 05:26 marostegui: Drop global_block_whitelist on closed wikis T420525
- 02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
- 02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
- 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
- 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
- 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) (duration: 08m 35s)
- 00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
- 00:55 ladsgroup@deploy1003: ladsgroup: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)
- 00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) (duration: 12m 40s)
- 00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 00:29 ladsgroup@deploy1003: ladsgroup: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) synced to the testservers (see https://wikitech.wiki
- 00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)
- 00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914) (duration: 06m 50s)
- 00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
- 00:03 ladsgroup@deploy1003: ladsgroup: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
- 00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914)