Server Admin Log

2026-04-25

01:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P91522 and previous config saved to /var/cache/conftool/dbconfig/20260425-015535-ladsgroup.json
01:45 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P91521 and previous config saved to /var/cache/conftool/dbconfig/20260425-014528-ladsgroup.json
01:35 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T410589)', diff saved to https://phabricator.wikimedia.org/P91520 and previous config saved to /var/cache/conftool/dbconfig/20260425-013520-ladsgroup.json

2026-04-24

20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1005.eqiad.wmnet with OS trixie
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1002.eqiad.wmnet with OS trixie
20:23 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:19 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1004.eqiad.wmnet with OS trixie
20:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:12 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1006.eqiad.wmnet with OS trixie
20:12 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:09 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1005.eqiad.wmnet with reason: host reimage
20:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1001.eqiad.wmnet with OS trixie
20:07 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1002.eqiad.wmnet with reason: host reimage
20:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid-internal1003.eqiad.wmnet with OS trixie
20:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:03 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1005.eqiad.wmnet with reason: host reimage
20:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
19:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1004.eqiad.wmnet with reason: host reimage
19:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1006.eqiad.wmnet with reason: host reimage
19:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
19:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
19:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
19:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
19:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1005.eqiad.wmnet with OS trixie
19:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1001.eqiad.wmnet with reason: host reimage
19:49 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1006.eqiad.wmnet with reason: host reimage
19:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid-internal1003.eqiad.wmnet with reason: host reimage
19:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:40 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:40 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1002.eqiad.wmnet with reason: host reimage
19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1004.eqiad.wmnet with reason: host reimage
19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1001.eqiad.wmnet with reason: host reimage
19:38 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid-internal1003.eqiad.wmnet with reason: host reimage
19:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1006.eqiad.wmnet with OS trixie
19:37 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
19:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
19:36 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1004.eqiad.wmnet with OS trixie
19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1003.eqiad.wmnet with OS trixie
19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1002.eqiad.wmnet with OS trixie
19:27 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host druid-internal1001.eqiad.wmnet with OS trixie
19:25 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:24 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:20 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host druid-internal1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:18 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
19:17 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
19:16 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:16 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:16 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:13 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1002.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:12 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1004.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:11 jclark@cumin1003: START - Cookbook sre.hosts.provision for host druid-internal1001.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding druid-internal1001 to eqiad - jclark@cumin1003"
19:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding druid-internal1001 to eqiad - jclark@cumin1003"
19:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
18:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1006.eqiad.wmnet with OS trixie
18:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
18:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
18:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl1005.eqiad.wmnet with OS trixie
18:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
18:45 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
18:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1006.eqiad.wmnet with reason: host reimage
18:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl1005.eqiad.wmnet with reason: host reimage
18:23 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1006.eqiad.wmnet with reason: host reimage
18:23 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1005.eqiad.wmnet with reason: host reimage
18:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
18:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91517 and previous config saved to /var/cache/conftool/dbconfig/20260424-181705-fceratto.json
18:11 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1006.eqiad.wmnet with OS trixie
18:11 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl1005.eqiad.wmnet with OS trixie
18:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
18:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
18:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P91516 and previous config saved to /var/cache/conftool/dbconfig/20260424-180657-fceratto.json
18:01 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
18:01 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
18:01 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:01 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
18:01 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
17:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P91515 and previous config saved to /var/cache/conftool/dbconfig/20260424-175649-fceratto.json
17:56 jclark@cumin1003: START - Cookbook sre.dns.netbox
17:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91513 and previous config saved to /var/cache/conftool/dbconfig/20260424-174641-fceratto.json
17:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1251 (T419635)', diff saved to https://phabricator.wikimedia.org/P91512 and previous config saved to /var/cache/conftool/dbconfig/20260424-172952-fceratto.json
17:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1251.eqiad.wmnet with reason: Maintenance
17:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
17:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91511 and previous config saved to /var/cache/conftool/dbconfig/20260424-170225-fceratto.json
16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P91510 and previous config saved to /var/cache/conftool/dbconfig/20260424-165217-fceratto.json
16:51 dancy@deploy1003: Installation of scap version "4.251.0" completed for 2 hosts
16:49 dancy@deploy1003: Installing scap version "4.251.0" for 2 host(s)
16:44 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
16:44 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P91509 and previous config saved to /var/cache/conftool/dbconfig/20260424-164209-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91508 and previous config saved to /var/cache/conftool/dbconfig/20260424-163200-fceratto.json
16:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1235 (T419635)', diff saved to https://phabricator.wikimedia.org/P91507 and previous config saved to /var/cache/conftool/dbconfig/20260424-161607-fceratto.json
16:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance
16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91506 and previous config saved to /var/cache/conftool/dbconfig/20260424-161541-fceratto.json
16:14 akhatun@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
16:14 akhatun@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-feature-counts-change-enrich: apply
16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P91505 and previous config saved to /var/cache/conftool/dbconfig/20260424-160531-fceratto.json
16:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
16:00 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
15:59 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P91504 and previous config saved to /var/cache/conftool/dbconfig/20260424-155523-fceratto.json
15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91503 and previous config saved to /var/cache/conftool/dbconfig/20260424-154515-fceratto.json
15:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T410589)', diff saved to https://phabricator.wikimedia.org/P91502 and previous config saved to /var/cache/conftool/dbconfig/20260424-153827-ladsgroup.json
15:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
15:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91501 and previous config saved to /var/cache/conftool/dbconfig/20260424-153802-ladsgroup.json
15:35 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1010.eqiad.wmnet with OS trixie
15:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS trixie
15:30 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1234 (T419635)', diff saved to https://phabricator.wikimedia.org/P91500 and previous config saved to /var/cache/conftool/dbconfig/20260424-153020-fceratto.json
15:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1234.eqiad.wmnet with reason: Maintenance
15:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91499 and previous config saved to /var/cache/conftool/dbconfig/20260424-153005-fceratto.json
15:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P91498 and previous config saved to /var/cache/conftool/dbconfig/20260424-152755-ladsgroup.json
15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P91497 and previous config saved to /var/cache/conftool/dbconfig/20260424-151957-fceratto.json
15:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P91496 and previous config saved to /var/cache/conftool/dbconfig/20260424-151746-ladsgroup.json
15:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P91495 and previous config saved to /var/cache/conftool/dbconfig/20260424-150949-fceratto.json
15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91494 and previous config saved to /var/cache/conftool/dbconfig/20260424-150738-ladsgroup.json
14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91493 and previous config saved to /var/cache/conftool/dbconfig/20260424-145940-fceratto.json
14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1232 (T419635)', diff saved to https://phabricator.wikimedia.org/P91492 and previous config saved to /var/cache/conftool/dbconfig/20260424-144405-fceratto.json
14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1232.eqiad.wmnet with reason: Maintenance
14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91491 and previous config saved to /var/cache/conftool/dbconfig/20260424-144340-fceratto.json
14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P91490 and previous config saved to /var/cache/conftool/dbconfig/20260424-143332-fceratto.json
14:29 moritzm: imported debdeploy 0.0.99.15 for bullseye-wikimedia (compat release for Cumin 6)
14:29 moritzm: updating debdeploy on bullseye to 0.0.99.15
14:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P91489 and previous config saved to /var/cache/conftool/dbconfig/20260424-142323-fceratto.json
14:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
14:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91488 and previous config saved to /var/cache/conftool/dbconfig/20260424-141315-fceratto.json
14:13 jclark@cumin1003: START - Cookbook sre.dns.netbox
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2148.codfw.wmnet
14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:09 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2148.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:09 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2148.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2223: after reimage to trixie
14:05 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
14:05 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
14:05 marostegui@cumin1003: START - Cookbook sre.dns.netbox
14:01 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2148.codfw.wmnet
14:00 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1185: after reimage to trixie
14:00 moritzm: imported zookeeper 3.4.13-6+deb11u1~wmf13u1 into component/zookeeper34 for trixie-wikimedia (forward port of Zookeeper 3.4 from Bullseye to Trixie) T424266
13:59 moritzm: imported zookeeper 3.4.13-6+deb11u1~wmf13u1 into component/zookeeper34 for trixie-wikimedia (forward port of Zookeeper 3.4 from Bullseye to Trixie)
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1219 (T419635)', diff saved to https://phabricator.wikimedia.org/P91485 and previous config saved to /var/cache/conftool/dbconfig/20260424-135555-fceratto.json
13:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1219.eqiad.wmnet with reason: Maintenance
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91484 and previous config saved to /var/cache/conftool/dbconfig/20260424-135529-fceratto.json
13:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P91482 and previous config saved to /var/cache/conftool/dbconfig/20260424-134522-fceratto.json
13:41 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
13:40 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
13:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P91479 and previous config saved to /var/cache/conftool/dbconfig/20260424-133513-fceratto.json
13:33 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
13:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
13:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
13:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91476 and previous config saved to /var/cache/conftool/dbconfig/20260424-132505-fceratto.json
13:21 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2223: after reimage to trixie
13:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2223.codfw.wmnet with OS trixie
13:18 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
13:14 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1185: after reimage to trixie
13:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1185.eqiad.wmnet with OS trixie
13:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91473 and previous config saved to /var/cache/conftool/dbconfig/20260424-130840-fceratto.json
13:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1218.eqiad.wmnet with reason: Maintenance
13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91472 and previous config saved to /var/cache/conftool/dbconfig/20260424-130815-fceratto.json
12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P91471 and previous config saved to /var/cache/conftool/dbconfig/20260424-125807-fceratto.json
12:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2223.codfw.wmnet with reason: host reimage
12:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
12:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P91470 and previous config saved to /var/cache/conftool/dbconfig/20260424-124759-fceratto.json
12:45 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2223.codfw.wmnet with reason: host reimage
12:42 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91468 and previous config saved to /var/cache/conftool/dbconfig/20260424-123751-fceratto.json
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T419961)', diff saved to https://phabricator.wikimedia.org/P91467 and previous config saved to /var/cache/conftool/dbconfig/20260424-122939-fceratto.json
12:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91466 and previous config saved to /var/cache/conftool/dbconfig/20260424-122910-fceratto.json
12:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1185.eqiad.wmnet with OS trixie
12:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2223.codfw.wmnet with OS trixie
12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2223: Reimage to Trixie
12:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1185: Reimage to Trixie
12:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2223: Reimage to Trixie
12:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1185: Reimage to Trixie
12:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Reimage to Trixie
12:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Reimage to Trixie
12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1206 (T419635)', diff saved to https://phabricator.wikimedia.org/P91463 and previous config saved to /var/cache/conftool/dbconfig/20260424-122125-fceratto.json
12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91462 and previous config saved to /var/cache/conftool/dbconfig/20260424-122100-fceratto.json
12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P91461 and previous config saved to /var/cache/conftool/dbconfig/20260424-121902-fceratto.json
12:17 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
12:17 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
12:17 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
12:17 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
12:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P91460 and previous config saved to /var/cache/conftool/dbconfig/20260424-121053-fceratto.json
12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P91458 and previous config saved to /var/cache/conftool/dbconfig/20260424-120854-fceratto.json
12:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P91456 and previous config saved to /var/cache/conftool/dbconfig/20260424-120045-fceratto.json
11:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91455 and previous config saved to /var/cache/conftool/dbconfig/20260424-115845-fceratto.json
11:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2228: after reimage to trixie
11:53 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1159: after reimage to trixie
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91452 and previous config saved to /var/cache/conftool/dbconfig/20260424-115036-fceratto.json
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T419961)', diff saved to https://phabricator.wikimedia.org/P91451 and previous config saved to /var/cache/conftool/dbconfig/20260424-115025-fceratto.json
11:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
11:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91450 and previous config saved to /var/cache/conftool/dbconfig/20260424-114956-fceratto.json
11:44 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P91447 and previous config saved to /var/cache/conftool/dbconfig/20260424-113948-fceratto.json
11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1196 (T419635)', diff saved to https://phabricator.wikimedia.org/P91445 and previous config saved to /var/cache/conftool/dbconfig/20260424-113235-fceratto.json
11:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
11:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
11:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91444 and previous config saved to /var/cache/conftool/dbconfig/20260424-113149-fceratto.json
11:31 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P91443 and previous config saved to /var/cache/conftool/dbconfig/20260424-112939-fceratto.json
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P91440 and previous config saved to /var/cache/conftool/dbconfig/20260424-112141-fceratto.json
11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91439 and previous config saved to /var/cache/conftool/dbconfig/20260424-111931-fceratto.json
11:16 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
11:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2228: after reimage to trixie
11:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P91436 and previous config saved to /var/cache/conftool/dbconfig/20260424-111132-fceratto.json
11:11 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T424175
11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T419961)', diff saved to https://phabricator.wikimedia.org/P91434 and previous config saved to /var/cache/conftool/dbconfig/20260424-111108-fceratto.json
11:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2228.codfw.wmnet with OS trixie
11:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
11:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91433 and previous config saved to /var/cache/conftool/dbconfig/20260424-111039-fceratto.json
11:08 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1159: after reimage to trixie
11:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1159.eqiad.wmnet with OS trixie
11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91431 and previous config saved to /var/cache/conftool/dbconfig/20260424-110125-fceratto.json
11:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P91430 and previous config saved to /var/cache/conftool/dbconfig/20260424-110031-fceratto.json
10:59 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:56 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P91429 and previous config saved to /var/cache/conftool/dbconfig/20260424-105023-fceratto.json
10:48 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2228.codfw.wmnet with reason: host reimage
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1195 (T419635)', diff saved to https://phabricator.wikimedia.org/P91428 and previous config saved to /var/cache/conftool/dbconfig/20260424-104235-fceratto.json
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1195.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91427 and previous config saved to /var/cache/conftool/dbconfig/20260424-104210-fceratto.json
10:41 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
10:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91426 and previous config saved to /var/cache/conftool/dbconfig/20260424-104016-fceratto.json
10:38 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2228.codfw.wmnet with reason: host reimage
10:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1159.eqiad.wmnet with reason: host reimage
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P91424 and previous config saved to /var/cache/conftool/dbconfig/20260424-103202-fceratto.json
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T419961)', diff saved to https://phabricator.wikimedia.org/P91423 and previous config saved to /var/cache/conftool/dbconfig/20260424-103146-fceratto.json
10:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91422 and previous config saved to /var/cache/conftool/dbconfig/20260424-103116-fceratto.json
10:30 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2228.codfw.wmnet with OS trixie
10:22 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1159.eqiad.wmnet with OS trixie
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P91421 and previous config saved to /var/cache/conftool/dbconfig/20260424-102154-fceratto.json
10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2228: Reimage to Trixie
10:21 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1159: Reimage to Trixie
10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2228: Reimage to Trixie
10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Reimage to Trixie
10:21 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1159: Reimage to Trixie
10:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Reimage to Trixie
10:21 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P91418 and previous config saved to /var/cache/conftool/dbconfig/20260424-102108-fceratto.json
10:17 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:15 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
10:12 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91417 and previous config saved to /var/cache/conftool/dbconfig/20260424-101146-fceratto.json
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P91416 and previous config saved to /var/cache/conftool/dbconfig/20260424-101056-fceratto.json
10:02 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T424175
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91415 and previous config saved to /var/cache/conftool/dbconfig/20260424-100047-fceratto.json
09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:57 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1015.eqiad.wmnet on all recursors
09:56 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1015.eqiad.wmnet on all recursors
09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:56 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:56 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1186 (T419635)', diff saved to https://phabricator.wikimedia.org/P91414 and previous config saved to /var/cache/conftool/dbconfig/20260424-095450-fceratto.json
09:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91413 and previous config saved to /var/cache/conftool/dbconfig/20260424-095425-fceratto.json
09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T419961)', diff saved to https://phabricator.wikimedia.org/P91412 and previous config saved to /var/cache/conftool/dbconfig/20260424-095228-fceratto.json
09:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
09:52 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1238.eqiad.wmnet with reason: Maintenance
09:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91411 and previous config saved to /var/cache/conftool/dbconfig/20260424-095159-fceratto.json
09:50 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P91410 and previous config saved to /var/cache/conftool/dbconfig/20260424-094417-fceratto.json
09:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P91409 and previous config saved to /var/cache/conftool/dbconfig/20260424-094151-fceratto.json
09:40 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P91408 and previous config saved to /var/cache/conftool/dbconfig/20260424-093409-fceratto.json
09:34 cmooney@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ml-serve1014.eqiad.wmnet
09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P91407 and previous config saved to /var/cache/conftool/dbconfig/20260424-093143-fceratto.json
09:28 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ml-serve1014.eqiad.wmnet on all recursors
09:28 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache ml-serve1014.eqiad.wmnet on all recursors
09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:27 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:27 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change records for ml-serve1014 - cmooney@cumin1003"
09:24 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91406 and previous config saved to /var/cache/conftool/dbconfig/20260424-092401-fceratto.json
09:23 cmooney@cumin1003: START - Cookbook sre.dns.netbox
09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91405 and previous config saved to /var/cache/conftool/dbconfig/20260424-092135-fceratto.json
09:21 cmooney@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
09:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:16 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T419961)', diff saved to https://phabricator.wikimedia.org/P91404 and previous config saved to /var/cache/conftool/dbconfig/20260424-091316-fceratto.json
09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91403 and previous config saved to /var/cache/conftool/dbconfig/20260424-091237-fceratto.json
09:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1184 (T419635)', diff saved to https://phabricator.wikimedia.org/P91402 and previous config saved to /var/cache/conftool/dbconfig/20260424-090454-fceratto.json
09:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1184.eqiad.wmnet with reason: Maintenance
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91401 and previous config saved to /var/cache/conftool/dbconfig/20260424-090429-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P91400 and previous config saved to /var/cache/conftool/dbconfig/20260424-090229-fceratto.json
09:01 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1015.eqiad.wmnet
08:56 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1015.eqiad.wmnet
08:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P91399 and previous config saved to /var/cache/conftool/dbconfig/20260424-085421-fceratto.json
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P91398 and previous config saved to /var/cache/conftool/dbconfig/20260424-085221-fceratto.json
08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P91397 and previous config saved to /var/cache/conftool/dbconfig/20260424-084414-fceratto.json
08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91396 and previous config saved to /var/cache/conftool/dbconfig/20260424-084213-fceratto.json
08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91395 and previous config saved to /var/cache/conftool/dbconfig/20260424-083406-fceratto.json
08:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419961)', diff saved to https://phabricator.wikimedia.org/P91394 and previous config saved to /var/cache/conftool/dbconfig/20260424-083118-fceratto.json
08:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
08:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91393 and previous config saved to /var/cache/conftool/dbconfig/20260424-083050-fceratto.json
08:29 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:29 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:29 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:27 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1014.eqiad.wmnet
08:24 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:24 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1014.eqiad.wmnet
08:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P91392 and previous config saved to /var/cache/conftool/dbconfig/20260424-082041-fceratto.json
08:19 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:19 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1169 (T419635)', diff saved to https://phabricator.wikimedia.org/P91391 and previous config saved to /var/cache/conftool/dbconfig/20260424-081539-fceratto.json
08:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host hcaptcha-proxy5003.wikimedia.org
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:12 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM hcaptcha-proxy5003.wikimedia.org - jmm@cumin2002"
08:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P91390 and previous config saved to /var/cache/conftool/dbconfig/20260424-081033-fceratto.json
08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
08:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
08:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy5003.wikimedia.org on all recursors
08:08 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy5003.wikimedia.org on all recursors
08:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow5002.eqsin.wmnet
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:04 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:03 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91388 and previous config saved to /var/cache/conftool/dbconfig/20260424-080025-fceratto.json
08:00 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
07:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
07:54 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy5003.wikimedia.org
07:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91386 and previous config saved to /var/cache/conftool/dbconfig/20260424-075145-fceratto.json
07:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
07:50 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
07:45 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts netflow5002.eqsin.wmnet
07:45 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts netflow5002.eqsin.wmnet
06:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db[2142-2143].codfw.wmnet with reason: Cloning
05:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 264595
05:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 264595
05:49 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 58717
05:48 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 58717
05:44 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 20940
05:41 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 20940
05:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 19165
05:40 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 19165
05:33 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2148 from dbctl T424309', diff saved to https://phabricator.wikimedia.org/P91385 and previous config saved to /var/cache/conftool/dbconfig/20260424-053342-marostegui.json
03:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T410589)', diff saved to https://phabricator.wikimedia.org/P91384 and previous config saved to /var/cache/conftool/dbconfig/20260424-033021-ladsgroup.json
03:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
03:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91383 and previous config saved to /var/cache/conftool/dbconfig/20260424-032955-ladsgroup.json
03:19 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P91382 and previous config saved to /var/cache/conftool/dbconfig/20260424-031947-ladsgroup.json
03:09 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P91381 and previous config saved to /var/cache/conftool/dbconfig/20260424-030938-ladsgroup.json
02:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91380 and previous config saved to /var/cache/conftool/dbconfig/20260424-025930-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 32s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
00:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb2013.codfw.wmnet with OS trixie

2026-04-23

23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2014.codfw.wmnet with OS trixie
23:38 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host rdb2013.codfw.wmnet with OS trixie
23:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:00 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
23:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:37 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:34 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:31 ladsgroup@deploy1003: Finished scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032) (duration: 07m 19s)
22:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:28 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
22:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:27 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:26 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:26 ladsgroup@deploy1003: ladsgroup: Backport for QuickView: Fix relying on non-standard sizes (T424032) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:24 ladsgroup@deploy1003: Started scap sync-world: Backport for QuickView: Fix relying on non-standard sizes (T424032)
22:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1011.eqiad.wmnet with OS trixie
22:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:18 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2014.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host rdb2013.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
22:13 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2014
22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2014
22:12 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host rdb2013
22:12 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host rdb2013
22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:08 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
22:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb2013 to codfw - jhancock@cumin2002"
22:03 jhancock@cumin2002: START - Cookbook sre.dns.netbox
21:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
21:48 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1011.eqiad.wmnet with reason: host reimage
21:36 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1011.eqiad.wmnet with OS trixie
21:10 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 05m 47s)
21:06 krinkle@deploy1003: krinkle: Continuing with deployment
21:05 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:04 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
21:03 krinkle@deploy1003: Finished scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) (duration: 03m 05s)
21:03 krinkle@deploy1003: krinkle: Rolling back deployment
21:02 krinkle@deploy1003: krinkle: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:00 krinkle@deploy1003: Started scap sync-world: Backport for ext.wikiEditor: Set background-size for toolbar buttons (T414805)
20:51 cscott@deploy1003: Finished scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) (duration: 06m 02s)
20:47 cscott@deploy1003: cscott: Continuing with deployment
20:47 cscott@deploy1003: cscott: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:45 cscott@deploy1003: Started scap sync-world: Backport for Deploy Parsoid Read Views to banwiki/ganwiki (T423785)
19:28 otto@deploy1003: Finished scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) (duration: 22m 05s)
19:24 otto@deploy1003: xcollazo, otto: Continuing with deployment
19:14 otto@deploy1003: xcollazo, otto: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
19:09 jasmine@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-ctrl[2004-2005].codfw.wmnet
19:09 jasmine@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-ctrl[2004-2005].codfw.wmnet
19:06 jasmine_: ran homer on lsw1-c7-codfw and lsw1-b2-codfw following new control planes (T390861)
19:06 otto@deploy1003: Started scap sync-world: Backport for Remove stream 'mediawiki.dump.revision_content_history.reconcile.rc0' (T417694), EventStreamConfig - add rc0 streams for html and feature count change (T423920)
18:19 jasmine@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
18:13 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Syncing netbox hieradata to fetch BGP for new control planes - jasmine@cumin2002 - T390861"
17:09 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:07 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:07 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
16:46 jasmine@dns1004: END - running authdns-update
16:44 jasmine@dns1004: START - running authdns-update
16:39 jasmine@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: Downtiming to avoid page in case of race condition
16:29 herron@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
16:26 ladsgroup@deploy1003: Finished scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) (duration: 05m 53s)
16:22 ladsgroup@deploy1003: ladsgroup: Continuing with deployment
16:22 ladsgroup@deploy1003: ladsgroup: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:20 ladsgroup@deploy1003: Started scap sync-world: Backport for Media: Fallback to the largest standard size if an overly large one is requested (T418745 T423895)
16:16 Amir1: re-enabling general ban on any non-standard thumb (T414805)
16:13 klausman@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
16:13 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
16:12 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
16:11 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
16:10 herron@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-logging-codfw cluster: Change Confluent distribution.
15:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5004.eqsin.wmnet
15:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5004.eqsin.wmnet with OS bookworm
15:48 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1276017'" T420604. finish rollout of removing CSP in VCL from beta
15:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
15:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5004.eqsin.wmnet with reason: host reimage
15:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T410589)', diff saved to https://phabricator.wikimedia.org/P91378 and previous config saved to /var/cache/conftool/dbconfig/20260423-152514-ladsgroup.json
15:25 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
15:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91377 and previous config saved to /var/cache/conftool/dbconfig/20260423-152450-ladsgroup.json
15:16 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
15:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P91375 and previous config saved to /var/cache/conftool/dbconfig/20260423-151441-ladsgroup.json
15:07 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T424175
15:06 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
15:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P91374 and previous config saved to /var/cache/conftool/dbconfig/20260423-150433-ladsgroup.json
15:03 moritzm: installing rsync security updates
14:57 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T424175
14:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91373 and previous config saved to /var/cache/conftool/dbconfig/20260423-145425-ladsgroup.json
14:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5004.eqsin.wmnet with OS bookworm
14:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5004.eqsin.wmnet on all recursors
14:46 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5004.eqsin.wmnet on all recursors
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5004.eqsin.wmnet - jmm@cumin2002"
14:42 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:42 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5004.eqsin.wmnet
14:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir5003.eqsin.wmnet
14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir5003.eqsin.wmnet with OS bookworm
14:34 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
14:33 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5002.eqsin.wmnet
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
14:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
14:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
14:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5003.eqsin.wmnet with reason: host reimage
14:10 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2145.codfw.wmnet
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:06 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:06 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2145.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
14:00 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5002.eqsin.wmnet
13:59 marostegui@cumin1003: START - Cookbook sre.dns.netbox
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy5001.eqsin.wmnet
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
13:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy5001.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
13:52 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts db2145.codfw.wmnet
13:39 Lucas_WMDE: UTC afternoon backport+config window done
13:36 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749) (duration: 06m 11s)
13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Continuing with deployment
13:32 lucaswerkmeister-wmde@deploy1003: mhorsey, lucaswerkmeister-wmde: Backport for Enable the CampaignEvents extension on incubator (T421749) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:30 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Enable the CampaignEvents extension on incubator (T421749)
13:28 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir5003.eqsin.wmnet with OS bookworm
13:27 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy5001.eqsin.wmnet
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir5003.eqsin.wmnet on all recursors
13:25 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir5003.eqsin.wmnet on all recursors
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91370 and previous config saved to /var/cache/conftool/dbconfig/20260423-132311-fceratto.json
13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
13:22 aude@deploy1003: Finished scap sync-world: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881) (duration: 06m 42s)
13:22 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
13:21 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
13:18 aude@deploy1003: cscott, aude: Continuing with deployment
13:16 aude@deploy1003: cscott, aude: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:15 aude@deploy1003: Started scap sync-world: Backport for Parsoid Read Views: 100% rollout to Russian Wikipedia (T423188), Opt-in new accounts to ReadingLists beta feature on all Wikipedia wikis (T420881)
13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to https://phabricator.wikimedia.org/P91369 and previous config saved to /var/cache/conftool/dbconfig/20260423-131303-fceratto.json
13:12 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir5003.eqsin.wmnet - jmm@cumin2002"
13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
13:05 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2009.codfw.wmnet with OS bullseye
13:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048', diff saved to https://phabricator.wikimedia.org/P91368 and previous config saved to /var/cache/conftool/dbconfig/20260423-130255-fceratto.json
13:01 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
13:01 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1015.eqiad.wmnet with reason: Decommissioning — T412830
13:00 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:00 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir5003.eqsin.wmnet
13:00 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
12:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow5003.eqsin.wmnet
12:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow5003.eqsin.wmnet with OS bookworm
12:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91367 and previous config saved to /var/cache/conftool/dbconfig/20260423-125247-fceratto.json
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2048 (T419961)', diff saved to https://phabricator.wikimedia.org/P91366 and previous config saved to /var/cache/conftool/dbconfig/20260423-124535-fceratto.json
12:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2048.codfw.wmnet with reason: Maintenance
12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91365 and previous config saved to /var/cache/conftool/dbconfig/20260423-124504-fceratto.json
12:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2009.codfw.wmnet with reason: host reimage
12:38 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) (duration: 06m 25s)
12:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
12:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P91363 and previous config saved to /var/cache/conftool/dbconfig/20260423-123456-fceratto.json
12:34 kharlan@deploy1003: kharlan: Continuing with deployment
12:33 kharlan@deploy1003: kharlan: Backport for hCaptcha: Retry SiteVerify up to two times (T421204) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:32 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Retry SiteVerify up to two times (T421204)
12:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow5003.eqsin.wmnet with reason: host reimage
12:30 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) (duration: 06m 57s)
12:26 kharlan@deploy1003: kharlan: Continuing with deployment
12:24 kharlan@deploy1003: kharlan: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040', diff saved to https://phabricator.wikimedia.org/P91362 and previous config saved to /var/cache/conftool/dbconfig/20260423-122448-fceratto.json
12:23 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Disable Private Access Tokens in secure-api URL (T424216)
12:19 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) (duration: 08m 11s)
12:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2009.codfw.wmnet with OS bullseye
12:15 kharlan@deploy1003: harroyo-wmf, kharlan: Continuing with deployment
12:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91361 and previous config saved to /var/cache/conftool/dbconfig/20260423-121439-fceratto.json
12:12 kharlan@deploy1003: harroyo-wmf, kharlan: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:11 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Don't prevent opening links present in the hCaptcha popup (T408812)
12:08 kart_: staging: Update cxserver to 2026-04-23-114216-production (T423002)
12:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2040 (T419961)', diff saved to https://phabricator.wikimedia.org/P91360 and previous config saved to /var/cache/conftool/dbconfig/20260423-120400-fceratto.json
12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2040.codfw.wmnet with reason: Maintenance
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91359 and previous config saved to /var/cache/conftool/dbconfig/20260423-120332-fceratto.json
12:00 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:00 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
11:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P91358 and previous config saved to /var/cache/conftool/dbconfig/20260423-115324-fceratto.json
11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow5003.eqsin.wmnet with OS bookworm
11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
11:44 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
11:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039', diff saved to https://phabricator.wikimedia.org/P91357 and previous config saved to /var/cache/conftool/dbconfig/20260423-114316-fceratto.json
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow5003.eqsin.wmnet on all recursors
11:42 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow5003.eqsin.wmnet on all recursors
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:42 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:40 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow5003.eqsin.wmnet - jmm@cumin2002"
11:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow5003.eqsin.wmnet
11:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91356 and previous config saved to /var/cache/conftool/dbconfig/20260423-113307-fceratto.json
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2039 (T419961)', diff saved to https://phabricator.wikimedia.org/P91355 and previous config saved to /var/cache/conftool/dbconfig/20260423-112133-fceratto.json
11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2039.codfw.wmnet with reason: Maintenance
11:21 moritzm: installing ngtcp2 security updates
11:20 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
11:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
11:13 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
11:13 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 11m 55s)
11:13 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5004.wikimedia.org
11:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5004.wikimedia.org with OS bookworm
11:08 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
11:08 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91354 and previous config saved to /var/cache/conftool/dbconfig/20260423-110359-fceratto.json
11:01 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
11:00 hnowlan@deploy1003: Finished deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change) (duration: 33m 20s)
10:57 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2008.codfw.wmnet with OS bullseye
10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P91353 and previous config saved to /var/cache/conftool/dbconfig/20260423-105351-fceratto.json
10:50 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5004.wikimedia.org with reason: host reimage
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037', diff saved to https://phabricator.wikimedia.org/P91352 and previous config saved to /var/cache/conftool/dbconfig/20260423-104343-fceratto.json
10:42 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:39 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
10:37 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:33 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91351 and previous config saved to /var/cache/conftool/dbconfig/20260423-103334-fceratto.json
10:32 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2008.codfw.wmnet with reason: host reimage
10:27 hnowlan@deploy1003: Started deploy [restbase/deploy@8a25036]: Add urwikisource T415975 (repeat attempt, last deploy did not include change)
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
10:24 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
10:23 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
10:21 isaranto@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:20 isaranto@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repool pc2 with pc2022 as codfw master T418973', diff saved to https://phabricator.wikimedia.org/P91348 and previous config saved to /var/cache/conftool/dbconfig/20260423-101957-marostegui.json
10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
10:19 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es1037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91347 and previous config saved to /var/cache/conftool/dbconfig/20260423-101855-fceratto.json
10:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1037.eqiad.wmnet with reason: Maintenance
10:17 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
10:16 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
10:16 daniel@deploy1003: Finished scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) (duration: 07m 37s)
10:16 marostegui@cumin1003: dbctl commit (dc=all): 'Make pc2022 master of pc2 T418973', diff saved to https://phabricator.wikimedia.org/P91346 and previous config saved to /var/cache/conftool/dbconfig/20260423-101611-marostegui.json
10:15 marostegui@cumin1003: dbctl commit (dc=all): 'Add pc2022, remove pc2012 T418973 T424201', diff saved to https://phabricator.wikimedia.org/P91345 and previous config saved to /var/cache/conftool/dbconfig/20260423-101544-marostegui.json
10:15 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
10:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
10:14 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
10:14 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
10:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
10:13 ayounsi@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
10:12 daniel@deploy1003: daniel: Continuing with deployment
10:10 daniel@deploy1003: daniel: Backport for api rate limits: use global apihighlimits-requestor group. (T419796) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
10:10 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2008.codfw.wmnet with OS bullseye
10:08 daniel@deploy1003: Started scap sync-world: Backport for api rate limits: use global apihighlimits-requestor group. (T419796)
10:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5004.wikimedia.org with OS bookworm
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91343 and previous config saved to /var/cache/conftool/dbconfig/20260423-100035-fceratto.json
09:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
09:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5004.wikimedia.org - jmm@cumin2002"
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5004.wikimedia.org on all recursors
09:58 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5004.wikimedia.org on all recursors
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
09:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P91341 and previous config saved to /var/cache/conftool/dbconfig/20260423-095027-fceratto.json
09:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047', diff saved to https://phabricator.wikimedia.org/P91340 and previous config saved to /var/cache/conftool/dbconfig/20260423-094019-fceratto.json
09:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5004.wikimedia.org - jmm@cumin2002"
09:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91339 and previous config saved to /var/cache/conftool/dbconfig/20260423-093010-fceratto.json
09:27 aikochou@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5004.wikimedia.org
09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2047 (T419961)', diff saved to https://phabricator.wikimedia.org/P91338 and previous config saved to /var/cache/conftool/dbconfig/20260423-092303-fceratto.json
09:22 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2047.codfw.wmnet with reason: Maintenance
09:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91337 and previous config saved to /var/cache/conftool/dbconfig/20260423-092232-fceratto.json
09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5003.wikimedia.org
09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh5003.wikimedia.org with OS bookworm
09:17 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P91336 and previous config saved to /var/cache/conftool/dbconfig/20260423-091224-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037', diff saved to https://phabricator.wikimedia.org/P91335 and previous config saved to /var/cache/conftool/dbconfig/20260423-090216-fceratto.json
09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2146 from dbctl T424179', diff saved to https://phabricator.wikimedia.org/P91334 and previous config saved to /var/cache/conftool/dbconfig/20260423-090014-marostegui.json
08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5002.eqsin.wmnet
08:58 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=tcp-proxy5001.eqsin.wmnet
08:56 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5004.eqsin.wmnet
08:56 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5004.eqsin.wmnet
08:55 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5003.wikimedia.org with reason: host reimage
08:53 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy5003.eqsin.wmnet
08:52 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy5003.eqsin.wmnet
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91333 and previous config saved to /var/cache/conftool/dbconfig/20260423-085207-fceratto.json
08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling es2037 (T419961)', diff saved to https://phabricator.wikimedia.org/P91330 and previous config saved to /var/cache/conftool/dbconfig/20260423-084035-fceratto.json
08:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2037.codfw.wmnet with reason: Maintenance
08:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
08:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2007.codfw.wmnet with OS bullseye
08:06 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh5003.wikimedia.org with OS bookworm
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
08:05 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh5003.wikimedia.org - jmm@cumin2002"
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh5003.wikimedia.org on all recursors
08:05 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh5003.wikimedia.org on all recursors
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
08:04 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh5003.wikimedia.org - jmm@cumin2002"
08:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:01 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh5003.wikimedia.org
07:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
07:45 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2007.codfw.wmnet with reason: host reimage
07:22 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2007.codfw.wmnet with OS bullseye
07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2145 from dbctl T424177', diff saved to https://phabricator.wikimedia.org/P91329 and previous config saved to /var/cache/conftool/dbconfig/20260423-071500-marostegui.json
06:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1 with db2252 as new codfw master T418979', diff saved to https://phabricator.wikimedia.org/P91328 and previous config saved to /var/cache/conftool/dbconfig/20260423-065803-marostegui.json
06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2252: Cloning
06:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
06:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2252: Cloning
06:53 marostegui@cumin1003: dbctl commit (dc=all): 'Make db2252 master of ms3 T418979', diff saved to https://phabricator.wikimedia.org/P91327 and previous config saved to /var/cache/conftool/dbconfig/20260423-065323-marostegui.json
06:52 marostegui@cumin1003: dbctl commit (dc=all): 'Remove db2143 from ms3, add db2252 T418979', diff saved to https://phabricator.wikimedia.org/P91326 and previous config saved to /var/cache/conftool/dbconfig/20260423-065214-marostegui.json
06:28 jelto: gerrit2003 maintenance finished - T333143
06:05 jelto: start gerrit2003 maintenance - T333143
05:57 jelto@cumin1003: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:35:00 on gerrit.discovery.wmnet with reason: Gerrit maintenance
05:57 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:35:00 on gerrit2003.wikimedia.org with reason: Gerrit maintenance
05:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2143,2252].codfw.wmnet,db1153.eqiad.wmnet with reason: Cloning
05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Cloning db2252 from db2143
05:41 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:41 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:41 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Cloning db2252 from db2143
05:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet,pc1012.eqiad.wmnet with reason: Cloning
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: Cloning pc2022 from pc2012
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:28 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:28 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: Cloning pc2022 from pc2012
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc[2012,2022].codfw.wmnet with reason: Cloning
03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T410589)', diff saved to https://phabricator.wikimedia.org/P91321 and previous config saved to /var/cache/conftool/dbconfig/20260423-031538-ladsgroup.json
03:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
03:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91320 and previous config saved to /var/cache/conftool/dbconfig/20260423-031512-ladsgroup.json
03:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P91319 and previous config saved to /var/cache/conftool/dbconfig/20260423-030504-ladsgroup.json
02:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P91318 and previous config saved to /var/cache/conftool/dbconfig/20260423-025455-ladsgroup.json
02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91317 and previous config saved to /var/cache/conftool/dbconfig/20260423-024447-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-22

15:40 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
15:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T410589)', diff saved to https://phabricator.wikimedia.org/P91315 and previous config saved to /var/cache/conftool/dbconfig/20260422-150817-ladsgroup.json
15:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
15:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91314 and previous config saved to /var/cache/conftool/dbconfig/20260422-150752-ladsgroup.json
14:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P91313 and previous config saved to /var/cache/conftool/dbconfig/20260422-145744-ladsgroup.json
14:47 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P91312 and previous config saved to /var/cache/conftool/dbconfig/20260422-144736-ladsgroup.json
14:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91311 and previous config saved to /var/cache/conftool/dbconfig/20260422-143728-ladsgroup.json
11:59 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
11:58 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
11:41 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/zotero: apply
11:41 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/zotero: apply
11:36 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/zotero: apply
11:36 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/zotero: apply
11:26 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
11:26 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
11:25 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:24 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:22 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:22 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:12 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
11:12 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
11:08 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
11:07 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
11:06 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:06 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
10:27 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2141,2250].codfw.wmnet with reason: clone
07:23 samwilson@deploy1003: Finished scap sync-world: Backport for Use canvas rather than webgl for OpenSeadragon (T423548) (duration: 08m 31s)
07:17 samwilson@deploy1003: samwilson: Continuing with deployment
07:16 samwilson@deploy1003: samwilson: Backport for Use canvas rather than webgl for OpenSeadragon (T423548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:14 samwilson@deploy1003: Started scap sync-world: Backport for Use canvas rather than webgl for OpenSeadragon (T423548)
04:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
04:19 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
03:03 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T410589)', diff saved to https://phabricator.wikimedia.org/P91310 and previous config saved to /var/cache/conftool/dbconfig/20260422-030300-ladsgroup.json
03:02 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91309 and previous config saved to /var/cache/conftool/dbconfig/20260422-030235-ladsgroup.json
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P91308 and previous config saved to /var/cache/conftool/dbconfig/20260422-025227-ladsgroup.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P91307 and previous config saved to /var/cache/conftool/dbconfig/20260422-024219-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91306 and previous config saved to /var/cache/conftool/dbconfig/20260422-023211-ladsgroup.json
02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab2003.codfw.wmnet with OS trixie
02:17 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
02:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 06s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
01:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab2003.codfw.wmnet with reason: host reimage
01:37 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host phab2003.codfw.wmnet with OS trixie
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:55 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2026-04-21

23:15 denisse@deploy1003: Finished deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 - T423229 (duration: 00m 18s)
23:15 denisse@deploy1003: Started deploy [librenms/librenms@4a0466d]: Upgrade LibreNMS to 26.4.0 - T423229
22:37 musikanimal@deploy1003: Finished scap sync-world: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|H
22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1015.eqiad.wmnet with OS trixie
22:25 musikanimal@deploy1003: musikanimal: Continuing with deployment
22:19 musikanimal@deploy1003: musikanimal: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|Hooks: remove
22:02 musikanimal@deploy1003: Started scap sync-world: Backport for Promote CM6 out of beta, remove CM5 modules, and add v6 aliases (T373720), VisualEditor.CodeMirror.less: remove CM5 styles, CodeEditorHooks: remove temporary code for CodeMirror beta feature (T419332), DescriptionField: use new module name for loading CodeMirror, [[gerrit:1275998|Ho
21:58 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
21:57 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
21:47 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1016.eqiad.wmnet with OS trixie
21:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1016.eqiad.wmnet with OS trixie
21:29 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:17 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:12 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host rdb1015.eqiad.wmnet with OS trixie
21:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:05 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:57 musikanimal@deploy1003: Finished scap sync-world: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288) (duration: 06m 27s)
20:53 musikanimal@deploy1003: musikanimal: Continuing with deployment
20:53 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:52 jclark@cumin1003: START - Cookbook sre.hosts.provision for host rdb1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
20:52 musikanimal@deploy1003: musikanimal: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:51 musikanimal@deploy1003: Started scap sync-world: Backport for mw.FormDataTransport.test: Update expected API call for POSTed calls (T423529 T421288)
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
20:49 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding rdb1016 to eqiad - jclark@cumin1003"
20:44 dancy@deploy1003: Installation of scap version "4.250.1" completed for 2 hosts
20:42 dancy@deploy1003: Installing scap version "4.250.1" for 2 host(s)
20:35 jclark@cumin1003: START - Cookbook sre.dns.netbox
20:28 Dreamy_Jazz: Evening UTC backport window done
20:16 Dreamy_Jazz: Running `mwscript-k8s maintenance/namespaceDupes.php --wiki=diqwiki --fix`
20:15 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084) (duration: 07m 38s)
20:11 dreamyjazz@deploy1003: pppery, dreamyjazz: Continuing with sync
20:09 dreamyjazz@deploy1003: pppery, dreamyjazz: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:08 dreamyjazz@deploy1003: Started scap sync-world: Backport for Diqwiki: change project namespace (T328207), Remove unused wgCheckUserUserAgentTableMigrationStage config, CheckUser Suggested Investigations: Enable on commonswiki (T424084)
20:07 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
20:05 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
19:49 jasmine@dns1004: END - running authdns-update
19:47 jasmine@dns1004: START - running authdns-update
19:37 mutante: contint1003 - re-enabling puppet T418521
19:32 Dreamy_Jazz: Created cusi_user, cusi_case, and cusi_signal on commonswiki on the extension1 database cluster - T424084
18:02 dancy@deploy1003: Finished scap sync-world: Testing (duration: 02m 58s)
17:59 dancy@deploy1003: Started scap sync-world: Testing
17:58 dancy@deploy1003: Installation of scap version "4.250.0" completed for 2 hosts
17:56 dancy@deploy1003: Installing scap version "4.250.0" for 2 host(s)
17:42 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1275464 T423623 (duration: 02m 30s)
17:41 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1275464 T423623
17:01 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host thanos-be2005.codfw.wmnet with OS bullseye
17:00 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
16:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
16:37 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2005.codfw.wmnet with reason: host reimage
16:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2005.codfw.wmnet with OS bullseye
16:23 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab1004 for T424059 (duration: 00m 38s)
16:22 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab1004 for T424059
16:22 brennen@deploy1003: Finished deploy [phabricator/deployment@ceeecba]: deploy phab2002 for T424059 (duration: 00m 47s)
16:21 brennen@deploy1003: Started deploy [phabricator/deployment@ceeecba]: deploy phab2002 for T424059
15:58 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus5003.eqsin.wmnet
15:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus5003.eqsin.wmnet with OS bookworm
15:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
15:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus5003.eqsin.wmnet with reason: host reimage
15:39 moritzm: installing busybox updates from Trixie point release
15:05 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for T424033 (duration: 00m 43s)
15:04 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phorge Deploy
15:04 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab1004 for T424033
15:04 brennen@deploy1003: Finished deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for T424033 (duration: 00m 44s)
15:03 brennen@deploy1003: Started deploy [phabricator/deployment@ce0ec30]: deploy phab2002 for T424033
15:01 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phorge Deploy
15:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus5003.eqsin.wmnet with OS bookworm
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T410589)', diff saved to https://phabricator.wikimedia.org/P91305 and previous config saved to /var/cache/conftool/dbconfig/20260421-150025-ladsgroup.json
15:00 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91304 and previous config saved to /var/cache/conftool/dbconfig/20260421-145959-ladsgroup.json
14:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5003.eqsin.wmnet on all recursors
14:55 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus5003.eqsin.wmnet on all recursors
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5003.eqsin.wmnet - jmm@cumin2002"
14:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1009.eqiad.wmnet with OS bullseye
14:53 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host sretest2010
14:51 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host sretest2010
14:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
14:51 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus5003.eqsin.wmnet
14:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P91303 and previous config saved to /var/cache/conftool/dbconfig/20260421-144951-ladsgroup.json
14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5004.eqsin.wmnet
14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5004.eqsin.wmnet with OS trixie
14:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P91302 and previous config saved to /var/cache/conftool/dbconfig/20260421-143943-ladsgroup.json
14:39 cscott@deploy1003: Finished scap sync-world: Backport for Increase Parsoid Read Views percentage for ruwiki to 55% (duration: 09m 37s)
14:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
14:35 cscott@deploy1003: cscott: Continuing with sync
14:34 papaul: moving OOB link on mr1-eqiad to ge-0/0/7
14:32 moritzm: installing gdk-pixbuf security updates
14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repool ms1 T418979', diff saved to https://phabricator.wikimedia.org/P91301 and previous config saved to /var/cache/conftool/dbconfig/20260421-143145-marostegui.json
14:31 cscott@deploy1003: cscott: Backport for Increase Parsoid Read Views percentage for ruwiki to 55% synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
14:30 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1009.eqiad.wmnet with reason: host reimage
14:30 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251 to ms1 T418979', diff saved to https://phabricator.wikimedia.org/P91300 and previous config saved to /var/cache/conftool/dbconfig/20260421-143017-marostegui.json
14:29 cscott@deploy1003: Started scap sync-world: Backport for Increase Parsoid Read Views percentage for ruwiki to 55%
14:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91299 and previous config saved to /var/cache/conftool/dbconfig/20260421-142935-ladsgroup.json
14:29 marostegui@cumin1003: dbctl commit (dc=all): 'Add db2251, remove db2142 T418979', diff saved to https://phabricator.wikimedia.org/P91298 and previous config saved to /var/cache/conftool/dbconfig/20260421-142913-marostegui.json
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
14:22 cscott@deploy1003: Finished scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747) (duration: 13m 02s)
14:21 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5004.eqsin.wmnet with reason: host reimage
14:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15 days, 0:00:00 on db2142.codfw.wmnet,pc2011.codfw.wmnet with reason: Will be decommissioned
14:16 cscott@deploy1003: cscott: Continuing with sync
14:11 cscott@deploy1003: cscott: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747) synced to the testservers (see https://wikit
14:10 cscott@deploy1003: Started scap sync-world: Backport for Bump wikimedia/parsoid to 0.23.0-a28 (T420102 T421680 T422879 T422966 T423192 T423763 T423662), Bump wikimedia/parsoid to 0.23.0-a28 (T423662), [tests] add ParsoidLanguageConverterTest, ParsoidLanguageConverter: update lang/dir on content wrapper div (T423747)
14:08 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1009.eqiad.wmnet with OS bullseye
13:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Cloning
13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2142: Cloning
13:55 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
13:55 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
13:55 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2142: Cloning
13:53 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
13:53 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host atlas5001.wikimedia.org
13:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:52 stran@deploy1003: Finished scap sync-world: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:127583
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) atlas5001.wikimedia.org on all recursors
13:51 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache atlas5001.wikimedia.org on all recursors
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:50 jayme@cumin1003: START - Cookbook sre.dns.netbox
13:49 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM atlas5001.wikimedia.org - ayounsi@cumin1003"
13:45 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
13:45 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host atlas5001.wikimedia.org
13:44 isaranto@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
13:44 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
13:40 stran@deploy1003: stran: Continuing with sync
13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
13:38 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin remove old sandbox vlan - ayounsi@cumin1003"
13:37 stran@deploy1003: stran: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:1275836|Add next steps pa
13:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
13:30 ayounsi@dns1004: END - running authdns-update
13:29 ayounsi@dns1004: START - running authdns-update
13:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5004.eqsin.wmnet with OS trixie
13:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5004.eqsin.wmnet on all recursors
13:26 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5004.eqsin.wmnet on all recursors
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:26 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5004.eqsin.wmnet - jmm@cumin2002"
13:20 stran@deploy1003: Started scap sync-world: Backport for Enable non-emergency categories via config (T423244), Add next steps page for non-emergency "sockpuppetry" incidents (T423045), Add next steps page for non-emergency "vandalism" incidents (T423563), Add next steps page for non-emergency "user dispute" incidents (T423587), [[gerrit:1275836
13:16 aude@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881) (duration: 06m 50s)
13:13 jmm@cumin2002: START - Cookbook sre.dns.netbox
13:13 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5004.eqsin.wmnet
13:12 aude@deploy1003: aude: Continuing with sync
13:11 aude@deploy1003: aude: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:09 aude@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to the ReadingLists beta feature on enwiki (T420881)
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy5003.eqsin.wmnet
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy5003.eqsin.wmnet with OS trixie
13:08 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:06 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1029,1089,1092,1098-1099,1106,1112].eqiad.wmnet
13:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1008.eqiad.wmnet with OS bullseye
13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
13:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
12:56 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 33m 37s)
12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
12:53 moritzm: update firmware on puppetserver1002: NIC from 22.31.6 to 23.21.6 T423282
12:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
12:49 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
12:47 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
12:45 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy5003.eqsin.wmnet with reason: host reimage
12:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
12:44 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
12:39 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
12:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1008.eqiad.wmnet with reason: host reimage
12:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
12:29 moritzm: update firmware on puppetserver1002: BIOS from 1.9.2 to 1.20.2 T423282
12:28 moritzm: update firmware on puppetserver1002: idrac from 6.10.30.20 to 7.20.80.50 T423282
12:23 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
12:22 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:15 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts puppetserver1002.eqiad.wmnet
12:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1008.eqiad.wmnet with OS bullseye
12:06 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Pool back pc1 but with pc2021 replacing pc2011', diff saved to https://phabricator.wikimedia.org/P91287 and previous config saved to /var/cache/conftool/dbconfig/20260421-120206-marostegui.json
11:58 jiji@deploy1003: Unlocked for deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie (duration: 68m 02s)
11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
11:57 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
11:56 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2021: Pool pc2021 into pc
11:56 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
11:56 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:56 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc2021: Pool pc2021 into pc
11:53 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:52 marostegui@cumin1003: dbctl commit (dc=all): 'add pc2021 to pc1', diff saved to https://phabricator.wikimedia.org/P91286 and previous config saved to /var/cache/conftool/dbconfig/20260421-115209-marostegui.json
11:50 moritzm: installing Tornado security updates
11:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Remove pc2011 and add pc2021 as replacement', diff saved to https://phabricator.wikimedia.org/P91285 and previous config saved to /var/cache/conftool/dbconfig/20260421-114718-marostegui.json
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:45 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
11:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
11:41 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
11:41 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:41 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
11:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
11:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91283 and previous config saved to /var/cache/conftool/dbconfig/20260421-113927-fceratto.json
11:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
11:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: test current diff - jmm@cumin2002"
11:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
11:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
11:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91282 and previous config saved to /var/cache/conftool/dbconfig/20260421-113010-fceratto.json
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P91281 and previous config saved to /var/cache/conftool/dbconfig/20260421-112919-fceratto.json
11:27 klausman@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: apply
11:26 klausman@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: apply
11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2143: repool after maintenance
11:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
11:25 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:25 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: repool after maintenance
11:24 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2143: after reimage to trixie
11:24 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2143: after reimage to trixie
11:23 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2143.codfw.wmnet with OS trixie
11:21 klausman@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: apply
11:21 klausman@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: apply
11:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91279 and previous config saved to /var/cache/conftool/dbconfig/20260421-112001-fceratto.json
11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P91278 and previous config saved to /var/cache/conftool/dbconfig/20260421-111911-fceratto.json
11:11 claime: Enabling puppet on A:cp to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/1271804 - T422804
11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91277 and previous config saved to /var/cache/conftool/dbconfig/20260421-110954-fceratto.json
11:09 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
11:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91276 and previous config saved to /var/cache/conftool/dbconfig/20260421-110903-fceratto.json
11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
11:07 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
11:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
10:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
10:59 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91275 and previous config saved to /var/cache/conftool/dbconfig/20260421-105945-fceratto.json
10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host tcp-proxy5003.eqsin.wmnet
10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host tcp-proxy5003.eqsin.wmnet with OS trixie
10:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:56 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2143.codfw.wmnet with reason: host reimage
10:51 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
10:50 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:50 jiji@deploy1003: Locking from deployment [ALL REPOSITORIES]: Upgrading mw-mcrouter - effie
10:49 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
10:49 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
10:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1007.eqiad.wmnet with OS bullseye
10:47 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
10:44 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
10:43 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T419961)', diff saved to https://phabricator.wikimedia.org/P91274 and previous config saved to /var/cache/conftool/dbconfig/20260421-103945-fceratto.json
10:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
10:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91273 and previous config saved to /var/cache/conftool/dbconfig/20260421-103915-fceratto.json
10:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2143.codfw.wmnet with OS trixie
10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2143: Reimage to Trixie
10:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:37 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
10:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2143: Reimage to Trixie
10:37 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet with reason: Reimage to Trixie
10:36 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
10:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
10:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P91271 and previous config saved to /var/cache/conftool/dbconfig/20260421-102907-fceratto.json
10:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1007.eqiad.wmnet with reason: host reimage
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P91270 and previous config saved to /var/cache/conftool/dbconfig/20260421-101857-fceratto.json
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
10:13 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
10:12 lucaswerkmeister-wmde@deploy1003: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
10:10 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91269 and previous config saved to /var/cache/conftool/dbconfig/20260421-100849-fceratto.json
10:07 claime: Disabling puppet on A:cp to merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/1271804 - T422804
10:03 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy5003.eqsin.wmnet with OS trixie
10:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T419961)', diff saved to https://phabricator.wikimedia.org/P91268 and previous config saved to /var/cache/conftool/dbconfig/20260421-100051-fceratto.json
10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
10:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
10:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1007.eqiad.wmnet with OS bullseye
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91267 and previous config saved to /var/cache/conftool/dbconfig/20260421-095928-fceratto.json
09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
09:55 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:54 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1198: Security update
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy5003.eqsin.wmnet on all recursors
09:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy5003.eqsin.wmnet on all recursors
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy5003.eqsin.wmnet - jmm@cumin2002"
09:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:50 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy5003.eqsin.wmnet
09:47 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:45 moritzm: updating debdeploy on trixie to 0.0.99.15
09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: repool after maintenance
09:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
09:39 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
09:38 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:37 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: repool after maintenance
09:37 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: repool after maintenance
09:35 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1153: after reimage to trixie
09:35 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1153: after reimage to trixie
09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:34 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
09:34 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1153.eqiad.wmnet with OS trixie
09:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91262 and previous config saved to /var/cache/conftool/dbconfig/20260421-093401-fceratto.json
09:26 moritzm: imported debdeploy 0.0.99.15 for trixie-wikimedia (compat release for Cumin 6)
09:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91260 and previous config saved to /var/cache/conftool/dbconfig/20260421-092352-fceratto.json
09:21 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1198: Security update
09:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419961)', diff saved to https://phabricator.wikimedia.org/P91259 and previous config saved to /var/cache/conftool/dbconfig/20260421-091949-fceratto.json
09:17 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91258 and previous config saved to /var/cache/conftool/dbconfig/20260421-091344-fceratto.json
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T419961)', diff saved to https://phabricator.wikimedia.org/P91257 and previous config saved to /var/cache/conftool/dbconfig/20260421-091124-fceratto.json
09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
09:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
09:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
09:05 jayme: kubectl delete node $(nodeset -e wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096-1112,1166-1168].eqiad.wmnet) - T423863
09:05 fabfur: restarting pybal on lvs1019-1020 to clear alerts
09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T419961)', diff saved to https://phabricator.wikimedia.org/P91256 and previous config saved to /var/cache/conftool/dbconfig/20260421-090358-fceratto.json
09:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
09:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91255 and previous config saved to /var/cache/conftool/dbconfig/20260421-090336-fceratto.json
09:01 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1153.eqiad.wmnet with reason: host reimage
09:00 jayme: homer 'asw2-a-eqiad.mgmt.eqiad.wmnet' commit - T423863
09:00 jayme: homer 'asw2-b-eqiad.mgmt.eqiad.wmnet' commit - T423863
08:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:51 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1157: Security update
08:50 jayme: homer 'cr*eqiad*' commit - T423863
08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
08:46 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1153.eqiad.wmnet with OS trixie
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Reimage to Trixie
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:45 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
08:45 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Reimage to Trixie
08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1153.eqiad.wmnet with reason: Reimage to Trixie
08:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2143.codfw.wmnet,db1153.eqiad.wmnet with reason: Reimage to Trixie
08:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti5006.eqsin.wmnet to cluster eqsin02 and group 01
08:40 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
08:40 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:39 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
08:39 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:39 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:39 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:38 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:32 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:32 moritzm: installing gst-plugins-base1.0 security updates
08:32 jayme@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
08:32 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:32 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:31 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikikube-worker[1102-1112,1166-1168].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jayme@cumin1003"
08:27 jayme@cumin1003: START - Cookbook sre.dns.netbox
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1157: Security update
08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
08:18 musikanimal@deploy1003: Finished scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) (duration: 07m 01s)
08:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91250 and previous config saved to /var/cache/conftool/dbconfig/20260421-081717-fceratto.json
08:14 elukey: bootstrapping pki intermediate discovery2026
08:14 musikanimal@deploy1003: musikanimal: Continuing with sync
08:12 musikanimal@deploy1003: musikanimal: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
08:10 musikanimal@deploy1003: Started scap sync-world: Backport for ext.abuseFilter.edit.js: temporary locking of CodeMirror lineWrapping (T423773 T423756)
08:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91249 and previous config saved to /var/cache/conftool/dbconfig/20260421-080936-fceratto.json
08:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
08:06 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1006.eqiad.wmnet with OS bullseye
08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
08:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91248 and previous config saved to /var/cache/conftool/dbconfig/20260421-080314-fceratto.json
08:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
07:51 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s4
07:51 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s4
07:49 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet,service=s6
07:49 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1025.eqiad.wmnet,service=s6
07:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be1064.eqiad.wmnet
07:48 mvernon@cumin2002: START - Cookbook sre.hosts.remove-downtime for ms-be1064.eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1102-1112,1166-1168].eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1062-1063,1082-1083,1088-1092,1096-1101].eqiad.wmnet
07:46 jayme@cumin1003: START - Cookbook sre.hosts.decommission for hosts wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1061].eqiad.wmnet
07:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Reimage to Trixie
07:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:39 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:39 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Reimage to Trixie
07:38 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1006.eqiad.wmnet with reason: host reimage
07:17 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1064.eqiad.wmnet with reason: vacuum overlarge container dbs
07:16 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1006.eqiad.wmnet with OS bullseye
07:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on pc2011.codfw.wmnet,pc1011.eqiad.wmnet with reason: Cloning pc2021 from pc2011
07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Cloning pc2021
07:05 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:05 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Cloning pc2021
07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2144: After reimage
07:04 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:04 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: After reimage
07:03 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2144: after reimage to trixie
07:03 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2144: after reimage to trixie
07:02 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2144.codfw.wmnet with OS trixie
06:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2144.codfw.wmnet with reason: host reimage
06:12 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2144.codfw.wmnet with OS trixie
06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Reimage to Trixie
06:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
06:12 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:12 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Reimage to Trixie
06:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet with reason: Reimage to Trixie
06:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
05:40 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1015.eqiad.wmnet with reason: Clone s6 to clouddb1025
05:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on clouddb1025.eqiad.wmnet with reason: Clone s6
04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.22 (duration: 02m 30s)
02:53 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T410589)', diff saved to https://phabricator.wikimedia.org/P91242 and previous config saved to /var/cache/conftool/dbconfig/20260421-025311-ladsgroup.json
02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91241 and previous config saved to /var/cache/conftool/dbconfig/20260421-025245-ladsgroup.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P91240 and previous config saved to /var/cache/conftool/dbconfig/20260421-024237-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P91239 and previous config saved to /var/cache/conftool/dbconfig/20260421-023228-ladsgroup.json
02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91238 and previous config saved to /var/cache/conftool/dbconfig/20260421-022219-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 03s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:32 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2250.codfw.wmnet with OS bookworm
01:32 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:28 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2251.codfw.wmnet with OS bookworm
01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2252.codfw.wmnet with OS bookworm
01:23 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:20 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2253.codfw.wmnet with OS bookworm
01:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:17 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
01:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
01:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
01:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
01:02 zabe: marked 543 revisions as bad # T393237
00:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2250.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2251.codfw.wmnet with reason: host reimage
00:52 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2252.codfw.wmnet with reason: host reimage
00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2253.codfw.wmnet with reason: host reimage
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2253.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2252.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2251.codfw.wmnet with OS bookworm
00:35 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2250.codfw.wmnet with OS bookworm
00:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:29 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2026-04-20

23:40 jdlrobson@deploy1003: Finished scap sync-world: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538) (duration: 07m 47s)
23:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
23:36 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Continuing with sync
23:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
23:34 jdlrobson@deploy1003: jdlrobson, ignaciorodrguez: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:32 jdlrobson@deploy1003: Started scap sync-world: Backport for Restore PageImages functionality to Wikisources and Wikibooks (T417538)
23:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
23:28 jdlrobson@deploy1003: Finished scap sync-world: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959) (duration: 08m 13s)
23:24 jdlrobson@deploy1003: jdlrobson: Continuing with sync
23:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
23:21 jdlrobson@deploy1003: jdlrobson: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:20 jdlrobson@deploy1003: Started scap sync-world: Backport for [Mobile Page Previews] Avoid syntax error on older browsers (T423959)
23:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
23:16 jdlrobson@deploy1003: Finished scap sync-world: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676) (duration: 05m 56s)
23:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
23:12 jdlrobson@deploy1003: cscott, jdlrobson: Continuing with sync
23:12 jdlrobson@deploy1003: cscott, jdlrobson: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:10 jdlrobson@deploy1003: Started scap sync-world: Backport for Revert "Skin: Avoid stretching low resolution images" (T421524 T423676)
23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
23:07 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
23:06 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
22:53 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
22:52 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
21:59 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1275463 T423311 T423624 (duration: 03m 24s)
21:57 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1275463 T423311 T423624
21:42 maryum: Deployed security fix for T406954
21:33 maryum: Deployed security fix for T299359
20:16 aude@deploy1003: Finished scap sync-world: Backport for Do not show donate button on affiliate wikis (T423876) (duration: 10m 57s)
20:10 aude@deploy1003: aude: Continuing with sync
20:08 aude@deploy1003: aude: Backport for Do not show donate button on affiliate wikis (T423876) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:05 aude@deploy1003: Started scap sync-world: Backport for Do not show donate button on affiliate wikis (T423876)
19:48 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2190: Security update
19:28 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:04 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:00 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2190: Security update
18:58 dancy@deploy1003: Installation of scap version "4.249.0" completed for 2 hosts
18:56 dancy@deploy1003: Installing scap version "4.249.0" for 2 host(s)
18:55 jforrester@deploy1003: Finished scap sync-world: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for tre
18:44 jforrester@deploy1003: pmiazga, jforrester: Continuing with sync
18:42 jforrester@deploy1003: pmiazga, jforrester: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for trending
18:25 jforrester@deploy1003: Started scap sync-world: Backport for Attribution: Clean up API spec descriptions (T422502), [[gerrit:1275476|i18n: Use Template:Doc-markdown template in Attribution qqq.json (T422502)]], Attribution: Documentation copyedits, Attribution: Update contact and add call to action (T422502), [[gerrit:1275478|Attribution: Add localized texts for tren
18:11 Amir1: drop of langlinks table on testcommonswiki (T421914)
18:07 herron@dns1004: END - running authdns-update
18:05 herron@dns1004: START - running authdns-update
17:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be1005.eqiad.wmnet with OS bullseye
17:47 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
17:45 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:45 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:44 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
17:44 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
17:43 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
17:43 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
17:42 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
17:41 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
17:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:37 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:36 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
17:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
17:27 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
17:27 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be1005.eqiad.wmnet with reason: host reimage
17:26 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
17:26 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
17:25 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
17:25 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
17:24 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
17:23 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
17:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
17:22 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
17:18 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
17:18 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
17:18 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-media: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
17:17 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
17:17 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox: apply
17:16 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/shellbox: apply
17:15 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be1005.eqiad.wmnet with OS bullseye
17:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T419635)', diff saved to https://phabricator.wikimedia.org/P91231 and previous config saved to /var/cache/conftool/dbconfig/20260420-165459-fceratto.json
16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
16:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1161.eqiad.wmnet with reason: Maintenance
16:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91230 and previous config saved to /var/cache/conftool/dbconfig/20260420-165423-fceratto.json
16:52 moritzm: installing imagemagick security updates
16:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5006.eqsin.wmnet with OS bookworm
16:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:44 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91229 and previous config saved to /var/cache/conftool/dbconfig/20260420-164415-fceratto.json
16:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es2036: Moving to another rack
16:34 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P91227 and previous config saved to /var/cache/conftool/dbconfig/20260420-163407-fceratto.json
16:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM aux-k8s-etcd1003.eqiad.wmnet
16:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM aux-k8s-etcd1003.eqiad.wmnet
16:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM backupmon1001.eqiad.wmnet
16:27 marostegui@dns1004: END - running authdns-update
16:26 marostegui: Switchover m3 proxy (phabricator)
16:26 marostegui@dns1004: START - running authdns-update
16:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
16:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91226 and previous config saved to /var/cache/conftool/dbconfig/20260420-162359-fceratto.json
16:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: Security update
16:19 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM backupmon1001.eqiad.wmnet
16:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5006.eqsin.wmnet with reason: host reimage
16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:06 bking@cumin2002: conftool action : set/pooled=no; selector: name=cloudelastic1012.eqiad.wmnet
15:57 moritzm: installing libvirt security updates
15:55 sukhe: sudo cumin -b31 "A:cp and not P{cp2041* or cp2042*}" "run-puppet-agent --enable 'merging CR 1272869'"
15:51 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2188.codfw.wmnet
15:50 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2188.codfw.wmnet
15:50 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es2036: Moving to another rack
15:50 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool es2036: Moving to another rack
15:50 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2188.codfw.wmnet
15:50 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2188.codfw.wmnet
15:48 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5006.eqsin.wmnet with OS bookworm
15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:41 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudelastic1012.eqiad.wmnet
15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es2036
15:36 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host es2036
15:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1010.eqiad.wmnet with OS bookworm
15:36 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
15:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:35 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:33 jhancock@cumin2002: START - Cookbook sre.dns.netbox
15:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
15:25 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
15:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1166: Security update
15:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91217 and previous config saved to /var/cache/conftool/dbconfig/20260420-152341-fceratto.json
15:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1159.eqiad.wmnet with reason: Maintenance
15:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1166: Security update
15:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2036.codfw.wmnet with reason: Moved to another rack
15:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool es2036: Moving to another rack
15:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool es2036: Moving to another rack
15:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-worker2188']
15:11 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: repool after maintenance
15:11 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: repool after maintenance
15:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
15:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2006.codfw.wmnet with OS bullseye
15:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1010.eqiad.wmnet with reason: host reimage
15:03 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
15:03 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
14:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:51 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-worker2188']
14:46 cwhite@deploy1003: Finished deploy [performance/arc-lamp@bd7b2ab]: T413127 (duration: 00m 08s)
14:45 cwhite@deploy1003: Started deploy [performance/arc-lamp@bd7b2ab]: T413127
14:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
14:43 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db1151: after reimage to trixie
14:43 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1151: after reimage to trixie
14:42 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS trixie
14:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
14:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host dbstore1010.eqiad.wmnet with OS bookworm
14:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dbstore1010.eqiad.wmnet with OS bookworm
14:36 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2006.codfw.wmnet with reason: host reimage
14:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
14:26 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:21 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudelastic1012.eqiad.wmnet
14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1166 (T419961)', diff saved to https://phabricator.wikimedia.org/P91215 and previous config saved to /var/cache/conftool/dbconfig/20260420-142120-fceratto.json
14:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
14:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91214 and previous config saved to /var/cache/conftool/dbconfig/20260420-142050-fceratto.json
14:19 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudelastic1012.eqiad.wmnet
14:19 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
14:15 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
14:14 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage
14:14 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-be2006.codfw.wmnet with OS bullseye
14:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P91213 and previous config saved to /var/cache/conftool/dbconfig/20260420-141042-fceratto.json
14:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91212 and previous config saved to /var/cache/conftool/dbconfig/20260420-140203-fceratto.json
14:02 urandom: upgrade envoyproxy, restbase — T419637 & T410975
14:00 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS trixie
14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P91211 and previous config saved to /var/cache/conftool/dbconfig/20260420-140033-fceratto.json
14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Reimage to Trixie
14:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:00 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
14:00 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Reimage to Trixie
14:00 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1151.eqiad.wmnet with reason: Reimage to Trixie
14:00 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2144.codfw.wmnet,db1151.eqiad.wmnet with reason: Reimage to Trixie
13:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T410589)', diff saved to https://phabricator.wikimedia.org/P91209 and previous config saved to /var/cache/conftool/dbconfig/20260420-135255-ladsgroup.json
13:52 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
13:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P91208 and previous config saved to /var/cache/conftool/dbconfig/20260420-135155-fceratto.json
13:50 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91207 and previous config saved to /var/cache/conftool/dbconfig/20260420-135025-fceratto.json
13:47 jclark@cumin1003: START - Cookbook sre.dns.netbox
13:45 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dbstore1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:44 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:44 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
13:44 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbstore1010 to eqiad - jclark@cumin1003"
13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419961)', diff saved to https://phabricator.wikimedia.org/P91206 and previous config saved to /var/cache/conftool/dbconfig/20260420-134158-fceratto.json
13:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
13:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P91205 and previous config saved to /var/cache/conftool/dbconfig/20260420-134148-fceratto.json
13:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
13:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:34 Lucas_WMDE: UTC afternoon backport+config window done
13:32 urandom: decommissioning Cassandra, aqs1014 [a,b] — T412830
13:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91204 and previous config saved to /var/cache/conftool/dbconfig/20260420-133139-fceratto.json
13:30 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1014.eqiad.wmnet with reason: Decommissioning — T412830
13:29 phuedx@deploy1003: Finished scap sync-world: Backport for PHP SDK: Split measurement of unknown experiments (T422112) (duration: 07m 51s)
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T419635)', diff saved to https://phabricator.wikimedia.org/P91203 and previous config saved to /var/cache/conftool/dbconfig/20260420-132926-fceratto.json
13:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1253.eqiad.wmnet with reason: Maintenance
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91202 and previous config saved to /var/cache/conftool/dbconfig/20260420-132901-fceratto.json
13:26 phuedx@deploy1003: phuedx: Continuing with sync
13:23 phuedx@deploy1003: phuedx: Backport for PHP SDK: Split measurement of unknown experiments (T422112) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:22 phuedx@deploy1003: Started scap sync-world: Backport for PHP SDK: Split measurement of unknown experiments (T422112)
13:20 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881) (duration: 08m 21s)
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P91200 and previous config saved to /var/cache/conftool/dbconfig/20260420-131853-fceratto.json
13:16 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Continuing with sync
13:15 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude, d3r1ck01: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:12 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Remove unused JWT for bot password temporary config (T422367 T415007), Enable ReadingLists beta feature for all Wikipedia wikis (T420881)
13:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P91199 and previous config saved to /var/cache/conftool/dbconfig/20260420-130845-fceratto.json
12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
12:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91198 and previous config saved to /var/cache/conftool/dbconfig/20260420-125837-fceratto.json
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T419635)', diff saved to https://phabricator.wikimedia.org/P91197 and previous config saved to /var/cache/conftool/dbconfig/20260420-125624-fceratto.json
12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1236.eqiad.wmnet with reason: Maintenance
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91196 and previous config saved to /var/cache/conftool/dbconfig/20260420-125559-fceratto.json
12:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P91195 and previous config saved to /var/cache/conftool/dbconfig/20260420-124550-fceratto.json
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[1001-1002].eqiad.wmnet
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:35 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P91194 and previous config saved to /var/cache/conftool/dbconfig/20260420-123542-fceratto.json
12:31 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[1001-1002].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:28 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[1002-1005,1011-1012,1019-1020,1029-1031,1058-1063,1082-1083,1088-1092,1096,1098-1112,1166-1168].eqiad.wmnet
12:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91193 and previous config saved to /var/cache/conftool/dbconfig/20260420-122534-fceratto.json
12:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T419635)', diff saved to https://phabricator.wikimedia.org/P91192 and previous config saved to /var/cache/conftool/dbconfig/20260420-122321-fceratto.json
12:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1231.eqiad.wmnet with reason: Maintenance
12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91191 and previous config saved to /var/cache/conftool/dbconfig/20260420-122256-fceratto.json
12:17 zabe: Deployed patch for T423821
12:16 moritzm: remove ganeti5006 from eqsin01 Ganeti cluster (running classic Ganeti) T421863
12:15 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
12:15 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1328-1334,1360-1374].eqiad.wmnet
12:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P91190 and previous config saved to /var/cache/conftool/dbconfig/20260420-121247-fceratto.json
12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[1001-1002].eqiad.wmnet
12:10 moritzm: installing edk2 security updates
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P91189 and previous config saved to /var/cache/conftool/dbconfig/20260420-120239-fceratto.json
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91188 and previous config saved to /var/cache/conftool/dbconfig/20260420-115231-fceratto.json
11:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet,service=x4
10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T419635)', diff saved to https://phabricator.wikimedia.org/P91187 and previous config saved to /var/cache/conftool/dbconfig/20260420-105213-fceratto.json
10:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1227.eqiad.wmnet with reason: Maintenance
10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91186 and previous config saved to /var/cache/conftool/dbconfig/20260420-105148-fceratto.json
10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P91185 and previous config saved to /var/cache/conftool/dbconfig/20260420-104141-fceratto.json
10:32 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
10:32 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P91184 and previous config saved to /var/cache/conftool/dbconfig/20260420-103133-fceratto.json
10:26 kamila@deploy1003: Finished scap sync-world: ICU 72 upgrade (duration: 51m 35s)
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast1003.wikimedia.org
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91183 and previous config saved to /var/cache/conftool/dbconfig/20260420-102125-fceratto.json
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T419635)', diff saved to https://phabricator.wikimedia.org/P91182 and previous config saved to /var/cache/conftool/dbconfig/20260420-101913-fceratto.json
10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1202.eqiad.wmnet with reason: Maintenance
10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91181 and previous config saved to /var/cache/conftool/dbconfig/20260420-101847-fceratto.json
10:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast1003.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
10:14 kamila@deploy1003: kamila: Continuing with sync
10:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P91180 and previous config saved to /var/cache/conftool/dbconfig/20260420-100839-fceratto.json
10:07 jmm@cumin2002: START - Cookbook sre.dns.netbox
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T419961)', diff saved to https://phabricator.wikimedia.org/P91179 and previous config saved to /var/cache/conftool/dbconfig/20260420-100423-fceratto.json
10:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91178 and previous config saved to /var/cache/conftool/dbconfig/20260420-100402-fceratto.json
10:02 Emperor: ceph orch host drain moss-be1002 T418901
10:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1165: after reimage to trixie
09:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P91176 and previous config saved to /var/cache/conftool/dbconfig/20260420-095831-fceratto.json
09:58 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast1003.wikimedia.org
09:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P91175 and previous config saved to /var/cache/conftool/dbconfig/20260420-095354-fceratto.json
09:52 kamila@deploy1003: kamila: ICU 72 upgrade synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91174 and previous config saved to /var/cache/conftool/dbconfig/20260420-094823-fceratto.json
09:48 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
09:46 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T419635)', diff saved to https://phabricator.wikimedia.org/P91172 and previous config saved to /var/cache/conftool/dbconfig/20260420-094612-fceratto.json
09:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1194.eqiad.wmnet with reason: Maintenance
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91171 and previous config saved to /var/cache/conftool/dbconfig/20260420-094546-fceratto.json
09:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P91170 and previous config saved to /var/cache/conftool/dbconfig/20260420-094345-fceratto.json
09:43 Emperor: ceph orch host drain moss-be1001 T418901
09:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard1003.eqiad.wmnet
09:36 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard1003.eqiad.wmnet
09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P91169 and previous config saved to /var/cache/conftool/dbconfig/20260420-093538-fceratto.json
09:35 kamila@deploy1003: Started scap sync-world: ICU 72 upgrade
09:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91168 and previous config saved to /var/cache/conftool/dbconfig/20260420-093337-fceratto.json
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM puppetboard2003.codfw.wmnet
09:29 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM puppetboard2003.codfw.wmnet
09:26 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
09:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P91166 and previous config saved to /var/cache/conftool/dbconfig/20260420-092530-fceratto.json
09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2008.wikimedia.org
09:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T419961)', diff saved to https://phabricator.wikimedia.org/P91165 and previous config saved to /var/cache/conftool/dbconfig/20260420-092448-fceratto.json
09:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
09:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91164 and previous config saved to /var/cache/conftool/dbconfig/20260420-092417-fceratto.json
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
09:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
09:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:21 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2008.wikimedia.org
09:19 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:18 klausman@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
09:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1165: after reimage to trixie
09:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91162 and previous config saved to /var/cache/conftool/dbconfig/20260420-091522-fceratto.json
09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P91161 and previous config saved to /var/cache/conftool/dbconfig/20260420-091409-fceratto.json
09:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1165.eqiad.wmnet with OS trixie
09:13 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
09:13 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
09:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T419635)', diff saved to https://phabricator.wikimedia.org/P91160 and previous config saved to /var/cache/conftool/dbconfig/20260420-091310-fceratto.json
09:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1191.eqiad.wmnet with reason: Maintenance
09:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91159 and previous config saved to /var/cache/conftool/dbconfig/20260420-091233-fceratto.json
09:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2007.codfw.wmnet
09:11 trueg@deploy1003: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
09:10 trueg@deploy1003: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
09:08 jmm@cumin2002: START - Cookbook sre.dns.netbox
09:07 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2007.codfw.wmnet
09:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P91158 and previous config saved to /var/cache/conftool/dbconfig/20260420-090401-fceratto.json
09:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P91157 and previous config saved to /var/cache/conftool/dbconfig/20260420-090225-fceratto.json
08:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91156 and previous config saved to /var/cache/conftool/dbconfig/20260420-085349-fceratto.json
08:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P91155 and previous config saved to /var/cache/conftool/dbconfig/20260420-085217-fceratto.json
08:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
08:48 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
08:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91154 and previous config saved to /var/cache/conftool/dbconfig/20260420-084512-fceratto.json
08:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91153 and previous config saved to /var/cache/conftool/dbconfig/20260420-084440-fceratto.json
08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2005.codfw.wmnet
08:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91152 and previous config saved to /var/cache/conftool/dbconfig/20260420-084209-fceratto.json
08:41 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1165.eqiad.wmnet with reason: host reimage
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts atlas5001.wikimedia.org
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:41 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:41 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: atlas5001.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
08:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T419635)', diff saved to https://phabricator.wikimedia.org/P91151 and previous config saved to /var/cache/conftool/dbconfig/20260420-083957-fceratto.json
08:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1174.eqiad.wmnet with reason: Maintenance
08:39 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2005.codfw.wmnet
08:39 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:34 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
08:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P91150 and previous config saved to /var/cache/conftool/dbconfig/20260420-083432-fceratto.json
08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
08:32 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:30 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts atlas5001.wikimedia.org
08:30 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1165.eqiad.wmnet with OS trixie
08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1165: Reimage to Trixie
08:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1165: Reimage to Trixie
08:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1165.eqiad.wmnet with reason: Reimage to Trixie
08:26 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2229: after reimage to trixie
08:26 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
08:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91147 and previous config saved to /var/cache/conftool/dbconfig/20260420-082555-fceratto.json
08:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P91146 and previous config saved to /var/cache/conftool/dbconfig/20260420-082424-fceratto.json
08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db[1155,1165].eqiad.wmnet with reason: Reimage to Trixie
08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast5004.wikimedia.org
08:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
08:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast5004.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
08:19 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin1001.eqiad.wmnet
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P91145 and previous config saved to /var/cache/conftool/dbconfig/20260420-081547-fceratto.json
08:15 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin1001.eqiad.wmnet
08:15 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on wikikube-worker2188.codfw.wmnet with reason: dcops intervention
08:14 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2188.codfw.wmnet
08:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91144 and previous config saved to /var/cache/conftool/dbconfig/20260420-081416-fceratto.json
08:14 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2188.codfw.wmnet
08:13 filippo@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudcumin2001.codfw.wmnet
08:09 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:07 filippo@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudcumin2001.codfw.wmnet
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P91142 and previous config saved to /var/cache/conftool/dbconfig/20260420-080539-fceratto.json
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T419961)', diff saved to https://phabricator.wikimedia.org/P91141 and previous config saved to /var/cache/conftool/dbconfig/20260420-080529-fceratto.json
08:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:04 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast5004.wikimedia.org
08:01 marostegui: Removed categorylinks_icu72 from s3 with a sleep, this will take around 1.5 hours T422546
07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 12389
07:59 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 12389
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91139 and previous config saved to /var/cache/conftool/dbconfig/20260420-075524-fceratto.json
07:51 marostegui: Removed categorylinks_icu72 from s5 T422546
07:41 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2229: after reimage to trixie
07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T419635)', diff saved to https://phabricator.wikimedia.org/P91137 and previous config saved to /var/cache/conftool/dbconfig/20260420-074031-fceratto.json
07:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
07:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91136 and previous config saved to /var/cache/conftool/dbconfig/20260420-074005-fceratto.json
07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2229.codfw.wmnet with OS trixie
07:31 marostegui: Removed categorylinks_icu72 from s7 T422546
07:30 marostegui: Removed categorylinks_icu72 from s2 T422546
07:30 marostegui: Removed categorylinks_icu72 from s12 T422546
07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P91135 and previous config saved to /var/cache/conftool/dbconfig/20260420-072957-fceratto.json
07:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P91134 and previous config saved to /var/cache/conftool/dbconfig/20260420-071949-fceratto.json
07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
07:10 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2229.codfw.wmnet with reason: host reimage
07:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91133 and previous config saved to /var/cache/conftool/dbconfig/20260420-070941-fceratto.json
07:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T419635)', diff saved to https://phabricator.wikimedia.org/P91132 and previous config saved to /var/cache/conftool/dbconfig/20260420-070728-fceratto.json
07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
06:57 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2151: repool after maintenance
06:52 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2229.codfw.wmnet with OS trixie
06:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2229: Reimage to Trixie
06:50 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2229: Reimage to Trixie
06:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2229.codfw.wmnet with reason: Reimage to Trixie
06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2229 T423837', diff saved to https://phabricator.wikimedia.org/P91129 and previous config saved to /var/cache/conftool/dbconfig/20260420-064042-marostegui.json
06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2214 to s6 primary T423837', diff saved to https://phabricator.wikimedia.org/P91128 and previous config saved to /var/cache/conftool/dbconfig/20260420-064006-marostegui.json
06:39 marostegui: Starting s6 codfw failover from db2229 to db2214 - T423837
06:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 21 hosts with reason: Primary switchover s6 T423837
06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2214 with weight 0 T423837', diff saved to https://phabricator.wikimedia.org/P91127 and previous config saved to /var/cache/conftool/dbconfig/20260420-063553-marostegui.json
06:27 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: repool after maintenance
06:26 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2151: after reimage to trixie
06:22 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2151: after reimage to trixie
06:22 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2151.codfw.wmnet with OS trixie
06:06 marostegui: Removed categorylinks_icu72 from s1 and s6 T422546
05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
05:53 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2151.codfw.wmnet with reason: host reimage
05:33 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2151.codfw.wmnet with OS trixie
05:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2151: Reimage to Trixie
05:32 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2151: Reimage to Trixie
05:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2151.codfw.wmnet with reason: Reimage to Trixie
03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
03:19 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
03:05 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
00:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply

2026-04-19

18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:20 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
17:50 zabe@deploy1003: Finished scap sync-world: Backport for Temporarily switch back to file read old schema (T423065) (duration: 33m 41s)
17:36 zabe@deploy1003: zabe: Continuing with sync
17:34 zabe@deploy1003: zabe: Backport for Temporarily switch back to file read old schema (T423065) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
17:16 zabe@deploy1003: Started scap sync-world: Backport for Temporarily switch back to file read old schema (T423065)
16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
16:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:42 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
06:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be1066.eqiad.wmnet with reason: vacuum overlarge container dbs
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 11s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-17

23:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1070.eqiad.wmnet with OS bookworm
23:55 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:31 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
23:26 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1067.eqiad.wmnet with OS bookworm
23:26 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:25 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:24 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1070.eqiad.wmnet with reason: host reimage
23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
23:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
23:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
23:07 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
23:05 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1067.eqiad.wmnet with reason: host reimage
23:00 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:56 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1056.eqiad.wmnet with OS bookworm
22:56 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:56 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1072.eqiad.wmnet with OS bookworm
22:40 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:40 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:39 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2253.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2252.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:38 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2251.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2250.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:35 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
22:33 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1056.eqiad.wmnet with reason: host reimage
22:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
22:23 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1070.eqiad.wmnet with OS bookworm
22:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1072.eqiad.wmnet with reason: host reimage
22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2024.codfw.wmnet with OS trixie
22:19 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:15 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1067.eqiad.wmnet with OS bookworm
22:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:13 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1072.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1071.eqiad.wmnet with OS bookworm
22:09 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:08 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
22:08 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2022.codfw.wmnet with OS trixie
22:02 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2023.codfw.wmnet with OS trixie
22:01 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
22:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:59 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
21:58 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:57 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2021.codfw.wmnet with OS trixie
21:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
21:51 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
21:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
21:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1065.eqiad.wmnet with OS bookworm
21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
21:42 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
21:39 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:37 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2024.codfw.wmnet with reason: host reimage
21:36 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2023.codfw.wmnet with reason: host reimage
21:35 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1069.eqiad.wmnet with OS bookworm
21:35 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:35 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2022.codfw.wmnet with reason: host reimage
21:35 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:34 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1068.eqiad.wmnet with OS bookworm
21:34 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:34 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1071.eqiad.wmnet with reason: host reimage
21:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
21:32 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2021.codfw.wmnet with reason: host reimage
21:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
21:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1056.eqiad.wmnet with OS bookworm
21:23 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1071.eqiad.wmnet with OS bookworm
21:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
21:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2024.codfw.wmnet with OS trixie
21:16 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1063.eqiad.wmnet with OS bookworm
21:16 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2023.codfw.wmnet with OS trixie
21:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2022.codfw.wmnet with OS trixie
21:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host pc2021.codfw.wmnet with OS trixie
21:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['pc2021']
21:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['pc2021']
21:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1066.eqiad.wmnet with OS bookworm
21:13 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:12 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1069.eqiad.wmnet with reason: host reimage
21:12 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:10 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
21:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mc1065.eqiad.wmnet with OS bookworm
21:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1068.eqiad.wmnet with reason: host reimage
21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1070.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1058.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1069.eqiad.wmnet with OS bookworm
21:02 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
21:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1059.eqiad.wmnet with OS bookworm
21:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:59 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
20:56 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1068.eqiad.wmnet with OS bookworm
20:55 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1067.eqiad.wmnet with OS bookworm
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1057.eqiad.wmnet with OS bookworm
20:54 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:53 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1062.eqiad.wmnet with OS bookworm
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
20:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:48 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1066.eqiad.wmnet with reason: host reimage
20:47 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1061.eqiad.wmnet with OS bookworm
20:47 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:46 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
20:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1064.eqiad.wmnet with OS bookworm
20:43 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:43 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:42 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1065.eqiad.wmnet with reason: host reimage
20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
20:40 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1058.eqiad.wmnet with reason: host reimage
20:39 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1063.eqiad.wmnet with reason: host reimage
20:38 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2024.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2023.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1066.eqiad.wmnet with OS bookworm
20:37 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1055.eqiad.wmnet with OS bookworm
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2022.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:37 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host pc2021.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:36 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
20:35 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2024
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2024
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2023
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2023
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2022
20:34 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2022
20:34 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host pc2021
20:33 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host pc2021
20:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
20:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1065.eqiad.wmnet with OS bookworm
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
20:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding pc2021 to codfw - jhancock@cumin2002"
20:29 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1058.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1063.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1060.eqiad.wmnet with OS bookworm
20:28 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:28 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
20:25 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:24 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2253
20:23 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2253
20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2252
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2252
20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2251
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2251
20:22 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2250
20:22 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2250
20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:21 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
20:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2250 to codfw - jhancock@cumin2002"
20:21 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1062.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1061.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1059.eqiad.wmnet with reason: host reimage
20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1064.eqiad.wmnet with reason: host reimage
20:17 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
20:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1057.eqiad.wmnet with reason: host reimage
20:14 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1055.eqiad.wmnet with reason: host reimage
20:13 mutante: planet1003, planet2003 - rebooting on ganeti level for T422596
20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1062.eqiad.wmnet with OS bookworm
20:10 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
20:10 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1061.eqiad.wmnet with OS bookworm
20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1064.eqiad.wmnet with OS bookworm
20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1059.eqiad.wmnet with OS bookworm
20:06 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1060.eqiad.wmnet with reason: host reimage
20:04 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1057.eqiad.wmnet with OS bookworm
20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1056.eqiad.wmnet with OS bookworm
20:03 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1055.eqiad.wmnet with OS bookworm
19:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:50 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host mc1060.eqiad.wmnet with OS bookworm
19:50 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:48 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:41 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:40 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:38 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:37 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1057.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:36 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1060.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1059.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:35 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1061.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1062.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1065.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1063.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1064.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:33 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:33 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:32 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:32 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:25 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1066.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1067.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1068.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:22 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1069.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1070.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:21 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:21 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:21 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:21 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1071.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:18 jclark@cumin1003: START - Cookbook sre.hosts.provision for host mc1072.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
19:17 jclark@cumin1003: START - Cookbook sre.dns.netbox
19:17 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:17 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
19:12 jclark@cumin1003: START - Cookbook sre.dns.netbox
17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91116 and previous config saved to /var/cache/conftool/dbconfig/20260417-172835-fceratto.json
17:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P91114 and previous config saved to /var/cache/conftool/dbconfig/20260417-171827-fceratto.json
17:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P91113 and previous config saved to /var/cache/conftool/dbconfig/20260417-170819-fceratto.json
16:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91112 and previous config saved to /var/cache/conftool/dbconfig/20260417-165811-fceratto.json
16:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P91111 and previous config saved to /var/cache/conftool/dbconfig/20260417-165559-fceratto.json
16:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
16:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91110 and previous config saved to /var/cache/conftool/dbconfig/20260417-165544-fceratto.json
16:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P91108 and previous config saved to /var/cache/conftool/dbconfig/20260417-164536-fceratto.json
16:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P91107 and previous config saved to /var/cache/conftool/dbconfig/20260417-163528-fceratto.json
16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91105 and previous config saved to /var/cache/conftool/dbconfig/20260417-162520-fceratto.json
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P91104 and previous config saved to /var/cache/conftool/dbconfig/20260417-162307-fceratto.json
16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91103 and previous config saved to /var/cache/conftool/dbconfig/20260417-162253-fceratto.json
16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P91102 and previous config saved to /var/cache/conftool/dbconfig/20260417-161245-fceratto.json
16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91101 and previous config saved to /var/cache/conftool/dbconfig/20260417-160418-fceratto.json
16:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P91100 and previous config saved to /var/cache/conftool/dbconfig/20260417-160236-fceratto.json
16:02 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
16:01 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
15:59 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
15:58 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P91099 and previous config saved to /var/cache/conftool/dbconfig/20260417-155410-fceratto.json
15:54 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91098 and previous config saved to /var/cache/conftool/dbconfig/20260417-155228-fceratto.json
15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P91097 and previous config saved to /var/cache/conftool/dbconfig/20260417-155015-fceratto.json
15:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
15:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91096 and previous config saved to /var/cache/conftool/dbconfig/20260417-155001-fceratto.json
15:49 elukey@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P91095 and previous config saved to /var/cache/conftool/dbconfig/20260417-154402-fceratto.json
15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P91094 and previous config saved to /var/cache/conftool/dbconfig/20260417-153953-fceratto.json
15:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91093 and previous config saved to /var/cache/conftool/dbconfig/20260417-153354-fceratto.json
15:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P91092 and previous config saved to /var/cache/conftool/dbconfig/20260417-152944-fceratto.json
15:27 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) (duration: 06m 51s)
15:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T419961)', diff saved to https://phabricator.wikimedia.org/P91091 and previous config saved to /var/cache/conftool/dbconfig/20260417-152620-fceratto.json
15:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1259.eqiad.wmnet with reason: Maintenance
15:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91090 and previous config saved to /var/cache/conftool/dbconfig/20260417-152549-fceratto.json
15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:25 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:23 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Continuing with sync
15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:23 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:22 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, asmartkitten: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:22 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:20 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for enwikinews: Move override for $wgFlaggedRevsHandleIncludes to InitialiseSettings.php (T423512)
15:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91089 and previous config saved to /var/cache/conftool/dbconfig/20260417-151936-fceratto.json
15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:18 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P91088 and previous config saved to /var/cache/conftool/dbconfig/20260417-151723-fceratto.json
15:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
15:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
15:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P91087 and previous config saved to /var/cache/conftool/dbconfig/20260417-151541-fceratto.json
15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P91086 and previous config saved to /var/cache/conftool/dbconfig/20260417-150532-fceratto.json
15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91085 and previous config saved to /var/cache/conftool/dbconfig/20260417-150440-fceratto.json
14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91084 and previous config saved to /var/cache/conftool/dbconfig/20260417-145524-fceratto.json
14:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P91083 and previous config saved to /var/cache/conftool/dbconfig/20260417-145432-fceratto.json
14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T419961)', diff saved to https://phabricator.wikimedia.org/P91082 and previous config saved to /var/cache/conftool/dbconfig/20260417-144819-fceratto.json
14:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1254.eqiad.wmnet with reason: Maintenance
14:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P91081 and previous config saved to /var/cache/conftool/dbconfig/20260417-144424-fceratto.json
14:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1239.eqiad.wmnet with reason: Maintenance
14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91080 and previous config saved to /var/cache/conftool/dbconfig/20260417-144247-fceratto.json
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91079 and previous config saved to /var/cache/conftool/dbconfig/20260417-143416-fceratto.json
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P91078 and previous config saved to /var/cache/conftool/dbconfig/20260417-143238-fceratto.json
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P91077 and previous config saved to /var/cache/conftool/dbconfig/20260417-143204-fceratto.json
14:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91076 and previous config saved to /var/cache/conftool/dbconfig/20260417-143139-fceratto.json
14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P91075 and previous config saved to /var/cache/conftool/dbconfig/20260417-142230-fceratto.json
14:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P91074 and previous config saved to /var/cache/conftool/dbconfig/20260417-142130-fceratto.json
14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91073 and previous config saved to /var/cache/conftool/dbconfig/20260417-141222-fceratto.json
14:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P91072 and previous config saved to /var/cache/conftool/dbconfig/20260417-141123-fceratto.json
14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:09 urandom: decommissioning Cassandra, aqs1011 [a,b] — T412830
14:06 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
14:06 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1011.eqiad.wmnet with reason: Bootstrapping — T412830
14:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3073.*}
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T419961)', diff saved to https://phabricator.wikimedia.org/P91071 and previous config saved to /var/cache/conftool/dbconfig/20260417-140454-fceratto.json
14:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1233.eqiad.wmnet with reason: Maintenance
14:04 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91070 and previous config saved to /var/cache/conftool/dbconfig/20260417-140424-fceratto.json
14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:03 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3072.*}
14:02 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91069 and previous config saved to /var/cache/conftool/dbconfig/20260417-140115-fceratto.json
14:01 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3070.*}
14:00 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
14:00 fabfur: restart varnish on cp3069, cp3070, cp3072, cp3073 to clear alerts
14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P91068 and previous config saved to /var/cache/conftool/dbconfig/20260417-140003-fceratto.json
13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
13:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91067 and previous config saved to /var/cache/conftool/dbconfig/20260417-135938-fceratto.json
13:58 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3069.*}
13:57 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-varnish (exit_code=0) rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
13:54 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-varnish rolling restart of Varnish on 1 hosts matching query P{cp3066.*}
13:54 fabfur: restarting varnish on cp3066 to clear alerts
13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P91066 and previous config saved to /var/cache/conftool/dbconfig/20260417-135416-fceratto.json
13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P91065 and previous config saved to /var/cache/conftool/dbconfig/20260417-134930-fceratto.json
13:44 jmm@dns1004: END - running authdns-update
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P91064 and previous config saved to /var/cache/conftool/dbconfig/20260417-134408-fceratto.json
13:43 jmm@dns1004: START - running authdns-update
13:42 inflatador: bking@apt1002 sudo -E reprepro -C component/opensearch2 include trixie-wikimedia /home/bking/wmf-opensearch-search-plugins-2.19.5+5-trixie/wmf-opensearch-search-plugins_2.19.5+5_amd64.changes
13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P91063 and previous config saved to /var/cache/conftool/dbconfig/20260417-133923-fceratto.json
13:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91062 and previous config saved to /var/cache/conftool/dbconfig/20260417-133359-fceratto.json
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91061 and previous config saved to /var/cache/conftool/dbconfig/20260417-132914-fceratto.json
13:28 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P91060 and previous config saved to /var/cache/conftool/dbconfig/20260417-132802-fceratto.json
13:27 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
13:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91059 and previous config saved to /var/cache/conftool/dbconfig/20260417-132738-fceratto.json
13:27 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
13:26 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
13:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T419961)', diff saved to https://phabricator.wikimedia.org/P91058 and previous config saved to /var/cache/conftool/dbconfig/20260417-132628-fceratto.json
13:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1229.eqiad.wmnet with reason: Maintenance
13:26 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp2001.codfw.wmnet
13:22 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp2001.codfw.wmnet
13:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
13:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91057 and previous config saved to /var/cache/conftool/dbconfig/20260417-132034-fceratto.json
13:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P91056 and previous config saved to /var/cache/conftool/dbconfig/20260417-131730-fceratto.json
13:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P91055 and previous config saved to /var/cache/conftool/dbconfig/20260417-131026-fceratto.json
13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P91054 and previous config saved to /var/cache/conftool/dbconfig/20260417-130722-fceratto.json
13:07 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp1001.eqiad.wmnet
13:00 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp1001.eqiad.wmnet
13:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P91053 and previous config saved to /var/cache/conftool/dbconfig/20260417-130018-fceratto.json
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91052 and previous config saved to /var/cache/conftool/dbconfig/20260417-125714-fceratto.json
12:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P91051 and previous config saved to /var/cache/conftool/dbconfig/20260417-125501-fceratto.json
12:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
12:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91050 and previous config saved to /var/cache/conftool/dbconfig/20260417-125009-fceratto.json
12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T419961)', diff saved to https://phabricator.wikimedia.org/P91049 and previous config saved to /var/cache/conftool/dbconfig/20260417-124149-fceratto.json
12:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1197.eqiad.wmnet with reason: Maintenance
12:41 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
12:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91048 and previous config saved to /var/cache/conftool/dbconfig/20260417-124120-fceratto.json
12:41 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
12:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P91047 and previous config saved to /var/cache/conftool/dbconfig/20260417-123111-fceratto.json
12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P91046 and previous config saved to /var/cache/conftool/dbconfig/20260417-122104-fceratto.json
12:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91045 and previous config saved to /var/cache/conftool/dbconfig/20260417-121056-fceratto.json
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T419961)', diff saved to https://phabricator.wikimedia.org/P91044 and previous config saved to /var/cache/conftool/dbconfig/20260417-120255-fceratto.json
12:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
12:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91043 and previous config saved to /var/cache/conftool/dbconfig/20260417-120226-fceratto.json
11:55 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
11:54 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
11:53 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
11:53 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P91042 and previous config saved to /var/cache/conftool/dbconfig/20260417-115218-fceratto.json
11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P91041 and previous config saved to /var/cache/conftool/dbconfig/20260417-114210-fceratto.json
11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91040 and previous config saved to /var/cache/conftool/dbconfig/20260417-113201-fceratto.json
11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T419961)', diff saved to https://phabricator.wikimedia.org/P91039 and previous config saved to /var/cache/conftool/dbconfig/20260417-112333-fceratto.json
11:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419961)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260417-112259-fceratto.json
11:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P91037 and previous config saved to /var/cache/conftool/dbconfig/20260417-111250-fceratto.json
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2002.codfw.wmnet
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:11 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
11:08 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
11:03 jynus@cumin1003: START - Cookbook sre.dns.netbox
11:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P91036 and previous config saved to /var/cache/conftool/dbconfig/20260417-110242-fceratto.json
10:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2002.codfw.wmnet
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup2001.codfw.wmnet
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:54 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:53 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419961)', diff saved to https://phabricator.wikimedia.org/P91035 and previous config saved to /var/cache/conftool/dbconfig/20260417-105234-fceratto.json
10:48 jynus@cumin1003: START - Cookbook sre.dns.netbox
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T419961)', diff saved to https://phabricator.wikimedia.org/P91034 and previous config saved to /var/cache/conftool/dbconfig/20260417-104327-fceratto.json
10:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:43 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup2001.codfw.wmnet
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91033 and previous config saved to /var/cache/conftool/dbconfig/20260417-104257-fceratto.json
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1002.eqiad.wmnet
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:37 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P91032 and previous config saved to /var/cache/conftool/dbconfig/20260417-103249-fceratto.json
10:31 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P91031 and previous config saved to /var/cache/conftool/dbconfig/20260417-102241-fceratto.json
10:20 jynus@cumin1003: START - Cookbook sre.dns.netbox
10:13 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1002.eqiad.wmnet
10:13 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-backup1001.eqiad.wmnet
10:13 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:12 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91030 and previous config saved to /var/cache/conftool/dbconfig/20260417-101233-fceratto.json
10:11 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-backup1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1003"
10:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T419961)', diff saved to https://phabricator.wikimedia.org/P91029 and previous config saved to /var/cache/conftool/dbconfig/20260417-100401-fceratto.json
10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
10:00 jynus@cumin1003: START - Cookbook sre.dns.netbox
09:55 jynus@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-backup1001.eqiad.wmnet
09:54 marostegui: pool esams
09:53 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: no reason specified, no task ID specified]
09:53 marostegui@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: no reason specified, no task ID specified]
09:44 moritzm: initialise eqsin02 Ganeti cluster T421863
09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f3-codfw
09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f3-codfw
09:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-f1-codfw
09:36 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-f1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device ssw1-e1-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device ssw1-e1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e1-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e1-codfw
09:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e3-codfw
09:35 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e3-codfw
08:51 topranks: depool esams due to connectivity issues
08:51 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: no reason specified, no task ID specified]
08:51 cmooney@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: no reason specified, no task ID specified]
08:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
07:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1201: after reimage to trixie
07:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91025 and previous config saved to /var/cache/conftool/dbconfig/20260417-071048-fceratto.json
07:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P91023 and previous config saved to /var/cache/conftool/dbconfig/20260417-070039-fceratto.json
06:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P91022 and previous config saved to /var/cache/conftool/dbconfig/20260417-065031-fceratto.json
06:49 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2158: repool after maintenance
06:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1201: after reimage to trixie
06:46 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS trixie
06:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91019 and previous config saved to /var/cache/conftool/dbconfig/20260417-064023-fceratto.json
06:24 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
06:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
06:08 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1201.eqiad.wmnet with OS trixie
06:07 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1201: Reimage to Trixie
06:06 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1201: Reimage to Trixie
06:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1201.eqiad.wmnet with reason: Reimage to Trixie
06:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2158: repool after maintenance
06:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS trixie
05:39 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
05:33 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2158.codfw.wmnet with reason: host reimage
05:16 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2158.codfw.wmnet with OS trixie
05:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2158: Reimage to Trixie
05:13 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2158: Reimage to Trixie
05:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2158.codfw.wmnet with reason: Reimage to Trixie
04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1263 (T419635)', diff saved to https://phabricator.wikimedia.org/P91013 and previous config saved to /var/cache/conftool/dbconfig/20260417-044543-fceratto.json
04:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1263.eqiad.wmnet with reason: Maintenance
04:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91012 and previous config saved to /var/cache/conftool/dbconfig/20260417-044518-fceratto.json
04:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P91011 and previous config saved to /var/cache/conftool/dbconfig/20260417-043510-fceratto.json
04:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P91010 and previous config saved to /var/cache/conftool/dbconfig/20260417-042502-fceratto.json
04:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91009 and previous config saved to /var/cache/conftool/dbconfig/20260417-041454-fceratto.json
02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1262 (T419635)', diff saved to https://phabricator.wikimedia.org/P91008 and previous config saved to /var/cache/conftool/dbconfig/20260417-021624-fceratto.json
02:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1262.eqiad.wmnet with reason: Maintenance
02:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91007 and previous config saved to /var/cache/conftool/dbconfig/20260417-021558-fceratto.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 25s)
02:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P91006 and previous config saved to /var/cache/conftool/dbconfig/20260417-020550-fceratto.json
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P91005 and previous config saved to /var/cache/conftool/dbconfig/20260417-015542-fceratto.json
01:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91004 and previous config saved to /var/cache/conftool/dbconfig/20260417-014534-fceratto.json
00:10 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
00:03 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
00:03 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART

2026-04-16

23:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1261 (T419635)', diff saved to https://phabricator.wikimedia.org/P91003 and previous config saved to /var/cache/conftool/dbconfig/20260416-235123-fceratto.json
23:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1261.eqiad.wmnet with reason: Maintenance
23:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P91002 and previous config saved to /var/cache/conftool/dbconfig/20260416-235059-fceratto.json
23:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P91001 and previous config saved to /var/cache/conftool/dbconfig/20260416-234052-fceratto.json
23:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P91000 and previous config saved to /var/cache/conftool/dbconfig/20260416-233044-fceratto.json
23:25 musikanimal@deploy1003: Finished scap sync-world: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673) (duration: 06m 35s)
23:21 musikanimal@deploy1003: musikanimal: Continuing with sync
23:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P90999 and previous config saved to /var/cache/conftool/dbconfig/20260416-232036-fceratto.json
23:20 musikanimal@deploy1003: musikanimal: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:18 musikanimal@deploy1003: Started scap sync-world: Backport for CommonSettings: use CodeMirror instead of CodeEditor in AbuseFilter (T399673)
22:14 James_F: jforrester@deploy1003:/srv/mediawiki-staging$ foreachwikiindblist sul extensions/Wikibase/lib/maintenance/populateSitesTable.php # T423660
22:08 cscott@deploy1003: Finished scap sync-world: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534) (duration: 09m 41s)
22:04 cscott@deploy1003: cscott: Continuing with sync
22:00 cscott@deploy1003: cscott: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534) synced to the testservers (see https://wikitech
21:58 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
21:58 cscott@deploy1003: Started scap sync-world: Backport for ConverterRule: convert `null` to `false` when needed (T423639), Convert language to internal code in tests, ParsoidCachePrewarmJob: Define the title in the req context (T422780), Move language variant parser option setting from Article to WikiPage (T423534)
21:57 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
21:33 cscott@deploy1003: Finished scap sync-world: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114 (duration: 17m 26s)
21:29 cscott@deploy1003: cscott, arlolra, bodhisattwa: Continuing with sync
21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1260 (T419635)', diff saved to https://phabricator.wikimedia.org/P90997 and previous config saved to /var/cache/conftool/dbconfig/20260416-212348-fceratto.json
21:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1260.eqiad.wmnet with reason: Maintenance
21:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90996 and previous config saved to /var/cache/conftool/dbconfig/20260416-212323-fceratto.json
21:17 cscott@deploy1003: cscott, arlolra, bodhisattwa: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:16 cscott@deploy1003: Started scap sync-world: Backport for Deploy PRV to 4 wikis (T423188), [bnwikisource] Enable PageImages on NS:4, NS:100, NS:104, NS:106, NS:114
21:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P90995 and previous config saved to /var/cache/conftool/dbconfig/20260416-211315-fceratto.json
21:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P90994 and previous config saved to /var/cache/conftool/dbconfig/20260416-210307-fceratto.json
20:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90993 and previous config saved to /var/cache/conftool/dbconfig/20260416-205258-fceratto.json
20:51 stran@deploy1003: Finished scap sync-world: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545) (duration: 08m 36s)
20:48 stran@deploy1003: aaron, stran, jforrester: Continuing with sync
20:44 stran@deploy1003: aaron, stran, jforrester: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:43 stran@deploy1003: Started scap sync-world: Backport for Deploy IRS to enwiki's Event Talk namespace (T423042), Make abstractwiki a multi-lingual Wikidata client (T420420), Enable attribution.v0-beta in RestSandboxSpecs for all wikis (T419545)
20:36 stran@deploy1003: Finished scap sync-world: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042) (duration: 09m 07s)
20:33 stran@deploy1003: stran: Continuing with sync
20:29 stran@deploy1003: stran: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:27 stran@deploy1003: Started scap sync-world: Backport for Allow the 'ReportIncidentEnabledNamespaces' config to be ovewritten (T423042)
20:17 maryum: Removed private mitigation for T419137
20:09 mstyles@deploy1003: Finished scap sync-world: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366) (duration: 06m 06s)
20:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90992 and previous config saved to /var/cache/conftool/dbconfig/20260416-200839-fceratto.json
20:05 mstyles@deploy1003: mmartorana, mstyles: Continuing with sync
20:05 mstyles@deploy1003: mmartorana, mstyles: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:03 mstyles@deploy1003: Started scap sync-world: Backport for config: Enable EmailConfirmationBanner on selected wikis (T421366)
19:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P90991 and previous config saved to /var/cache/conftool/dbconfig/20260416-195831-fceratto.json
19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P90990 and previous config saved to /var/cache/conftool/dbconfig/20260416-194823-fceratto.json
19:48 zabe@deploy1003: Finished scap sync-world: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914) (duration: 06m 48s)
19:44 zabe@deploy1003: zabe: Continuing with sync
19:43 zabe@deploy1003: zabe: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
19:41 zabe@deploy1003: Started scap sync-world: Backport for Set $wgGlobalUsageSharedRepoWiki for testcommonswiki (T421914), Also disable updates for GloballyWantedFiles on testcommonswiki (T421914)
19:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90989 and previous config saved to /var/cache/conftool/dbconfig/20260416-193814-fceratto.json
19:36 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
19:34 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
19:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2238 (T419961)', diff saved to https://phabricator.wikimedia.org/P90988 and previous config saved to /var/cache/conftool/dbconfig/20260416-193100-fceratto.json
19:30 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2238.codfw.wmnet with reason: Maintenance
19:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90987 and previous config saved to /var/cache/conftool/dbconfig/20260416-193028-fceratto.json
19:21 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
19:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P90986 and previous config saved to /var/cache/conftool/dbconfig/20260416-192020-fceratto.json
19:19 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
19:16 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
19:15 jasmine@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:14 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:11 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P90985 and previous config saved to /var/cache/conftool/dbconfig/20260416-191012-fceratto.json
19:03 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
19:02 jasmine@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
19:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90984 and previous config saved to /var/cache/conftool/dbconfig/20260416-190004-fceratto.json
18:59 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1252 (T419635)', diff saved to https://phabricator.wikimedia.org/P90983 and previous config saved to /var/cache/conftool/dbconfig/20260416-185757-fceratto.json
18:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1252.eqiad.wmnet with reason: Maintenance
18:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90982 and previous config saved to /var/cache/conftool/dbconfig/20260416-185731-fceratto.json
18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2226 (T419961)', diff saved to https://phabricator.wikimedia.org/P90981 and previous config saved to /var/cache/conftool/dbconfig/20260416-185253-fceratto.json
18:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2226.codfw.wmnet with reason: Maintenance
18:52 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90980 and previous config saved to /var/cache/conftool/dbconfig/20260416-185222-fceratto.json
18:49 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P90979 and previous config saved to /var/cache/conftool/dbconfig/20260416-184723-fceratto.json
18:46 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P90978 and previous config saved to /var/cache/conftool/dbconfig/20260416-184213-fceratto.json
18:42 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:39 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P90977 and previous config saved to /var/cache/conftool/dbconfig/20260416-183715-fceratto.json
18:36 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P90976 and previous config saved to /var/cache/conftool/dbconfig/20260416-183205-fceratto.json
18:32 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.24 refs T420482
18:28 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
18:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90975 and previous config saved to /var/cache/conftool/dbconfig/20260416-182707-fceratto.json
18:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90974 and previous config saved to /var/cache/conftool/dbconfig/20260416-182157-fceratto.json
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2225 (T419961)', diff saved to https://phabricator.wikimedia.org/P90973 and previous config saved to /var/cache/conftool/dbconfig/20260416-181447-fceratto.json
18:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2225.codfw.wmnet with reason: Maintenance
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90972 and previous config saved to /var/cache/conftool/dbconfig/20260416-181415-fceratto.json
18:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P90971 and previous config saved to /var/cache/conftool/dbconfig/20260416-180407-fceratto.json
17:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204', diff saved to https://phabricator.wikimedia.org/P90970 and previous config saved to /var/cache/conftool/dbconfig/20260416-175358-fceratto.json
17:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90969 and previous config saved to /var/cache/conftool/dbconfig/20260416-174350-fceratto.json
17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2204 (T419961)', diff saved to https://phabricator.wikimedia.org/P90968 and previous config saved to /var/cache/conftool/dbconfig/20260416-173640-fceratto.json
17:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2204.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90967 and previous config saved to /var/cache/conftool/dbconfig/20260416-173058-fceratto.json
17:28 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:27 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:26 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:26 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P90966 and previous config saved to /var/cache/conftool/dbconfig/20260416-172050-fceratto.json
17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P90964 and previous config saved to /var/cache/conftool/dbconfig/20260416-171041-fceratto.json
17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90963 and previous config saved to /var/cache/conftool/dbconfig/20260416-170033-fceratto.json
16:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2189 (T419961)', diff saved to https://phabricator.wikimedia.org/P90962 and previous config saved to /var/cache/conftool/dbconfig/20260416-165326-fceratto.json
16:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2189.codfw.wmnet with reason: Maintenance
16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90961 and previous config saved to /var/cache/conftool/dbconfig/20260416-165253-fceratto.json
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P90960 and previous config saved to /var/cache/conftool/dbconfig/20260416-164245-fceratto.json
16:38 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'enable-puppet "cdanis deploy 8ad070a466 T328872"'
16:38 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1249 (T419635)', diff saved to https://phabricator.wikimedia.org/P90959 and previous config saved to /var/cache/conftool/dbconfig/20260416-163800-fceratto.json
16:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1249.eqiad.wmnet with reason: Maintenance
16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90958 and previous config saved to /var/cache/conftool/dbconfig/20260416-163736-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P90957 and previous config saved to /var/cache/conftool/dbconfig/20260416-163237-fceratto.json
16:30 cdanis: 💙cdanis@cumin1003.eqiad.wmnet ~ 🕧☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy 8ad070a466 T328872"'
16:27 urandom: upgrade envoyproxy, restbase[1031,2024] (canary) — T419637 & T410975
16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P90956 and previous config saved to /var/cache/conftool/dbconfig/20260416-162727-fceratto.json
16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90955 and previous config saved to /var/cache/conftool/dbconfig/20260416-162229-fceratto.json
16:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P90953 and previous config saved to /var/cache/conftool/dbconfig/20260416-161719-fceratto.json
16:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2175 (T419961)', diff saved to https://phabricator.wikimedia.org/P90952 and previous config saved to /var/cache/conftool/dbconfig/20260416-161504-fceratto.json
16:14 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
16:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90951 and previous config saved to /var/cache/conftool/dbconfig/20260416-161432-fceratto.json
16:11 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: Bootstrapping — T412830
16:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90950 and previous config saved to /var/cache/conftool/dbconfig/20260416-160710-fceratto.json
16:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P90949 and previous config saved to /var/cache/conftool/dbconfig/20260416-160424-fceratto.json
15:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P90948 and previous config saved to /var/cache/conftool/dbconfig/20260416-155416-fceratto.json
15:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90947 and previous config saved to /var/cache/conftool/dbconfig/20260416-154408-fceratto.json
15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2148 (T419961)', diff saved to https://phabricator.wikimedia.org/P90946 and previous config saved to /var/cache/conftool/dbconfig/20260416-153547-fceratto.json
15:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
15:35 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason: T407726
15:35 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
15:35 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
15:34 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
15:34 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
15:31 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:30 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:29 cdanis: 💔cdanis@cumin1003.eqiad.wmnet ~ 🕦☕ sudo cumin 'A:swift-fe' 'disable-puppet "cdanis deploy I3aaec0ca T328872"'
15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
15:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:14 moritzm: installing sequoia-sqv security updates
15:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:10 daniel@deploy1003: Finished scap sync-world: Backport for API rate limits: add highlimits-user class (T419796) (duration: 10m 47s)
15:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
15:03 daniel@deploy1003: daniel: Continuing with sync
15:01 daniel@deploy1003: daniel: Backport for API rate limits: add highlimits-user class (T419796) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:00 root@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-codfw,mr1-codfw IPv6,mr1-codfw.oob with reason: router upgrade
14:59 daniel@deploy1003: Started scap sync-world: Backport for API rate limits: add highlimits-user class (T419796)
14:58 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr-codfw with reason: router upgrade
14:58 papaul: ongoing maintenance on mr1-codfw
14:56 root@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 1:00:00 on mr1-codfw IPv6,mr1-codfw.oob,mr-codfw with reason: router upgrade
14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:56 jelto@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host gerrit2002.wikimedia.org
14:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
14:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:29 jforrester@deploy1003: Finished scap sync-world: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311) (duration: 09m 36s)
14:25 jforrester@deploy1003: jforrester: Continuing with sync
14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:21 jforrester@deploy1003: jforrester: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
14:20 jforrester@deploy1003: Started scap sync-world: Backport for mc: Use MCROUTER_SERVER values rather than local sidepod for WF cache (T423311)
14:18 mlitn@deploy1003: Finished scap sync-world: Backport for fix: add missing hook registration for create account stats (T422283) (duration: 06m 07s)
14:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90945 and previous config saved to /var/cache/conftool/dbconfig/20260416-141515-fceratto.json
14:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1248.eqiad.wmnet with reason: Maintenance
14:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90944 and previous config saved to /var/cache/conftool/dbconfig/20260416-141450-fceratto.json
14:14 mlitn@deploy1003: mlitn, migr: Continuing with sync
14:14 mlitn@deploy1003: mlitn, migr: Backport for fix: add missing hook registration for create account stats (T422283) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
14:12 mlitn@deploy1003: Started scap sync-world: Backport for fix: add missing hook registration for create account stats (T422283)
14:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS trixie
14:05 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P90943 and previous config saved to /var/cache/conftool/dbconfig/20260416-140442-fceratto.json
14:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:01 mlitn@deploy1003: Finished scap sync-world: Backport for siwikitionary: update logo to localised svg version. (T342173) (duration: 07m 11s)
14:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:58 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
13:57 mlitn@deploy1003: mlitn, robertsky: Continuing with sync
13:56 mlitn@deploy1003: mlitn, robertsky: Backport for siwikitionary: update logo to localised svg version. (T342173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90942 and previous config saved to /var/cache/conftool/dbconfig/20260416-135549-fceratto.json
13:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P90941 and previous config saved to /var/cache/conftool/dbconfig/20260416-135434-fceratto.json
13:54 mlitn@deploy1003: Started scap sync-world: Backport for siwikitionary: update logo to localised svg version. (T342173)
13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
13:51 mlitn@deploy1003: Finished scap sync-world: Backport for Squashed diff to master (duration: 30m 21s)
13:51 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
13:49 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
13:48 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
13:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P90940 and previous config saved to /var/cache/conftool/dbconfig/20260416-134541-fceratto.json
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90939 and previous config saved to /var/cache/conftool/dbconfig/20260416-134426-fceratto.json
13:41 urandom: decommissioning Cassandra [a,b] on aqs1010 — T412830
13:39 mlitn@deploy1003: mlitn: Continuing with sync
13:38 mlitn@deploy1003: mlitn: Backport for Squashed diff to master synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:38 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90938 and previous config saved to /var/cache/conftool/dbconfig/20260416-133600-ladsgroup.json
13:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P90937 and previous config saved to /var/cache/conftool/dbconfig/20260416-133533-fceratto.json
13:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:33 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS trixie
13:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:29 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2004.codfw.wmnet with OS trixie
13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:25 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90936 and previous config saved to /var/cache/conftool/dbconfig/20260416-132551-ladsgroup.json
13:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90935 and previous config saved to /var/cache/conftool/dbconfig/20260416-132525-fceratto.json
13:23 jhathaway@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on krb2002.codfw.wmnet with reason: T407726
13:21 mlitn@deploy1003: Started scap sync-world: Backport for Squashed diff to master
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1210 (T419961)', diff saved to https://phabricator.wikimedia.org/P90934 and previous config saved to /var/cache/conftool/dbconfig/20260416-131836-fceratto.json
13:18 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
13:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90933 and previous config saved to /var/cache/conftool/dbconfig/20260416-131806-fceratto.json
13:17 Lucas_WMDE: correction, namespaceDupes sahwikisource run was for T423374, my bad
13:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1005.eqiad.wmnet
13:17 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes sahwikisource --fix # T423273
13:16 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374) (duration: 10m 59s)
13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:15 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90932 and previous config saved to /var/cache/conftool/dbconfig/20260416-131543-ladsgroup.json
13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host apus-be1005.eqiad.wmnet
13:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:10 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Continuing with sync
13:09 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P90930 and previous config saved to /var/cache/conftool/dbconfig/20260416-130758-fceratto.json
13:05 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90929 and previous config saved to /var/cache/conftool/dbconfig/20260416-130535-ladsgroup.json
13:05 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for etwikiquote: delete unused temporary logo files (T313698), sahwikisource: add Ааптар (author) namespace (T423374)
13:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:58 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P90928 and previous config saved to /var/cache/conftool/dbconfig/20260416-125750-fceratto.json
12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90927 and previous config saved to /var/cache/conftool/dbconfig/20260416-124742-fceratto.json
12:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1207 (T419961)', diff saved to https://phabricator.wikimedia.org/P90926 and previous config saved to /var/cache/conftool/dbconfig/20260416-124032-fceratto.json
12:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1207.eqiad.wmnet with reason: Maintenance
12:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90925 and previous config saved to /var/cache/conftool/dbconfig/20260416-124001-fceratto.json
12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts moss-be[2001-2002].codfw.wmnet
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:30 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P90924 and previous config saved to /var/cache/conftool/dbconfig/20260416-122953-fceratto.json
12:29 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:27 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moss-be[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin2002"
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P90923 and previous config saved to /var/cache/conftool/dbconfig/20260416-121945-fceratto.json
12:19 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts moss-be[2001-2002].codfw.wmnet
12:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90922 and previous config saved to /var/cache/conftool/dbconfig/20260416-120935-fceratto.json
12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
12:09 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1013.eqiad.wmnet
12:09 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1013.eqiad.wmnet
12:03 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1013.eqiad.wmnet
12:02 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2004.codfw.wmnet with OS trixie
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1200 (T419961)', diff saved to https://phabricator.wikimedia.org/P90921 and previous config saved to /var/cache/conftool/dbconfig/20260416-120104-fceratto.json
12:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
12:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90920 and previous config saved to /var/cache/conftool/dbconfig/20260416-120033-fceratto.json
11:56 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:53 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1013.eqiad.wmnet
11:53 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1013.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90919 and previous config saved to /var/cache/conftool/dbconfig/20260416-115055-fceratto.json
11:50 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1247.eqiad.wmnet with reason: Maintenance
11:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P90918 and previous config saved to /var/cache/conftool/dbconfig/20260416-115024-fceratto.json
11:43 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
11:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
11:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P90916 and previous config saved to /var/cache/conftool/dbconfig/20260416-114014-fceratto.json
11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:38 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-serve1012.eqiad.wmnet
11:38 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host ml-serve1012.eqiad.wmnet
11:33 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
11:33 klausman@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-serve1012.eqiad.wmnet
11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
11:32 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
11:31 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90915 and previous config saved to /var/cache/conftool/dbconfig/20260416-113005-fceratto.json
11:30 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
11:29 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
11:28 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
11:23 klausman@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host ml-serve1012.eqiad.wmnet
11:23 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P{ml-serve1012.eqiad.wmnet} and (A:ml-serve-master-eqiad or A:ml-serve-worker-eqiad)
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1185 (T419961)', diff saved to https://phabricator.wikimedia.org/P90914 and previous config saved to /var/cache/conftool/dbconfig/20260416-112136-fceratto.json
11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90913 and previous config saved to /var/cache/conftool/dbconfig/20260416-112105-fceratto.json
11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
11:19 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
11:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P90911 and previous config saved to /var/cache/conftool/dbconfig/20260416-111058-fceratto.json
11:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:07 moritzm: updating debdeploy on bookworm to 0.0.99.15
11:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P90910 and previous config saved to /var/cache/conftool/dbconfig/20260416-110049-fceratto.json
10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
10:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
10:56 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
10:55 moritzm: imported debdeploy 0.0.99.15 for bookworm-wikimedia (compat release for Cumin 6)
10:52 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2006.codfw.wmnet
10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90909 and previous config saved to /var/cache/conftool/dbconfig/20260416-105040-fceratto.json
10:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2005.codfw.wmnet
10:47 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: wikikube-ctrl2004.codfw.wmnet
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1161 (T419961)', diff saved to https://phabricator.wikimedia.org/P90908 and previous config saved to /var/cache/conftool/dbconfig/20260416-104240-fceratto.json
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
10:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90907 and previous config saved to /var/cache/conftool/dbconfig/20260416-104201-fceratto.json
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P90906 and previous config saved to /var/cache/conftool/dbconfig/20260416-103152-fceratto.json
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P90905 and previous config saved to /var/cache/conftool/dbconfig/20260416-102143-fceratto.json
10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:15 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1245.eqiad.wmnet with reason: Maintenance
10:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90904 and previous config saved to /var/cache/conftool/dbconfig/20260416-101514-fceratto.json
10:15 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
10:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90903 and previous config saved to /var/cache/conftool/dbconfig/20260416-101135-fceratto.json
10:09 jynus: backup1014 returns from maintenance, backups and recovery can flow as usual T421719
10:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P90902 and previous config saved to /var/cache/conftool/dbconfig/20260416-100505-fceratto.json
09:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P90901 and previous config saved to /var/cache/conftool/dbconfig/20260416-095455-fceratto.json
09:54 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-eqiad
09:52 moritzm: installing qemu security updates
09:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1014
09:47 jynus@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1014
09:45 jynus@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1014
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:45 jynus@cumin1003: START - Cookbook sre.dns.wipe-cache backup1014.eqiad.wmnet 20.48.64.10.in-addr.arpa 0.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:45 jynus@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
09:45 jynus@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1014 - jynus@cumin1003"
09:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90900 and previous config saved to /var/cache/conftool/dbconfig/20260416-094436-fceratto.json
09:44 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-eqiad
09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2042.codfw.wmnet
09:42 jmm@cumin2002: DONE (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: cp2041.codfw.wmnet
09:41 jynus@cumin1003: START - Cookbook sre.dns.netbox
09:40 jynus@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1014
09:37 moritzm: installing imagemagick security updates
09:31 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
09:29 jynus: setting backup1014 in maintenance, no backup or recovery will run while it is in maintenance T421719
09:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:24 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
09:20 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
09:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:18 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
09:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2169: repool after maintenance
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1007
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1007
09:15 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1007
09:15 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:15 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1007.eqiad.wmnet 88.48.64.10.in-addr.arpa 8.8.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
09:14 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1007 - ayounsi@cumin1003"
09:13 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
09:13 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1159 (T419961)', diff saved to https://phabricator.wikimedia.org/P90898 and previous config saved to /var/cache/conftool/dbconfig/20260416-091115-fceratto.json
09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance
09:11 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
09:10 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
09:03 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.move-vlan (exit_code=99) for host backup1007
09:03 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1007
08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
08:56 jelto@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
08:55 jelto@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
08:51 jmm@dns1004: END - running authdns-update
08:50 jmm@dns1004: START - running authdns-update
08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90895 and previous config saved to /var/cache/conftool/dbconfig/20260416-084331-fceratto.json
08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1007,1014].eqiad.wmnet with reason: maintenance
08:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P90894 and previous config saved to /var/cache/conftool/dbconfig/20260416-083323-fceratto.json
08:31 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2169: repool after maintenance
08:27 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS trixie
08:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P90892 and previous config saved to /var/cache/conftool/dbconfig/20260416-082314-fceratto.json
08:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90891 and previous config saved to /var/cache/conftool/dbconfig/20260416-081305-fceratto.json
08:05 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1201 (T419961)', diff saved to https://phabricator.wikimedia.org/P90890 and previous config saved to /var/cache/conftool/dbconfig/20260416-080445-fceratto.json
08:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
08:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90889 and previous config saved to /var/cache/conftool/dbconfig/20260416-080420-fceratto.json
08:00 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1244 (T419635)', diff saved to https://phabricator.wikimedia.org/P90888 and previous config saved to /var/cache/conftool/dbconfig/20260416-075522-fceratto.json
07:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1244.eqiad.wmnet with reason: Maintenance
07:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90887 and previous config saved to /var/cache/conftool/dbconfig/20260416-075457-fceratto.json
07:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P90886 and previous config saved to /var/cache/conftool/dbconfig/20260416-075410-fceratto.json
07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-eqiad
07:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1009.eqiad.wmnet
07:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1009.eqiad.wmnet
07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P90885 and previous config saved to /var/cache/conftool/dbconfig/20260416-074448-fceratto.json
07:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P90884 and previous config saved to /var/cache/conftool/dbconfig/20260416-074402-fceratto.json
07:41 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS trixie
07:40 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2169: Reimage to Trixie
07:40 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2169: Reimage to Trixie
07:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2169.codfw.wmnet with reason: Reimage to Trixie
07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1009.eqiad.wmnet
07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1009.eqiad.wmnet
07:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1008.eqiad.wmnet
07:39 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1008.eqiad.wmnet
07:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P90882 and previous config saved to /var/cache/conftool/dbconfig/20260416-073440-fceratto.json
07:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90881 and previous config saved to /var/cache/conftool/dbconfig/20260416-073354-fceratto.json
07:33 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1008.eqiad.wmnet
07:33 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1008.eqiad.wmnet
07:32 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1007.eqiad.wmnet
07:32 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1007.eqiad.wmnet
07:27 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1007.eqiad.wmnet
07:27 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1007.eqiad.wmnet
07:26 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker1006.eqiad.wmnet
07:26 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker1006.eqiad.wmnet
07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1165 (T419961)', diff saved to https://phabricator.wikimedia.org/P90880 and previous config saved to /var/cache/conftool/dbconfig/20260416-072650-fceratto.json
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb1015.eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
07:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90879 and previous config saved to /var/cache/conftool/dbconfig/20260416-072432-fceratto.json
07:23 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2193: after reimage to trixie
07:21 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker1006.eqiad.wmnet
07:16 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker1006.eqiad.wmnet
06:59 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-eqiad
06:55 moritzm: imported opensearch-madvise 0.2+deb13u1 to component/opensearch2 of trixie-wikimedia T422860
06:40 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
06:40 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
06:38 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2193: after reimage to trixie
06:37 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2193.codfw.wmnet with OS trixie
06:14 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
06:08 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2193.codfw.wmnet with reason: host reimage
05:49 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2193.codfw.wmnet with OS trixie
05:48 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2193: Reimage to Trixie
05:47 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2193: Reimage to Trixie
05:47 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2193.codfw.wmnet with reason: Reimage to Trixie
05:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1243 (T419635)', diff saved to https://phabricator.wikimedia.org/P90873 and previous config saved to /var/cache/conftool/dbconfig/20260416-053659-fceratto.json
05:36 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1243.eqiad.wmnet with reason: Maintenance
05:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90872 and previous config saved to /var/cache/conftool/dbconfig/20260416-053635-fceratto.json
05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts clouddb1019.eqiad.wmnet
05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
05:31 marostegui@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
05:30 marostegui@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb1019.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1003"
05:27 marostegui@cumin1003: START - Cookbook sre.dns.netbox
05:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P90871 and previous config saved to /var/cache/conftool/dbconfig/20260416-052626-fceratto.json
05:22 marostegui@cumin1003: START - Cookbook sre.hosts.decommission for hosts clouddb1019.eqiad.wmnet
05:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P90870 and previous config saved to /var/cache/conftool/dbconfig/20260416-051618-fceratto.json
05:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90869 and previous config saved to /var/cache/conftool/dbconfig/20260416-050609-fceratto.json
03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1242 (T419635)', diff saved to https://phabricator.wikimedia.org/P90868 and previous config saved to /var/cache/conftool/dbconfig/20260416-031934-fceratto.json
03:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1242.eqiad.wmnet with reason: Maintenance
03:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90867 and previous config saved to /var/cache/conftool/dbconfig/20260416-031910-fceratto.json
03:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P90866 and previous config saved to /var/cache/conftool/dbconfig/20260416-030902-fceratto.json
02:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P90865 and previous config saved to /var/cache/conftool/dbconfig/20260416-025853-fceratto.json
02:53 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
02:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90864 and previous config saved to /var/cache/conftool/dbconfig/20260416-025247-ladsgroup.json
02:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90863 and previous config saved to /var/cache/conftool/dbconfig/20260416-024845-fceratto.json
02:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P90862 and previous config saved to /var/cache/conftool/dbconfig/20260416-024239-ladsgroup.json
02:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P90861 and previous config saved to /var/cache/conftool/dbconfig/20260416-023231-ladsgroup.json
02:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90860 and previous config saved to /var/cache/conftool/dbconfig/20260416-022223-ladsgroup.json
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 16s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2222 (T410589)', diff saved to https://phabricator.wikimedia.org/P90859 and previous config saved to /var/cache/conftool/dbconfig/20260416-012755-ladsgroup.json
01:27 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2222.codfw.wmnet with reason: Maintenance
01:27 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90858 and previous config saved to /var/cache/conftool/dbconfig/20260416-012730-ladsgroup.json
01:17 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90857 and previous config saved to /var/cache/conftool/dbconfig/20260416-011722-ladsgroup.json
01:07 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90856 and previous config saved to /var/cache/conftool/dbconfig/20260416-010714-ladsgroup.json
01:02 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1241 (T419635)', diff saved to https://phabricator.wikimedia.org/P90855 and previous config saved to /var/cache/conftool/dbconfig/20260416-010218-fceratto.json
01:02 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1241.eqiad.wmnet with reason: Maintenance
01:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90854 and previous config saved to /var/cache/conftool/dbconfig/20260416-010154-fceratto.json
00:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90853 and previous config saved to /var/cache/conftool/dbconfig/20260416-005706-ladsgroup.json
00:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P90852 and previous config saved to /var/cache/conftool/dbconfig/20260416-005146-fceratto.json
00:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P90851 and previous config saved to /var/cache/conftool/dbconfig/20260416-004138-fceratto.json
00:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90850 and previous config saved to /var/cache/conftool/dbconfig/20260416-003130-fceratto.json

2026-04-15

23:35 cscott@deploy1003: Finished scap sync-world: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102) (duration: 32m 47s)
23:23 cscott@deploy1003: cscott: Continuing with sync
23:20 cscott@deploy1003: cscott: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:05 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
23:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
23:03 cscott@deploy1003: Started scap sync-world: Backport for Exclude parser functions from SpecialLintTemplateErrors (T420102)
22:57 cscott@deploy1003: Finished scap sync-world: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435) (duration: 16m 00s)
22:53 cscott@deploy1003: cscott: Continuing with sync
22:43 cscott@deploy1003: cscott: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1238 (T419635)', diff saved to https://phabricator.wikimedia.org/P90849 and previous config saved to /var/cache/conftool/dbconfig/20260415-224305-fceratto.json
22:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1238.eqiad.wmnet with reason: Maintenance
22:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90848 and previous config saved to /var/cache/conftool/dbconfig/20260415-224241-fceratto.json
22:41 cscott@deploy1003: Started scap sync-world: Backport for Pass preferred LanguageConverter variant explicitly instead of implicitly (T415435), Make variant into a parser option for parsoid language conversion (T415435)
22:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P90847 and previous config saved to /var/cache/conftool/dbconfig/20260415-223233-fceratto.json
22:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P90846 and previous config saved to /var/cache/conftool/dbconfig/20260415-222225-fceratto.json
22:15 jforrester@deploy1003: Finished scap sync-world: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515) (duration: 08m 48s)
22:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90845 and previous config saved to /var/cache/conftool/dbconfig/20260415-221216-fceratto.json
22:11 jforrester@deploy1003: jforrester: Continuing with sync
22:08 jforrester@deploy1003: jforrester: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:06 jforrester@deploy1003: Started scap sync-world: Backport for PageRenderingHandler: Don't run repo-mode lang check in non-repo world either (T423515)
21:29 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1027.eqiad.wmnet
21:29 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1027.eqiad.wmnet
21:14 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
21:13 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
21:13 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
21:13 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs T420482
21:13 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
21:12 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:12 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:07 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
21:06 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
21:06 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
21:06 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
21:06 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
21:05 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
21:05 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
21:05 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:05 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:04 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
21:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:04 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
21:03 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
21:03 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
21:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
20:58 jforrester@deploy1003: Finished scap sync-world: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515) (duration: 06m 08s)
20:54 jforrester@deploy1003: jforrester: Continuing with sync
20:54 jforrester@deploy1003: jforrester: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:52 jforrester@deploy1003: Started scap sync-world: Backport for PageRenderingHandler: Handle Wikibase's OutOfBoundsException for "we don't have a label" (T423514), PageRenderingHandler: Don't run repo-mode stuff in non-repo world (T423515)
20:46 jforrester@deploy1003: Finished scap sync-world: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707) (duration: 09m 15s)
20:42 jforrester@deploy1003: jforrester, bawolff, pppery: Continuing with sync
20:42 topranks: enable BGP over GRE between cr1-drmrs and cr2-eqiad
20:38 jforrester@deploy1003: jforrester, bawolff, pppery: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:37 jforrester@deploy1003: Started scap sync-world: Backport for Drop 1.5x logos (T246054), Enwikinews: disable lingering FlaggedRevs template processing (T423512), Record file usage from TemplateStyles pages (T413707)
20:36 cmooney@dns2005: END - running authdns-update
20:35 cmooney@dns2005: START - running authdns-update
20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:34 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
20:33 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: generate v6 reverse records for 2a02:ec80:600:fe0a::1/64 - cmooney@cumin1003"
20:31 mstyles@deploy1003: Finished scap sync-world: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007) (duration: 07m 48s)
20:30 cmooney@cumin1003: START - Cookbook sre.dns.netbox
20:27 mstyles@deploy1003: mstyles: Continuing with sync
20:26 topranks: enable ospf on GRE cr1-drmrs <-> cr2-eqiad
20:25 mstyles@deploy1003: mstyles: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:23 mstyles@deploy1003: Started scap sync-world: Backport for Force Reauth (T419621), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007), Rename Test Kitchen Experiment (T420007)
20:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
20:19 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
20:17 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90844 and previous config saved to /var/cache/conftool/dbconfig/20260415-201700-fceratto.json
20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1024-1025].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
20:16 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1221.eqiad.wmnet with reason: Maintenance
20:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90843 and previous config saved to /var/cache/conftool/dbconfig/20260415-201613-fceratto.json
20:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P90842 and previous config saved to /var/cache/conftool/dbconfig/20260415-200605-fceratto.json
20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:05 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
20:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add ipv4 dns names for eqiad-drmrs gre tunnel - cmooney@cumin1003"
19:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P90841 and previous config saved to /var/cache/conftool/dbconfig/20260415-195556-fceratto.json
19:55 cmooney@cumin1003: START - Cookbook sre.dns.netbox
19:55 topranks: add static routes on cr1-drmrs and cr2-eqiad for arelion GRE far-side IPv4 addresses
19:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90840 and previous config saved to /var/cache/conftool/dbconfig/20260415-194548-fceratto.json
19:38 topranks: add GRE tunnel to cr2-eqiad towards cr1-drmrs
19:37 topranks: add GRE tunnel to cr1-drmrs towards cr2-eqiad
18:50 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
18:43 dduvall: rolling back due to steady `Term with languageCode "en" not found` errors (cc T420482)
18:27 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1027.eqiad.wmnet with reason: Bootstrapping — T412830
18:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90839 and previous config saved to /var/cache/conftool/dbconfig/20260415-181833-fceratto.json
18:15 dduvall@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.24 refs T420482
18:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1115.*
18:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P90837 and previous config saved to /var/cache/conftool/dbconfig/20260415-180825-fceratto.json
18:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1115.eqiad.wmnet with OS trixie
18:01 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
17:58 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
17:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P90836 and previous config saved to /var/cache/conftool/dbconfig/20260415-175817-fceratto.json
17:58 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
17:57 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
17:57 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
17:57 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
17:56 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
17:55 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
17:55 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
17:54 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
17:54 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
17:53 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
17:53 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90835 and previous config saved to /var/cache/conftool/dbconfig/20260415-174808-fceratto.json
17:48 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
17:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/page-analytics: apply
17:47 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
17:46 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
17:45 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
17:45 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
17:44 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/device-analytics: apply
17:43 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
17:43 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/data-gateway: apply
17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1253 (T410589)', diff saved to https://phabricator.wikimedia.org/P90834 and previous config saved to /var/cache/conftool/dbconfig/20260415-174236-ladsgroup.json
17:42 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1253.eqiad.wmnet with reason: Maintenance
17:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90833 and previous config saved to /var/cache/conftool/dbconfig/20260415-174212-ladsgroup.json
17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2228 (T419961)', diff saved to https://phabricator.wikimedia.org/P90832 and previous config saved to /var/cache/conftool/dbconfig/20260415-174107-fceratto.json
17:40 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2228.codfw.wmnet with reason: Maintenance
17:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
17:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90831 and previous config saved to /var/cache/conftool/dbconfig/20260415-174035-fceratto.json
17:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
17:39 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
17:38 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:38 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
17:37 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
17:36 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
17:36 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
17:36 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1199 (T419635)', diff saved to https://phabricator.wikimedia.org/P90830 and previous config saved to /var/cache/conftool/dbconfig/20260415-173602-fceratto.json
17:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1199.eqiad.wmnet with reason: Maintenance
17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90829 and previous config saved to /var/cache/conftool/dbconfig/20260415-173525-fceratto.json
17:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
17:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P90828 and previous config saved to /var/cache/conftool/dbconfig/20260415-173203-ladsgroup.json
17:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P90827 and previous config saved to /var/cache/conftool/dbconfig/20260415-173027-fceratto.json
17:29 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1115.eqiad.wmnet with reason: host reimage
17:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P90826 and previous config saved to /var/cache/conftool/dbconfig/20260415-172517-fceratto.json
17:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P90825 and previous config saved to /var/cache/conftool/dbconfig/20260415-172155-ladsgroup.json
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P90824 and previous config saved to /var/cache/conftool/dbconfig/20260415-172019-fceratto.json
17:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P90823 and previous config saved to /var/cache/conftool/dbconfig/20260415-171509-fceratto.json
17:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
17:11 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90822 and previous config saved to /var/cache/conftool/dbconfig/20260415-171147-ladsgroup.json
17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90821 and previous config saved to /var/cache/conftool/dbconfig/20260415-171011-fceratto.json
17:09 kamila@deploy1003: Finished scap sync-world: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546) (duration: 16m 10s)
17:05 kamila@deploy1003: kamila: Continuing with sync
17:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90820 and previous config saved to /var/cache/conftool/dbconfig/20260415-170501-fceratto.json
17:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2223 (T419961)', diff saved to https://phabricator.wikimedia.org/P90819 and previous config saved to /var/cache/conftool/dbconfig/20260415-170310-fceratto.json
17:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2223.codfw.wmnet with reason: Maintenance
17:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90818 and previous config saved to /var/cache/conftool/dbconfig/20260415-170239-fceratto.json
16:55 kamila@deploy1003: kamila: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:53 kamila@deploy1003: Started scap sync-world: Backport for Revert "Temporarily add shellbox-icu to $wgShellboxUrls" (T422546), Revert "Enable $wgTempCategoryCollations for testwiki." (T422546)
16:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P90817 and previous config saved to /var/cache/conftool/dbconfig/20260415-165231-fceratto.json
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P90816 and previous config saved to /var/cache/conftool/dbconfig/20260415-164223-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90815 and previous config saved to /var/cache/conftool/dbconfig/20260415-163215-fceratto.json
16:25 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2211 (T419961)', diff saved to https://phabricator.wikimedia.org/P90814 and previous config saved to /var/cache/conftool/dbconfig/20260415-162513-fceratto.json
16:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2211.codfw.wmnet with reason: Maintenance
16:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2201.codfw.wmnet with reason: Maintenance
16:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90813 and previous config saved to /var/cache/conftool/dbconfig/20260415-161936-fceratto.json
16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P90810 and previous config saved to /var/cache/conftool/dbconfig/20260415-160928-fceratto.json
15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P90809 and previous config saved to /var/cache/conftool/dbconfig/20260415-155920-fceratto.json
15:56 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for backup1012.eqiad.wmnet
15:56 jynus@cumin1003: START - Cookbook sre.hosts.remove-downtime for backup1012.eqiad.wmnet
15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90807 and previous config saved to /var/cache/conftool/dbconfig/20260415-154911-fceratto.json
15:43 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171) (duration: 06m 09s)
15:42 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2192 (T419961)', diff saved to https://phabricator.wikimedia.org/P90806 and previous config saved to /var/cache/conftool/dbconfig/20260415-154210-fceratto.json
15:42 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
15:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90805 and previous config saved to /var/cache/conftool/dbconfig/20260415-154138-fceratto.json
15:39 blake@deploy1003: blake: Continuing with sync
15:39 blake@deploy1003: blake: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:37 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: re-add poolcounter1007.eqiad. (T420171)
15:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1007.eqiad.wmnet
15:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P90804 and previous config saved to /var/cache/conftool/dbconfig/20260415-153130-fceratto.json
15:31 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1007.eqiad.wmnet
15:31 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171) (duration: 06m 19s)
15:30 Emperor: update & restart envoy on ms swift frontends T410975 T419637
15:30 Emperor: update & restart envoy on thanos frontends T410975 T419637
15:27 blake@deploy1003: blake: Continuing with sync
15:26 blake@deploy1003: blake: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:26 Emperor: update & restart envoy on apus frontends T410975 T419637
15:24 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-master-codfw
15:24 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: remove poolcounter1007.eqiad, add 1006 (T420171)
15:24 Emperor: update & restart envoy on apus frontends T423065 T382824
15:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1006.eqiad.wmnet
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P90803 and previous config saved to /var/cache/conftool/dbconfig/20260415-152122-fceratto.json
15:19 moritzm: installing Dovecot security updates on mx-out*
15:18 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter1006.eqiad.wmnet
15:18 blake@deploy1003: Finished scap sync-world: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171) (duration: 06m 59s)
15:14 blake@deploy1003: blake: Continuing with sync
15:14 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-master-codfw
15:13 blake@deploy1003: blake: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:12 moritzm: installing inetutils security updates
15:11 blake@deploy1003: Started scap sync-world: Backport for ProductionServices: remove poolcounter1006.eqiad (T420171)
15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90802 and previous config saved to /var/cache/conftool/dbconfig/20260415-151114-fceratto.json
15:08 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
15:06 samtar@deploy1003: Finished scap sync-world: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" (duration: 06m 54s)
15:04 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2178 (T419961)', diff saved to https://phabricator.wikimedia.org/P90801 and previous config saved to /var/cache/conftool/dbconfig/20260415-150415-fceratto.json
15:04 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90800 and previous config saved to /var/cache/conftool/dbconfig/20260415-150344-fceratto.json
15:02 samtar@deploy1003: samtar: Continuing with sync
15:02 samtar@deploy1003: samtar: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:59 samtar@deploy1003: Started scap sync-world: Backport for Revert^2 "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
14:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90799 and previous config saved to /var/cache/conftool/dbconfig/20260415-145918-fceratto.json
14:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1190.eqiad.wmnet with reason: Maintenance
14:57 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
14:57 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
14:57 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
14:56 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
14:56 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
14:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup1012.eqiad.wmnet with reason: maintenance
14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P90798 and previous config saved to /var/cache/conftool/dbconfig/20260415-145335-fceratto.json
14:53 samtar@deploy1003: Finished scap sync-world: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" (duration: 06m 12s)
14:52 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1273.eqiad.wmnet
14:51 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1273.eqiad.wmnet
14:49 samtar@deploy1003: samtar: Continuing with sync
14:49 samtar@deploy1003: samtar: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
14:47 samtar@deploy1003: Started scap sync-world: Backport for Revert "lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount"
14:46 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2280.codfw.wmnet
14:45 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
14:43 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
14:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P90797 and previous config saved to /var/cache/conftool/dbconfig/20260415-144327-fceratto.json
14:42 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
14:42 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
14:42 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
14:41 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:40 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:39 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
14:36 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
14:36 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90796 and previous config saved to /var/cache/conftool/dbconfig/20260415-143319-fceratto.json
14:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2171 (T419961)', diff saved to https://phabricator.wikimedia.org/P90795 and previous config saved to /var/cache/conftool/dbconfig/20260415-142615-fceratto.json
14:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
14:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90794 and previous config saved to /var/cache/conftool/dbconfig/20260415-142543-fceratto.json
14:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P90792 and previous config saved to /var/cache/conftool/dbconfig/20260415-141535-fceratto.json
14:06 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:06 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P90790 and previous config saved to /var/cache/conftool/dbconfig/20260415-140527-fceratto.json
14:04 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:04 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:aux-worker-codfw
13:56 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2009.codfw.wmnet
13:56 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2009.codfw.wmnet
13:55 samtar@deploy1003: samtar, codenamenoreste: Continuing with sync
13:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90789 and previous config saved to /var/cache/conftool/dbconfig/20260415-135519-fceratto.json
13:53 samtar@deploy1003: samtar, codenamenoreste: Backport for lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:51 samtar@deploy1003: Started scap sync-world: Backport for lbwiki: Set minimum requirement of 10 edits for wgAutoConfirmCount (T423102)
13:51 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2009.codfw.wmnet
13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2009.codfw.wmnet
13:50 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2008.codfw.wmnet
13:50 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2008.codfw.wmnet
13:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T419961)', diff saved to https://phabricator.wikimedia.org/P90788 and previous config saved to /var/cache/conftool/dbconfig/20260415-134704-fceratto.json
13:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
13:45 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2008.codfw.wmnet
13:45 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2008.codfw.wmnet
13:44 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2007.codfw.wmnet
13:44 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2007.codfw.wmnet
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
13:39 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2007.codfw.wmnet
13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2007.codfw.wmnet
13:34 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host aux-k8s-worker2006.codfw.wmnet
13:34 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host aux-k8s-worker2006.codfw.wmnet
13:29 jmm@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host aux-k8s-worker2006.codfw.wmnet
13:28 jmm@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host aux-k8s-worker2006.codfw.wmnet
13:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:21 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1026.eqiad.wmnet
13:21 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1026.eqiad.wmnet
13:19 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:18 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
13:17 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
13:17 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2197.codfw.wmnet with reason: Maintenance
13:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90787 and previous config saved to /var/cache/conftool/dbconfig/20260415-131657-fceratto.json
13:16 kartik@deploy1003: Finished scap sync-world: Backport for Register ArticleGuidance extension and enable in labs (T423295) (duration: 12m 02s)
13:12 kartik@deploy1003: sbisson, kartik: Continuing with sync
13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2221 (T410589)', diff saved to https://phabricator.wikimedia.org/P90786 and previous config saved to /var/cache/conftool/dbconfig/20260415-130849-ladsgroup.json
13:08 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2221.codfw.wmnet with reason: Maintenance
13:08 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90785 and previous config saved to /var/cache/conftool/dbconfig/20260415-130836-ladsgroup.json
13:08 jmm@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:aux-worker-codfw
13:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:07 fceratto@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P90784 and previous config saved to /var/cache/conftool/dbconfig/20260415-130649-fceratto.json
13:06 kartik@deploy1003: sbisson, kartik: Backport for Register ArticleGuidance extension and enable in labs (T423295) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:04 kartik@deploy1003: Started scap sync-world: Backport for Register ArticleGuidance extension and enable in labs (T423295)
13:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
12:58 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90783 and previous config saved to /var/cache/conftool/dbconfig/20260415-125828-ladsgroup.json
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P90782 and previous config saved to /var/cache/conftool/dbconfig/20260415-125640-fceratto.json
12:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90781 and previous config saved to /var/cache/conftool/dbconfig/20260415-124819-ladsgroup.json
12:48 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90780 and previous config saved to /var/cache/conftool/dbconfig/20260415-124633-fceratto.json
12:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:45 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:44 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:44 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:43 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:43 dreamyjazz@deploy1003: Finished scap sync-world: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294) (duration: 08m 11s)
12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:41 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:41 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:40 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:40 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2193 (T419961)', diff saved to https://phabricator.wikimedia.org/P90779 and previous config saved to /var/cache/conftool/dbconfig/20260415-123937-fceratto.json
12:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
12:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90778 and previous config saved to /var/cache/conftool/dbconfig/20260415-123915-fceratto.json
12:39 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
12:38 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:38 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90777 and previous config saved to /var/cache/conftool/dbconfig/20260415-123811-ladsgroup.json
12:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90776 and previous config saved to /var/cache/conftool/dbconfig/20260415-123803-fceratto.json
12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:37 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:36 dreamyjazz@deploy1003: dreamyjazz: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:36 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:34 dreamyjazz@deploy1003: Started scap sync-world: Backport for VisualEditor hCaptcha: Clear challenge container for new render (T423294)
12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:34 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:31 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:30 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:30 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:29 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P90775 and previous config saved to /var/cache/conftool/dbconfig/20260415-122907-fceratto.json
12:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
12:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P90774 and previous config saved to /var/cache/conftool/dbconfig/20260415-122756-fceratto.json
12:27 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:27 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:26 kart_: Updated cxserver to 2026-04-14-071531-production
12:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:25 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
12:25 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:25 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
12:25 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
12:23 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:22 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
12:22 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
12:22 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
12:22 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
12:21 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
12:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P90773 and previous config saved to /var/cache/conftool/dbconfig/20260415-121859-fceratto.json
12:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P90772 and previous config saved to /var/cache/conftool/dbconfig/20260415-121748-fceratto.json
12:11 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:11 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90771 and previous config saved to /var/cache/conftool/dbconfig/20260415-120851-fceratto.json
12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90770 and previous config saved to /var/cache/conftool/dbconfig/20260415-120739-fceratto.json
12:04 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1226 (T419635)', diff saved to https://phabricator.wikimedia.org/P90769 and previous config saved to /var/cache/conftool/dbconfig/20260415-120331-fceratto.json
12:03 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1226.eqiad.wmnet with reason: Maintenance
12:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90768 and previous config saved to /var/cache/conftool/dbconfig/20260415-120305-fceratto.json
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2169 (T419961)', diff saved to https://phabricator.wikimedia.org/P90767 and previous config saved to /var/cache/conftool/dbconfig/20260415-120138-fceratto.json
12:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90766 and previous config saved to /var/cache/conftool/dbconfig/20260415-120117-fceratto.json
12:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS trixie
11:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P90765 and previous config saved to /var/cache/conftool/dbconfig/20260415-115257-fceratto.json
11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P90764 and previous config saved to /var/cache/conftool/dbconfig/20260415-115109-fceratto.json
11:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P90762 and previous config saved to /var/cache/conftool/dbconfig/20260415-114249-fceratto.json
11:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P90761 and previous config saved to /var/cache/conftool/dbconfig/20260415-114101-fceratto.json
11:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90758 and previous config saved to /var/cache/conftool/dbconfig/20260415-113241-fceratto.json
11:31 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90757 and previous config saved to /var/cache/conftool/dbconfig/20260415-113053-fceratto.json
11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1214 (T419635)', diff saved to https://phabricator.wikimedia.org/P90756 and previous config saved to /var/cache/conftool/dbconfig/20260415-112937-fceratto.json
11:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1214.eqiad.wmnet with reason: Maintenance
11:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90755 and previous config saved to /var/cache/conftool/dbconfig/20260415-112913-fceratto.json
11:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2158 (T419961)', diff saved to https://phabricator.wikimedia.org/P90754 and previous config saved to /var/cache/conftool/dbconfig/20260415-112445-fceratto.json
11:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90753 and previous config saved to /var/cache/conftool/dbconfig/20260415-112413-fceratto.json
11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P90752 and previous config saved to /var/cache/conftool/dbconfig/20260415-111905-fceratto.json
11:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P90751 and previous config saved to /var/cache/conftool/dbconfig/20260415-111405-fceratto.json
11:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:08 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P90750 and previous config saved to /var/cache/conftool/dbconfig/20260415-110856-fceratto.json
11:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
11:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P90749 and previous config saved to /var/cache/conftool/dbconfig/20260415-110357-fceratto.json
11:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:59 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
10:58 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90748 and previous config saved to /var/cache/conftool/dbconfig/20260415-105848-fceratto.json
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90747 and previous config saved to /var/cache/conftool/dbconfig/20260415-105349-fceratto.json
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90746 and previous config saved to /var/cache/conftool/dbconfig/20260415-105338-fceratto.json
10:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
10:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90745 and previous config saved to /var/cache/conftool/dbconfig/20260415-105314-fceratto.json
10:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2151 (T419961)', diff saved to https://phabricator.wikimedia.org/P90744 and previous config saved to /var/cache/conftool/dbconfig/20260415-104535-fceratto.json
10:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
10:44 taavi@dns1004: END - running authdns-update
10:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P90743 and previous config saved to /var/cache/conftool/dbconfig/20260415-104306-fceratto.json
10:42 taavi@dns1004: START - running authdns-update
10:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS trixie
10:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 365 days, 0:00:00 on dborch1001.wikimedia.org with reason: T416582
10:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P90742 and previous config saved to /var/cache/conftool/dbconfig/20260415-103258-fceratto.json
10:29 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2069
10:29 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2069.codfw.wmnet with OS bullseye
10:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90741 and previous config saved to /var/cache/conftool/dbconfig/20260415-102250-fceratto.json
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1193 (T419635)', diff saved to https://phabricator.wikimedia.org/P90740 and previous config saved to /var/cache/conftool/dbconfig/20260415-101942-fceratto.json
10:19 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
10:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90739 and previous config saved to /var/cache/conftool/dbconfig/20260415-101917-fceratto.json
10:10 jayme@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker2280.codfw.wmnet
10:10 jayme@cumin2002: START - Cookbook sre.hosts.remove-downtime for wikikube-worker2280.codfw.wmnet
10:10 elukey: upgrade spicerack on cumin nodes
10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P90738 and previous config saved to /var/cache/conftool/dbconfig/20260415-100908-fceratto.json
10:08 elukey: uploaded spicerack_12.4.0 to apt.wikimedia.org bookworm-wikimedia
10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P90737 and previous config saved to /var/cache/conftool/dbconfig/20260415-095901-fceratto.json
09:58 jayme@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wikikube-worker2280.codfw.wmnet with reason: hardware issues
09:56 jayme@cumin2002: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) depool for host wikikube-worker2280.codfw.wmnet
09:53 jayme@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2280.codfw.wmnet
09:53 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
09:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1236 (T410589)', diff saved to https://phabricator.wikimedia.org/P90736 and previous config saved to /var/cache/conftool/dbconfig/20260415-094902-ladsgroup.json
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90735 and previous config saved to /var/cache/conftool/dbconfig/20260415-094852-fceratto.json
09:48 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
09:48 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90734 and previous config saved to /var/cache/conftool/dbconfig/20260415-094831-ladsgroup.json
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1192 (T419635)', diff saved to https://phabricator.wikimedia.org/P90733 and previous config saved to /var/cache/conftool/dbconfig/20260415-094544-fceratto.json
09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
09:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90732 and previous config saved to /var/cache/conftool/dbconfig/20260415-094519-fceratto.json
09:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:38 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P90731 and previous config saved to /var/cache/conftool/dbconfig/20260415-093823-ladsgroup.json
09:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
09:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P90730 and previous config saved to /var/cache/conftool/dbconfig/20260415-093511-fceratto.json
09:28 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P90729 and previous config saved to /var/cache/conftool/dbconfig/20260415-092815-ladsgroup.json
09:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P90728 and previous config saved to /var/cache/conftool/dbconfig/20260415-092502-fceratto.json
09:18 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90727 and previous config saved to /var/cache/conftool/dbconfig/20260415-091807-ladsgroup.json
09:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90726 and previous config saved to /var/cache/conftool/dbconfig/20260415-091454-fceratto.json
09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1178 (T419635)', diff saved to https://phabricator.wikimedia.org/P90725 and previous config saved to /var/cache/conftool/dbconfig/20260415-090945-fceratto.json
09:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
09:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90724 and previous config saved to /var/cache/conftool/dbconfig/20260415-090920-fceratto.json
08:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P90723 and previous config saved to /var/cache/conftool/dbconfig/20260415-085912-fceratto.json
08:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P90722 and previous config saved to /var/cache/conftool/dbconfig/20260415-084904-fceratto.json
08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:43 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90721 and previous config saved to /var/cache/conftool/dbconfig/20260415-083857-fceratto.json
08:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90720 and previous config saved to /var/cache/conftool/dbconfig/20260415-083547-fceratto.json
08:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
08:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90719 and previous config saved to /var/cache/conftool/dbconfig/20260415-083522-fceratto.json
08:34 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2069
08:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P90718 and previous config saved to /var/cache/conftool/dbconfig/20260415-082514-fceratto.json
08:15 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P90717 and previous config saved to /var/cache/conftool/dbconfig/20260415-081506-fceratto.json
08:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90716 and previous config saved to /var/cache/conftool/dbconfig/20260415-080458-fceratto.json
08:01 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90715 and previous config saved to /var/cache/conftool/dbconfig/20260415-080150-fceratto.json
08:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1172.eqiad.wmnet with reason: Maintenance
08:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
08:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90714 and previous config saved to /var/cache/conftool/dbconfig/20260415-075959-fceratto.json
07:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P90713 and previous config saved to /var/cache/conftool/dbconfig/20260415-074951-fceratto.json
07:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P90712 and previous config saved to /var/cache/conftool/dbconfig/20260415-073942-fceratto.json
07:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
07:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90711 and previous config saved to /var/cache/conftool/dbconfig/20260415-072935-fceratto.json
07:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90710 and previous config saved to /var/cache/conftool/dbconfig/20260415-072626-fceratto.json
07:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:25 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
07:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
07:23 Emperor: discard /srv/log/swift/server.log.5.gz on thanos-be2006 to free disk space
07:17 Emperor: discard /srv/log/swift/server.log.1 on thanos-be2006 to free disk space
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 14s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1231 (T410589)', diff saved to https://phabricator.wikimedia.org/P90709 and previous config saved to /var/cache/conftool/dbconfig/20260415-015138-ladsgroup.json
01:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
01:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90708 and previous config saved to /var/cache/conftool/dbconfig/20260415-015113-ladsgroup.json
01:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P90707 and previous config saved to /var/cache/conftool/dbconfig/20260415-014104-ladsgroup.json
01:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P90706 and previous config saved to /var/cache/conftool/dbconfig/20260415-013056-ladsgroup.json
01:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90705 and previous config saved to /var/cache/conftool/dbconfig/20260415-012048-ladsgroup.json
01:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2218 (T410589)', diff saved to https://phabricator.wikimedia.org/P90704 and previous config saved to /var/cache/conftool/dbconfig/20260415-010004-ladsgroup.json
00:59 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2218.codfw.wmnet with reason: Maintenance
00:59 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90703 and previous config saved to /var/cache/conftool/dbconfig/20260415-005940-ladsgroup.json
00:49 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90702 and previous config saved to /var/cache/conftool/dbconfig/20260415-004932-ladsgroup.json
00:39 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90701 and previous config saved to /var/cache/conftool/dbconfig/20260415-003923-ladsgroup.json
00:29 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90700 and previous config saved to /var/cache/conftool/dbconfig/20260415-002915-ladsgroup.json
00:16 ladsgroup@deploy1003: Finished scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) (duration: 06m 41s)
00:13 ladsgroup@deploy1003: ladsgroup: Continuing with sync
00:12 ladsgroup@deploy1003: ladsgroup: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:10 ladsgroup@deploy1003: Started scap sync-world: Backport for Api: Remove deprecation warning for missing rvslots (T412637), Api: Remove deprecation warning for missing rvslots (T412637)

2026-04-14

23:11 Amir1: optimizing globalblocks table on s7 (T423349)
22:44 jasmine@dns1004: END - running authdns-update
22:43 jasmine@dns1004: START - running authdns-update
21:12 bvibber@deploy1003: Finished scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) (duration: 09m 48s)
21:08 bvibber@deploy1003: bvibber: Continuing with sync
21:04 bvibber@deploy1003: bvibber: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:02 bvibber@deploy1003: Started scap sync-world: Backport for Enable ReaderExperiments for itwiki, plwiki (T423173)
20:57 catrope@deploy1003: Finished scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) (duration: 07m 28s)
20:53 catrope@deploy1003: catrope, robertsky: Continuing with sync
20:51 catrope@deploy1003: catrope, robertsky: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:49 catrope@deploy1003: Started scap sync-world: Backport for Update wikimaniawiki namespace search (T423278), Enforce 2FA requirements for phase 1 groups (T423118)
20:40 cscott@deploy1003: Finished scap sync-world: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) (duration: 10m 18s)
20:36 cscott@deploy1003: cscott: Continuing with sync
20:32 cscott@deploy1003: cscott: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:30 cscott@deploy1003: Started scap sync-world: Backport for ParsoidLanguageConverter: convert inside <indicator> (T422961), LanguageConverter: Allow disabling top-level variant "guess" (T419328)
20:16 mstyles@deploy1003: Finished scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) (duration: 09m 25s)
20:12 mstyles@deploy1003: mstyles: Continuing with sync
20:09 mstyles@deploy1003: mstyles: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:07 mstyles@deploy1003: Started scap sync-world: Backport for Route email confirmation funnel through Test Kitchen experiment (T420007)
19:30 swfrench-wmf: applied external-services network policy updates for cassandra-analytics-query-service-storage-[ab]-eqiad (aqs1026) and dumps-wikimedia in wikikube clusters
19:27 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
19:27 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
19:24 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
19:23 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
19:22 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
19:21 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
19:20 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
19:19 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
19:16 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1026.eqiad.wmnet with reason: Bootstrapping — T412830
18:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90699 and previous config saved to /var/cache/conftool/dbconfig/20260414-184440-fceratto.json
18:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P90698 and previous config saved to /var/cache/conftool/dbconfig/20260414-183432-fceratto.json
18:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P90697 and previous config saved to /var/cache/conftool/dbconfig/20260414-182424-fceratto.json
18:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90696 and previous config saved to /var/cache/conftool/dbconfig/20260414-181416-fceratto.json
18:11 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.24 refs T420482
18:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
18:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for src: Fix typos (duration: 07m 13s)
17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2216 (T419635)', diff saved to https://phabricator.wikimedia.org/P90695 and previous config saved to /var/cache/conftool/dbconfig/20260414-175927-fceratto.json
17:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2216.codfw.wmnet with reason: Maintenance
17:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90694 and previous config saved to /var/cache/conftool/dbconfig/20260414-175902-fceratto.json
17:58 ladsgroup@deploy1003: ladsgroup: Backport for src: Fix typos synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
17:56 ladsgroup@deploy1003: Started scap sync-world: Backport for src: Fix typos
17:56 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
17:51 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2068.codfw.wmnet with OS bullseye
17:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P90693 and previous config saved to /var/cache/conftool/dbconfig/20260414-174854-fceratto.json
17:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P90692 and previous config saved to /var/cache/conftool/dbconfig/20260414-173846-fceratto.json
17:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90691 and previous config saved to /var/cache/conftool/dbconfig/20260414-172838-fceratto.json
17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqiad - 9.2.13 Upgrade ()
17:17 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqiad - 9.2.13 Upgrade ()
17:12 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2203 (T419635)', diff saved to https://phabricator.wikimedia.org/P90690 and previous config saved to /var/cache/conftool/dbconfig/20260414-171246-fceratto.json
17:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2203.codfw.wmnet with reason: Maintenance
17:07 taavi: updating caprica hostlists on cloud-hosts-in cr firewall policies
17:00 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2202.codfw.wmnet with reason: Maintenance
17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90689 and previous config saved to /var/cache/conftool/dbconfig/20260414-170010-fceratto.json
16:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P90688 and previous config saved to /var/cache/conftool/dbconfig/20260414-165001-fceratto.json
16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2001.codfw.wmnet with reason: T421398
16:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: T421398
16:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P90687 and previous config saved to /var/cache/conftool/dbconfig/20260414-163953-fceratto.json
16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: apply
16:35 daniel@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: apply
16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
16:34 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
16:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90686 and previous config saved to /var/cache/conftool/dbconfig/20260414-162945-fceratto.json
16:20 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info (duration: 08m 27s)
16:16 jforrester@deploy1003: jforrester: Continuing with sync
16:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
16:13 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90685 and previous config saved to /var/cache/conftool/dbconfig/20260414-161351-fceratto.json
16:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2188.codfw.wmnet with reason: Maintenance
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90684 and previous config saved to /var/cache/conftool/dbconfig/20260414-161326-fceratto.json
16:12 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Reduce WikiLambda* sub-channels logging from debug to info
16:10 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
16:08 jforrester@deploy1003: Finished scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks (duration: 06m 32s)
16:04 jforrester@deploy1003: jforrester: Continuing with sync
16:03 jforrester@deploy1003: jforrester: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P90683 and previous config saved to /var/cache/conftool/dbconfig/20260414-160319-fceratto.json
16:01 jforrester@deploy1003: Started scap sync-world: Backport for wmgMonologChannels: Add WikiLambda* sub-channels, all at debug for some quick checks
15:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P90682 and previous config saved to /var/cache/conftool/dbconfig/20260414-155310-fceratto.json
15:52 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
15:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
15:45 cdanis@deploy1003: Finished scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client (duration: 08m 24s)
15:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90681 and previous config saved to /var/cache/conftool/dbconfig/20260414-154302-fceratto.json
15:41 cdanis@deploy1003: cdanis: Continuing with sync
15:38 cdanis@deploy1003: cdanis: Backport for SwiftFileBackend: propagate tracing context to HTTP client synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:37 cdanis@deploy1003: Started scap sync-world: Backport for SwiftFileBackend: propagate tracing context to HTTP client
15:33 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
15:32 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
15:26 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
15:24 jasmine@dns1004: END - running authdns-update
15:24 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
15:23 jasmine@dns1004: START - running authdns-update
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2176 (T419635)', diff saved to https://phabricator.wikimedia.org/P90680 and previous config saved to /var/cache/conftool/dbconfig/20260414-152156-fceratto.json
15:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
15:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90679 and previous config saved to /var/cache/conftool/dbconfig/20260414-152132-fceratto.json
15:18 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
15:18 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
15:17 daniel@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: apply
15:15 daniel@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: apply
15:13 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P90678 and previous config saved to /var/cache/conftool/dbconfig/20260414-151123-fceratto.json
15:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P90677 and previous config saved to /var/cache/conftool/dbconfig/20260414-150115-fceratto.json
14:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
14:56 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
14:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90676 and previous config saved to /var/cache/conftool/dbconfig/20260414-145107-fceratto.json
14:50 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:49 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:44 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
14:39 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2214: after reimage to trixie
14:36 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
14:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:33 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2174 (T419635)', diff saved to https://phabricator.wikimedia.org/P90673 and previous config saved to /var/cache/conftool/dbconfig/20260414-143301-fceratto.json
14:32 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90672 and previous config saved to /var/cache/conftool/dbconfig/20260414-143235-fceratto.json
14:26 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:24 mvernon@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ms-be2068.codfw.wmnet
14:22 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P90670 and previous config saved to /var/cache/conftool/dbconfig/20260414-142227-fceratto.json
14:18 sukhe@dns1004: END - running authdns-update
14:17 sukhe@dns1004: START - running authdns-update
14:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
14:16 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
14:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance finished, T416450]
14:16 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance finished, T416450]
14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 13 hosts
14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 13 hosts
14:14 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 12 hosts
14:14 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 12 hosts
14:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts
14:13 ayounsi@cumin1003: START - Cookbook sre.hosts.remove-downtime for 8 hosts
14:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P90669 and previous config saved to /var/cache/conftool/dbconfig/20260414-141219-fceratto.json
14:02 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90667 and previous config saved to /var/cache/conftool/dbconfig/20260414-140211-fceratto.json
13:57 XioNoX: asw1-by27-esams> request system reboot - T416450
13:56 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 8 hosts with reason: router upgrade
13:55 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-by27-esams,asw1-by27-esams IPv6,asw1-by27-esams.mgmt with reason: router upgrade
13:55 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055 (third attempt)
13:54 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 12 hosts with reason: router upgrade
13:54 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2214: after reimage to trixie
13:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2214.codfw.wmnet with OS trixie
13:47 Lucas_WMDE: UTC afternoon backport+config window done
13:45 stran@deploy1003: Finished scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) (duration: 07m 05s)
13:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2173 (T419635)', diff saved to https://phabricator.wikimedia.org/P90665 and previous config saved to /var/cache/conftool/dbconfig/20260414-134416-fceratto.json
13:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
13:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90664 and previous config saved to /var/cache/conftool/dbconfig/20260414-134350-fceratto.json
13:41 stran@deploy1003: stran: Continuing with sync
13:40 stran@deploy1003: stran: Backport for Update webonyx/graphql-php to 15.31.5 (T423216) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:38 stran@deploy1003: Started scap sync-world: Backport for Update webonyx/graphql-php to 15.31.5 (T423216)
13:36 XioNoX: asw1-bw27-esams> request system reboot - T416450
13:35 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw1-bw27-esams,asw1-bw27-esams IPv6,asw1-bw27-esams.mgmt with reason: router upgrade
13:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:33 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P90663 and previous config saved to /var/cache/conftool/dbconfig/20260414-133342-fceratto.json
13:33 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: router upgrade
13:31 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
13:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
13:26 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
13:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P90662 and previous config saved to /var/cache/conftool/dbconfig/20260414-132334-fceratto.json
13:23 Amir1: on testcommonswiki drop table if exists categorylinks; drop table if exists externallinks; drop table if exists linktarget; drop table if exists collation; drop table if exists imagelinks; drop table if exists iwlinks; drop table if exists existencelinks; (T421914)
13:19 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) (duration: 09m 27s)
13:16 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Continuing with sync
13:15 XioNoX: cr2-esams - request vmhost reboot - T416450
13:14 elukey: disable cert-renewal on wikikube staging clusters as a test for the PKI discovery intermediate rollout - To rollback, revert: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/1270873 - T420993
13:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90661 and previous config saved to /var/cache/conftool/dbconfig/20260414-131326-fceratto.json
13:13 elukey@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
13:12 elukey@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
13:12 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
13:12 elukey@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
13:12 elukey@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
13:12 dreamyjazz@deploy1003: daimona, stang, dreamyjazz: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:12 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be2068
13:12 mvernon@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS bullseye
13:10 dreamyjazz@deploy1003: Started scap sync-world: Backport for Stop setting $wgCampaignEventsEnableEventGoals (T414150), Revert "zhwiki: Temporary Logo Change for WP25" (T414299), Enable VisualEditor hCaptcha on testwiki (T423252)
13:09 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr2-esams,cr2-esams IPv6,cr2-esams.mgmt with reason: router upgrade
13:06 jmm@dns1004: END - running authdns-update
13:06 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2214.codfw.wmnet with OS trixie
13:05 jmm@dns1004: START - running authdns-update
13:03 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2214: Reimage to Trixie
13:03 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2214: Reimage to Trixie
13:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2214.codfw.wmnet with reason: Reimage to Trixie
13:01 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
12:59 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2170 (T419635)', diff saved to https://phabricator.wikimedia.org/P90659 and previous config saved to /var/cache/conftool/dbconfig/20260414-125642-fceratto.json
12:56 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
12:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90658 and previous config saved to /var/cache/conftool/dbconfig/20260414-125628-fceratto.json
12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P90657 and previous config saved to /var/cache/conftool/dbconfig/20260414-124620-fceratto.json
12:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P90656 and previous config saved to /var/cache/conftool/dbconfig/20260414-123611-fceratto.json
12:35 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
12:35 dbrant@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
12:34 dbrant@deploy1003: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
12:33 XioNoX: cr1-esams - request chassis routing-engine master switch - T416450
12:33 dbrant@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
12:32 dbrant@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
12:28 dbrant@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security updates
12:28 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
12:28 root@cumin1003: START - Cookbook sre.mysql.parsercache
12:28 root@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security updates
12:27 dbrant@deploy1003: helmfile [staging] START helmfile.d/services/wikifeeds: apply
12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90654 and previous config saved to /var/cache/conftool/dbconfig/20260414-122603-fceratto.json
12:22 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance continue, T416450]
12:22 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance continue, T416450]
12:17 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance paused, T416450]
12:17 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance paused, T416450]
12:14 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
12:08 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2153 (T419635)', diff saved to https://phabricator.wikimedia.org/P90653 and previous config saved to /var/cache/conftool/dbconfig/20260414-120812-fceratto.json
12:08 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
12:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90652 and previous config saved to /var/cache/conftool/dbconfig/20260414-120747-fceratto.json
12:03 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
12:02 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
12:02 root@cumin1003: START - Cookbook sre.mysql.parsercache
12:02 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
12:02 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
12:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90650 and previous config saved to /var/cache/conftool/dbconfig/20260414-120200-ladsgroup.json
12:01 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
12:01 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
11:57 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90649 and previous config saved to /var/cache/conftool/dbconfig/20260414-115752-ladsgroup.json
11:57 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
11:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90648 and previous config saved to /var/cache/conftool/dbconfig/20260414-115739-fceratto.json
11:57 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
11:55 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
11:54 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
11:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90647 and previous config saved to /var/cache/conftool/dbconfig/20260414-114732-fceratto.json
11:47 ayounsi@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
11:46 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance, T416450]
11:46 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance, T416450]
11:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90646 and previous config saved to /var/cache/conftool/dbconfig/20260414-113721-fceratto.json
11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90645 and previous config saved to /var/cache/conftool/dbconfig/20260414-113510-fceratto.json
11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90644 and previous config saved to /var/cache/conftool/dbconfig/20260414-113456-fceratto.json
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90643 and previous config saved to /var/cache/conftool/dbconfig/20260414-112448-fceratto.json
11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1153: Security updates
11:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
11:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
11:24 root@cumin1003: START - Cookbook sre.mysql.pool pool db1153: Security updates
11:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90641 and previous config saved to /var/cache/conftool/dbconfig/20260414-111440-fceratto.json
11:04 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90640 and previous config saved to /var/cache/conftool/dbconfig/20260414-110432-fceratto.json
11:00 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
10:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90639 and previous config saved to /var/cache/conftool/dbconfig/20260414-105920-fceratto.json
10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
10:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1153: Security updates
10:56 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:56 root@cumin1003: START - Cookbook sre.mysql.parsercache
10:56 root@cumin1003: START - Cookbook sre.mysql.depool depool db1153: Security updates
10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1151: Security update
10:55 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:54 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
10:54 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1151: Security update
10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin1003.eqiad.wmnet
10:24 volans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on sretest1005.eqiad.wmnet with reason: Testing cumin v6.0.0
10:23 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:23 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:21 volans: install cumin v6.0.0 on cumin1003 (last host remaining to be upgraded)
10:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1003.eqiad.wmnet
10:16 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:14 fceratto@cumin2002: dbctl commit (dc=all): 'Pool in', diff saved to https://phabricator.wikimedia.org/P90636 and previous config saved to /var/cache/conftool/dbconfig/20260414-101428-fceratto.json
10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Fully repool db1168', diff saved to https://phabricator.wikimedia.org/P90635 and previous config saved to /var/cache/conftool/dbconfig/20260414-101119-marostegui.json
10:10 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:10 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db1168: after reimage to trixie
10:09 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
10:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90634 and previous config saved to /var/cache/conftool/dbconfig/20260414-100942-fceratto.json
10:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:07 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1168: after reimage to trixie
10:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1168.eqiad.wmnet with OS trixie
10:04 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:04 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:03 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:02 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P90632 and previous config saved to /var/cache/conftool/dbconfig/20260414-095934-fceratto.json
09:56 elukey: rotated debmonitor client and server certs fleetwide for intermediate certs rotation - T420993
09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90631 and previous config saved to /var/cache/conftool/dbconfig/20260414-094926-fceratto.json
09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:46 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:46 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:45 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2144.codfw.wmnet with reason: T419961
09:38 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1151.eqiad.wmnet with reason: T419961
09:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: Test depool
09:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:32 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: Test depool
09:32 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2146 (T419635)', diff saved to https://phabricator.wikimedia.org/P90627 and previous config saved to /var/cache/conftool/dbconfig/20260414-093204-fceratto.json
09:31 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2011: Test depool
09:31 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
09:31 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2011: Test depool
09:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90625 and previous config saved to /var/cache/conftool/dbconfig/20260414-093138-fceratto.json
09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2144: Test depool
09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:29 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:29 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db2144: Test depool
09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1151: Security updates
09:27 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:27 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:27 root@cumin1003: START - Cookbook sre.mysql.depool depool db1151: Security updates
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host backup1006
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host backup1006
09:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host backup1006
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache backup1006.eqiad.wmnet 162.32.64.10.in-addr.arpa 2.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
09:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host backup1006 - ayounsi@cumin1003"
09:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90623 and previous config saved to /var/cache/conftool/dbconfig/20260414-092130-fceratto.json
09:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
09:17 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host backup1006
09:12 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup[1006-1007,1014].eqiad.wmnet with reason: maintenance
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P90621 and previous config saved to /var/cache/conftool/dbconfig/20260414-091122-fceratto.json
09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90620 and previous config saved to /var/cache/conftool/dbconfig/20260414-090112-fceratto.json
08:45 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2180: repool after maintenance
08:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2145 (T419635)', diff saved to https://phabricator.wikimedia.org/P90618 and previous config saved to /var/cache/conftool/dbconfig/20260414-084353-fceratto.json
08:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
08:34 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
08:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be2068.codfw.wmnet
08:31 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
08:28 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:25 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be2068.codfw.wmnet
08:25 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be2068
08:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS bullseye
08:22 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1168.eqiad.wmnet with reason: host reimage
08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
08:20 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:20 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
08:07 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1168.eqiad.wmnet with OS trixie
08:06 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1168: Reimage to Trixie
08:05 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1168: Reimage to Trixie
08:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1168.eqiad.wmnet with reason: Reimage to Trixie
08:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
08:04 moritzm: installing libnginx-mod-http-lua security updates
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc2012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc2012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
08:02 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
08:02 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
08:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
08:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2180: repool after maintenance
08:00 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2180.codfw.wmnet with OS trixie
07:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
07:51 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1180: after upgrade
07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc1012.eqiad.wmnet with reason: T419961
07:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on pc2012.codfw.wmnet with reason: T419961
07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2068
07:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2068
07:40 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2068
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:40 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2068.codfw.wmnet 91.32.192.10.in-addr.arpa 1.9.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:40 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
07:40 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2068 - mvernon@cumin2002"
07:38 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
07:36 mvernon@cumin2002: START - Cookbook sre.dns.netbox
07:36 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2068
07:35 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS bullseye
07:32 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2180.codfw.wmnet with reason: host reimage
07:22 mszwarc@deploy1003: Finished scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) (duration: 12m 36s)
07:16 mszwarc@deploy1003: mszwarc: Continuing with sync
07:15 mszwarc@deploy1003: mszwarc: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:14 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2180.codfw.wmnet with OS trixie
07:12 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2180: Reimage to Trixie
07:11 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2180: Reimage to Trixie
07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:10 mszwarc@deploy1003: Started scap sync-world: Backport for Prepare $wgOATH2FARequiredGroupRemovalPages for next groups (T423118)
07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2180.codfw.wmnet with reason: Reimage to Trixie
07:06 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1180: after upgrade
06:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2217: repool after reimage to trixie
06:57 jmm@dns1004: END - running authdns-update
06:56 jmm@dns1004: START - running authdns-update
06:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1002.eqiad.wmnet
06:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1002.eqiad.wmnet
06:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
06:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
06:30 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1180.eqiad.wmnet with OS trixie
06:30 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
06:27 jmm@dns1004: END - running authdns-update
06:25 jmm@dns1004: START - running authdns-update
06:22 jmm@dns1004: END - running authdns-update
06:20 jmm@dns1004: START - running authdns-update
06:13 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2217: repool after reimage to trixie
06:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2217.codfw.wmnet with OS trixie
06:08 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
06:02 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1180.eqiad.wmnet with reason: host reimage
06:02 jmm@dns1004: END - running authdns-update
06:00 jmm@dns1004: START - running authdns-update
05:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
05:49 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
05:47 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1180.eqiad.wmnet with OS trixie
05:46 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db1180: Upgrade package
05:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1180.eqiad.wmnet with reason: Reimage to Trixie
05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1180: Upgrade package
05:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2217.codfw.wmnet with OS trixie
05:28 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2217: Reimage
05:27 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2217: Reimage
05:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2217.codfw.wmnet with reason: Reimage to Trixie
04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.21 (duration: 02m 34s)
03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482 (duration: 35m 44s)
03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.46.0-wmf.24 refs T420482
00:57 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1227: Work done
00:51 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1025.eqiad.wmnet
00:51 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1025.eqiad.wmnet
00:39 ladsgroup@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2208: Work done
00:09 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db1227: Work done
00:08 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
00:05 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync

2026-04-13

23:54 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: sync
23:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: sync
23:49 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: sync
23:49 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: sync
23:49 ladsgroup@cumin1003: START - Cookbook sre.mysql.pool pool db2208: Work done
23:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
23:02 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5024.*
22:29 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5023.*
22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_codfw - 9.2.13 Upgrade ()
22:26 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_codfw - 9.2.13 Upgrade ()
22:23 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
22:15 sbassett@deploy1003: Finished scap sync-world: Deployed security fix for T422085 (duration: 30m 14s)
22:08 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[5023-5024].eqsin.wmnet} and A:cp - 9.2.13 Upgrade ()
22:08 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
22:04 brett@dns1006: END - running authdns-update
22:04 swfrench-wmf: applied pending external-services network policy diffs for aqs1025 in wikikube clusters
22:03 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
22:02 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
22:02 brett@dns1006: START - running authdns-update
21:56 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
21:55 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
21:55 brett@cumin2002: END (FAIL) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=1) Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
21:55 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
21:54 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
21:53 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
21:52 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
21:44 sbassett@deploy1003: Started scap sync-world: Deployed security fix for T422085
21:41 sbassett: Deployed security patch for T418533
21:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2208 (T410589)', diff saved to https://phabricator.wikimedia.org/P90589 and previous config saved to /var/cache/conftool/dbconfig/20260413-211606-ladsgroup.json
21:15 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2208.codfw.wmnet with reason: Maintenance
21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_eqsin - 9.2.13 Upgrade ()
21:13 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_eqsin - 9.2.13 Upgrade ()
21:08 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1025.eqiad.wmnet with reason: Bootstrapping — T412830
20:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2198.codfw.wmnet with reason: Maintenance
20:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90588 and previous config saved to /var/cache/conftool/dbconfig/20260413-205531-fceratto.json
20:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P90587 and previous config saved to /var/cache/conftool/dbconfig/20260413-204523-fceratto.json
20:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P90586 and previous config saved to /var/cache/conftool/dbconfig/20260413-203514-fceratto.json
20:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
20:25 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90585 and previous config saved to /var/cache/conftool/dbconfig/20260413-202506-fceratto.json
20:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2195 (T419635)', diff saved to https://phabricator.wikimedia.org/P90584 and previous config saved to /var/cache/conftool/dbconfig/20260413-202201-fceratto.json
20:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2195.codfw.wmnet with reason: Maintenance
20:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90583 and previous config saved to /var/cache/conftool/dbconfig/20260413-202137-fceratto.json
20:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P90582 and previous config saved to /var/cache/conftool/dbconfig/20260413-201130-fceratto.json
20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:07 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P90581 and previous config saved to /var/cache/conftool/dbconfig/20260413-200122-fceratto.json
20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
20:01 andrewtavis-wmde@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
19:56 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet
19:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90580 and previous config saved to /var/cache/conftool/dbconfig/20260413-195113-fceratto.json
19:49 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1003.eqiad.wmnet
19:49 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet
19:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2181 (T419635)', diff saved to https://phabricator.wikimedia.org/P90579 and previous config saved to /var/cache/conftool/dbconfig/20260413-194759-fceratto.json
19:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
19:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90578 and previous config saved to /var/cache/conftool/dbconfig/20260413-194734-fceratto.json
19:46 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3075-3081].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
19:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3066,3068-3073].esams.wmnet} and A:cp - 9.2.13 Upgrade ()
19:42 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1002.eqiad.wmnet
19:42 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet
19:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
19:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P90577 and previous config saved to /var/cache/conftool/dbconfig/20260413-193726-fceratto.json
19:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
19:35 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudvirtlocal1001.eqiad.wmnet
19:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P90576 and previous config saved to /var/cache/conftool/dbconfig/20260413-192715-fceratto.json
19:25 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
19:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90575 and previous config saved to /var/cache/conftool/dbconfig/20260413-191707-fceratto.json
19:14 swfrench-wmf: applied aqs cassandra host list changes from https://gerrit.wikimedia.org/r/1270496 - T423168
19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2167 (T419635)', diff saved to https://phabricator.wikimedia.org/P90574 and previous config saved to /var/cache/conftool/dbconfig/20260413-191355-fceratto.json
19:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
19:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90573 and previous config saved to /var/cache/conftool/dbconfig/20260413-191330-fceratto.json
19:12 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
19:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
19:11 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
19:10 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
19:09 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
19:09 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
19:08 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
19:08 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
19:07 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
19:07 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
19:06 zabe@deploy1003: Finished scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" (duration: 05m 51s)
19:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P90572 and previous config saved to /var/cache/conftool/dbconfig/20260413-190322-fceratto.json
19:02 zabe@deploy1003: zabe: Continuing with sync
19:02 zabe@deploy1003: zabe: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
19:00 zabe@deploy1003: Started scap sync-world: Backport for Revert "NewFilesPager: Make sure filerevision is queried before file"
18:55 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
18:54 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
18:53 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/page-analytics: apply
18:53 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
18:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P90571 and previous config saved to /var/cache/conftool/dbconfig/20260413-185314-fceratto.json
18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/geo-analytics: apply
18:52 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/edit-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/device-analytics: apply
18:51 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
18:51 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/data-gateway: apply
18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_drmrs - 9.2.13 Upgrade ()
18:45 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_drmrs - 9.2.13 Upgrade ()
18:44 zabe@deploy1003: Sync cancelled.
18:43 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90570 and previous config saved to /var/cache/conftool/dbconfig/20260413-184305-fceratto.json
18:41 zabe@deploy1003: zabe: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
18:40 zabe@deploy1003: Started scap sync-world: Backport for NewFilesPager: Make sure filerevision is queried before file (T422946)
18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2166 (T419635)', diff saved to https://phabricator.wikimedia.org/P90569 and previous config saved to /var/cache/conftool/dbconfig/20260413-183953-fceratto.json
18:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
18:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90568 and previous config saved to /var/cache/conftool/dbconfig/20260413-183927-fceratto.json
18:37 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
18:36 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1018: Security updates
18:30 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
18:30 root@cumin1003: START - Cookbook sre.mysql.parsercache
18:30 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1018: Security updates
18:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P90566 and previous config saved to /var/cache/conftool/dbconfig/20260413-182919-fceratto.json
18:19 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P90565 and previous config saved to /var/cache/conftool/dbconfig/20260413-181911-fceratto.json
18:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90564 and previous config saved to /var/cache/conftool/dbconfig/20260413-180902-fceratto.json
18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2164 (T419635)', diff saved to https://phabricator.wikimedia.org/P90563 and previous config saved to /var/cache/conftool/dbconfig/20260413-180551-fceratto.json
18:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
18:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90562 and previous config saved to /var/cache/conftool/dbconfig/20260413-180525-fceratto.json
18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1018: Security updates
18:04 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
18:04 root@cumin1003: START - Cookbook sre.mysql.parsercache
18:04 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1018: Security updates
17:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P90560 and previous config saved to /var/cache/conftool/dbconfig/20260413-175517-fceratto.json
17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-upload_ulsfo - 9.2.13 Upgrade ()
17:52 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-text_ulsfo - 9.2.13 Upgrade ()
17:46 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
17:46 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
17:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P90559 and previous config saved to /var/cache/conftool/dbconfig/20260413-174509-fceratto.json
17:40 swfrench-wmf: applied latent external-services network policy changes for aqs{1023,1024} - T423168
17:37 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
17:36 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
17:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90558 and previous config saved to /var/cache/conftool/dbconfig/20260413-173501-fceratto.json
17:34 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1017: Security updates
17:33 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
17:33 root@cumin1003: START - Cookbook sre.mysql.parsercache
17:33 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1017: Security updates
17:33 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
17:33 Amir1: dropping templatelinks and pagelinks on testcommonswiki core db (T421914)
17:32 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2163 (T419635)', diff saved to https://phabricator.wikimedia.org/P90556 and previous config saved to /var/cache/conftool/dbconfig/20260413-173148-fceratto.json
17:31 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
17:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90555 and previous config saved to /var/cache/conftool/dbconfig/20260413-173123-fceratto.json
17:31 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
17:30 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^6 "Use envoy for swift inside mediawiki" (duration: 07m 31s)
17:29 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:26 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:26 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:24 ladsgroup@deploy1003: ladsgroup: Backport for Revert^6 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
17:23 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^6 "Use envoy for swift inside mediawiki"
17:21 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P90554 and previous config saved to /var/cache/conftool/dbconfig/20260413-172115-fceratto.json
17:20 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:19 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:19 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
17:18 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
17:13 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
17:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P90553 and previous config saved to /var/cache/conftool/dbconfig/20260413-171107-fceratto.json
17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1017: Security updates
17:06 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
17:06 root@cumin1003: START - Cookbook sre.mysql.parsercache
17:06 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1017: Security updates
17:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729) (duration: 06m 43s)
17:03 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
17:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90551 and previous config saved to /var/cache/conftool/dbconfig/20260413-170059-fceratto.json
16:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
16:58 ladsgroup@deploy1003: ladsgroup: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2161 (T419635)', diff saved to https://phabricator.wikimedia.org/P90550 and previous config saved to /var/cache/conftool/dbconfig/20260413-165747-fceratto.json
16:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
16:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90549 and previous config saved to /var/cache/conftool/dbconfig/20260413-165721-fceratto.json
16:56 ladsgroup@deploy1003: Started scap sync-world: Backport for ExternalStore: Start reading and writing from clusters 32 and 33 (T421729)
16:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P90548 and previous config saved to /var/cache/conftool/dbconfig/20260413-164713-fceratto.json
16:46 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate (T423152 T420993)
16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7010-7016].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
16:44 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1187: After Reimage
16:44 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[7002-7008].magru.wmnet} and A:cp - 9.2.13 Upgrade ()
16:44 mutante: contint2002 (prod CI) - re-enabled puppet - this applied a refresh of the contint.wikimedia.org certificate
16:40 swfrench@deploy1003: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
16:40 swfrench@deploy1003: helmfile [staging] START helmfile.d/services/editor-analytics: apply
16:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P90546 and previous config saved to /var/cache/conftool/dbconfig/20260413-163706-fceratto.json
16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1014: Security updates
16:36 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
16:35 root@cumin1003: START - Cookbook sre.mysql.parsercache
16:35 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1014: Security updates
16:35 Amir1: banning non-standard thumbs with external referrer regardless of cache status (T414805)
16:28 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1273.eqiad.wmnet
16:28 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1273.eqiad.wmnet
16:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90543 and previous config saved to /var/cache/conftool/dbconfig/20260413-162657-fceratto.json
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2154 (T419635)', diff saved to https://phabricator.wikimedia.org/P90542 and previous config saved to /var/cache/conftool/dbconfig/20260413-162344-fceratto.json
16:23 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
16:23 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90541 and previous config saved to /var/cache/conftool/dbconfig/20260413-162318-fceratto.json
16:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P90539 and previous config saved to /var/cache/conftool/dbconfig/20260413-161310-fceratto.json
16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1014: Security updates
16:07 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
16:07 root@cumin1003: START - Cookbook sre.mysql.parsercache
16:07 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1014: Security updates
16:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P90537 and previous config saved to /var/cache/conftool/dbconfig/20260413-160301-fceratto.json
16:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS bullseye
15:59 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1187: After Reimage
15:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90535 and previous config saved to /var/cache/conftool/dbconfig/20260413-155253-fceratto.json
15:51 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS trixie
15:50 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2224: After Reimage
15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2152 (T419635)', diff saved to https://phabricator.wikimedia.org/P90533 and previous config saved to /var/cache/conftool/dbconfig/20260413-154937-fceratto.json
15:49 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
15:46 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1166: repool after maintenance
15:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:43 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:39 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1013: Security updates
15:37 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
15:37 root@cumin1003: START - Cookbook sre.mysql.parsercache
15:37 root@cumin1003: START - Cookbook sre.mysql.pool pool pc1013: Security updates
15:36 moritzm: installing postgresql-15 security updates
15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:32 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1227 (T410589)', diff saved to https://phabricator.wikimedia.org/P90529 and previous config saved to /var/cache/conftool/dbconfig/20260413-153107-ladsgroup.json
15:30 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
15:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90527 and previous config saved to /var/cache/conftool/dbconfig/20260413-153042-ladsgroup.json
15:29 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
15:25 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
15:21 dpogorzelski@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
15:20 dpogorzelski@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
15:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P90526 and previous config saved to /var/cache/conftool/dbconfig/20260413-152034-ladsgroup.json
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:10 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS trixie
15:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P90522 and previous config saved to /var/cache/conftool/dbconfig/20260413-151027-ladsgroup.json
15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1013: Security updates
15:10 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
15:09 root@cumin1003: START - Cookbook sre.mysql.parsercache
15:09 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1013: Security updates
15:08 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1187: Upgrade package
15:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1187.eqiad.wmnet with reason: Reimage to Trixie
15:08 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1187: Upgrade package
15:07 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:05 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
15:04 marostegui@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool db2224: After Reimage
15:04 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2224: After Reimage
15:04 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
15:03 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2224.codfw.wmnet with OS trixie
15:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90518 and previous config saved to /var/cache/conftool/dbconfig/20260413-150116-fceratto.json
15:00 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1166: repool after maintenance
15:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90516 and previous config saved to /var/cache/conftool/dbconfig/20260413-150019-ladsgroup.json
14:54 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2012: T419961
14:53 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:53 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2012: T419961
14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2069
14:51 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2069
14:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90514 and previous config saved to /var/cache/conftool/dbconfig/20260413-145028-fceratto.json
14:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2069
14:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:48 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2069.codfw.wmnet 181.48.192.10.in-addr.arpa 1.8.1.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:48 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
14:48 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2069 - mvernon@cumin2002"
14:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
14:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2069
14:40 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
14:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS bullseye
14:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P90513 and previous config saved to /var/cache/conftool/dbconfig/20260413-143939-fceratto.json
14:36 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2224.codfw.wmnet with reason: host reimage
14:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2070.codfw.wmnet with OS bullseye
14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90512 and previous config saved to /var/cache/conftool/dbconfig/20260413-142851-fceratto.json
14:22 Lucas_WMDE: UTC afternoon backport+config window done
14:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835) (duration: 10m 22s)
14:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
14:19 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
14:19 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2224.codfw.wmnet with OS trixie
14:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:18 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
14:18 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Continuing with sync
14:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2224: Reimage
14:16 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2224: Reimage
14:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2224.codfw.wmnet with reason: Reimage to Trixie
14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1012: Security updates
14:14 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
14:14 root@cumin1003: START - Cookbook sre.mysql.parsercache
14:14 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1012: Security updates
14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc2011: T419961
14:14 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2222 (T419635)', diff saved to https://phabricator.wikimedia.org/P90509 and previous config saved to /var/cache/conftool/dbconfig/20260413-141414-fceratto.json
14:14 inflatador: bking@apt1002 sudo -E reprepro --ignore=wrongdistribution -C component/opensearch2 include trixie-wikimedia ~/opensearch-madvise-0.2/opensearch-madvise_0.2_amd64.changes T422860
14:13 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:13 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc2011: T419961
14:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2222.codfw.wmnet with reason: Maintenance
14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: T419961
14:13 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
14:13 lucaswerkmeister-wmde@deploy1003: migr, lucaswerkmeister-wmde, urbanecm: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be
14:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90507 and previous config saved to /var/cache/conftool/dbconfig/20260413-141306-fceratto.json
14:12 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
14:12 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: T419961
14:11 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Record TOR account creation failure separately (T422283), stats: add counters for experiment account creation (T422283), GrowthSuggestionToneCheck: flag as non-experimental (T422835)
14:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:09 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2070.codfw.wmnet with reason: host reimage
14:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90506 and previous config saved to /var/cache/conftool/dbconfig/20260413-140218-fceratto.json
14:01 lucaswerkmeister-wmde@deploy1003: mwscript-k8s job started: namespaceDupes urwikisource --fix # T422824
14:00 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824) (duration: 08m 30s)
13:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
13:56 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Continuing with sync
13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:53 lucaswerkmeister-wmde@deploy1003: sgimeno, anzx, lucaswerkmeister-wmde, jforrester: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824) synced to the testservers (see https://wikitec
13:53 moritzm: installing postgresql-common bugfix updates
13:52 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
13:52 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for EventStreamConfig: remove unused contextual attributes causing problems (T422001), [abstractwiki] Enable wgParserEnableUserLanguage, so we don't need ⧼lang⧽s, urwikisource: add مصنف (author) namespace (T422824)
13:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P90505 and previous config saved to /var/cache/conftool/dbconfig/20260413-135129-fceratto.json
13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2070
13:50 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2070
13:49 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2070
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:49 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2070.codfw.wmnet 86.0.192.10.in-addr.arpa 6.8.0.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:49 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
13:49 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2070 - mvernon@cumin2002"
13:49 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Re-add p-personal id to the user menu (T422885) (duration: 10m 41s)
13:44 mvernon@cumin2002: START - Cookbook sre.dns.netbox
13:44 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2070
13:43 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2070.codfw.wmnet with OS bullseye
13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Continuing with sync
13:42 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, aude: Backport for Re-add p-personal id to the user menu (T422885) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:41 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab1006.eqiad.wmnet with OS trixie
13:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
13:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90504 and previous config saved to /var/cache/conftool/dbconfig/20260413-134041-fceratto.json
13:38 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Re-add p-personal id to the user menu (T422885)
13:37 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833) (duration: 34m 09s)
13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-redacteddb1001
13:36 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host an-redacteddb1001
13:35 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host an-redacteddb1001
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:35 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache an-redacteddb1001.eqiad.wmnet 18.48.64.10.in-addr.arpa 8.1.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:35 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
13:35 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host an-redacteddb1001 - btullis@cumin1003"
13:26 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2221 (T419635)', diff saved to https://phabricator.wikimedia.org/P90503 and previous config saved to /var/cache/conftool/dbconfig/20260413-132604-fceratto.json
13:25 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2221.codfw.wmnet with reason: Maintenance
13:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90502 and previous config saved to /var/cache/conftool/dbconfig/20260413-132457-fceratto.json
13:24 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
13:24 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
13:24 root@cumin1003: START - Cookbook sre.mysql.parsercache
13:24 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
13:24 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Continuing with sync
13:24 btullis@cumin1003: START - Cookbook sre.dns.netbox
13:24 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
13:21 lucaswerkmeister-wmde@deploy1003: aude, lucaswerkmeister-wmde: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:20 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-redacteddb1001
13:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
13:18 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
13:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90501 and previous config saved to /var/cache/conftool/dbconfig/20260413-131408-fceratto.json
13:13 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on phab1006.eqiad.wmnet with reason: host reimage
13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt1002.wikimedia.org
13:03 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on pilot wikis (T422833)
13:03 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P90500 and previous config saved to /var/cache/conftool/dbconfig/20260413-130320-fceratto.json
13:01 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
13:00 moritzm: installing libnginx-mod-http-lua security updates
12:58 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt1002.wikimedia.org
12:52 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90499 and previous config saved to /var/cache/conftool/dbconfig/20260413-125231-fceratto.json
12:47 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
12:38 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
12:38 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2218 (T419635)', diff saved to https://phabricator.wikimedia.org/P90498 and previous config saved to /var/cache/conftool/dbconfig/20260413-123801-fceratto.json
12:37 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2218.codfw.wmnet with reason: Maintenance
12:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90497 and previous config saved to /var/cache/conftool/dbconfig/20260413-123653-fceratto.json
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:26 jclark@cumin1003: START - Cookbook sre.hosts.provision for host clouddb1019.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
12:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90496 and previous config saved to /var/cache/conftool/dbconfig/20260413-122604-fceratto.json
12:21 jmm@dns1004: END - running authdns-update
12:20 jmm@dns1004: START - running authdns-update
12:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P90495 and previous config saved to /var/cache/conftool/dbconfig/20260413-121516-fceratto.json
12:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90494 and previous config saved to /var/cache/conftool/dbconfig/20260413-120428-fceratto.json
12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1003.eqiad.wmnet
11:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1003.eqiad.wmnet
11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2004.codfw.wmnet
11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2208 (T419635)', diff saved to https://phabricator.wikimedia.org/P90493 and previous config saved to /var/cache/conftool/dbconfig/20260413-114953-fceratto.json
11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2208.codfw.wmnet with reason: Maintenance
11:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2004.codfw.wmnet
11:38 jmm@dns1004: END - running authdns-update
11:36 jmm@dns1004: START - running authdns-update
11:36 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2200.codfw.wmnet with reason: Maintenance
11:36 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90492 and previous config saved to /var/cache/conftool/dbconfig/20260413-113630-fceratto.json
11:25 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90491 and previous config saved to /var/cache/conftool/dbconfig/20260413-112541-fceratto.json
11:14 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90490 and previous config saved to /var/cache/conftool/dbconfig/20260413-111452-fceratto.json
11:04 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90489 and previous config saved to /var/cache/conftool/dbconfig/20260413-110405-fceratto.json
10:48 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90488 and previous config saved to /var/cache/conftool/dbconfig/20260413-104852-fceratto.json
10:48 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
10:47 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90487 and previous config saved to /var/cache/conftool/dbconfig/20260413-104756-fceratto.json
10:38 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:38 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:37 vgutierrez@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (T422328)
10:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90486 and previous config saved to /var/cache/conftool/dbconfig/20260413-103707-fceratto.json
10:34 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:33 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:29 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:28 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:26 vgutierrez@cumin1003: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp[3067,3074].esams.wmnet} and A:cp - 9.2.13 upgrade (T422328)
10:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90485 and previous config saved to /var/cache/conftool/dbconfig/20260413-102619-fceratto.json
10:19 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb1002.eqiad.wmnet
10:19 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:15 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
10:15 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90484 and previous config saved to /var/cache/conftool/dbconfig/20260413-101530-fceratto.json
10:15 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
10:14 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb1002.eqiad.wmnet
10:14 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:09 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:09 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:07 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
10:06 blake@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
10:05 blake@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
10:05 blake@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
10:00 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2168 (T419635)', diff saved to https://phabricator.wikimedia.org/P90483 and previous config saved to /var/cache/conftool/dbconfig/20260413-100003-fceratto.json
09:59 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
09:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90482 and previous config saved to /var/cache/conftool/dbconfig/20260413-095906-fceratto.json
09:49 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
09:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90481 and previous config saved to /var/cache/conftool/dbconfig/20260413-094818-fceratto.json
09:47 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:38 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2200.codfw.wmnet with reason: Maintenance
09:37 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90480 and previous config saved to /var/cache/conftool/dbconfig/20260413-093729-fceratto.json
09:26 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90479 and previous config saved to /var/cache/conftool/dbconfig/20260413-092640-fceratto.json
09:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:19 root@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:19 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:19 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:17 fceratto@cumin1003: START - Cookbook sre.mysql.parsercache
09:17 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:15 root@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1011: Security updates
09:15 root@cumin1003: END (FAIL) - Cookbook sre.mysql.parsercache (exit_code=99)
09:15 root@cumin1003: START - Cookbook sre.mysql.parsercache
09:15 root@cumin1003: START - Cookbook sre.mysql.depool depool pc1011: Security updates
09:11 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2159 (T419635)', diff saved to https://phabricator.wikimedia.org/P90477 and previous config saved to /var/cache/conftool/dbconfig/20260413-091122-fceratto.json
09:10 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
09:10 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90476 and previous config saved to /var/cache/conftool/dbconfig/20260413-091027-fceratto.json
08:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90474 and previous config saved to /var/cache/conftool/dbconfig/20260413-085938-fceratto.json
08:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast1003.wikimedia.org
08:48 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90473 and previous config saved to /var/cache/conftool/dbconfig/20260413-084850-fceratto.json
08:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1003.wikimedia.org
08:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90471 and previous config saved to /var/cache/conftool/dbconfig/20260413-083801-fceratto.json
08:22 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2150 (T419635)', diff saved to https://phabricator.wikimedia.org/P90470 and previous config saved to /var/cache/conftool/dbconfig/20260413-082233-fceratto.json
08:21 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
08:10 taavi@dns1004: END - running authdns-update
08:09 taavi@dns1004: START - running authdns-update
08:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
07:40 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:35 elukey@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mirror1001.wikimedia.org
07:21 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
07:20 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
07:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mirror1001.wikimedia.org
07:09 moritzm: installing openssh security updates
05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1202 (T410589)', diff saved to https://phabricator.wikimedia.org/P90469 and previous config saved to /var/cache/conftool/dbconfig/20260413-055130-ladsgroup.json
05:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
05:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90468 and previous config saved to /var/cache/conftool/dbconfig/20260413-055106-ladsgroup.json
05:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P90467 and previous config saved to /var/cache/conftool/dbconfig/20260413-054100-ladsgroup.json
05:30 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P90466 and previous config saved to /var/cache/conftool/dbconfig/20260413-053050-ladsgroup.json
05:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90465 and previous config saved to /var/cache/conftool/dbconfig/20260413-052042-ladsgroup.json
03:34 TimStarling: on gerrit2003 restarted gerrit T423027

2026-04-12

21:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2198.codfw.wmnet with reason: Maintenance
21:20 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90464 and previous config saved to /var/cache/conftool/dbconfig/20260412-212043-ladsgroup.json
21:10 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90463 and previous config saved to /var/cache/conftool/dbconfig/20260412-211036-ladsgroup.json
21:00 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P90462 and previous config saved to /var/cache/conftool/dbconfig/20260412-210028-ladsgroup.json
20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1194 (T410589)', diff saved to https://phabricator.wikimedia.org/P90461 and previous config saved to /var/cache/conftool/dbconfig/20260412-205525-ladsgroup.json
20:55 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
20:55 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90460 and previous config saved to /var/cache/conftool/dbconfig/20260412-205500-ladsgroup.json
20:50 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90459 and previous config saved to /var/cache/conftool/dbconfig/20260412-205020-ladsgroup.json
20:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P90458 and previous config saved to /var/cache/conftool/dbconfig/20260412-204451-ladsgroup.json
20:34 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P90457 and previous config saved to /var/cache/conftool/dbconfig/20260412-203443-ladsgroup.json
20:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90456 and previous config saved to /var/cache/conftool/dbconfig/20260412-202435-ladsgroup.json
14:32 cgoubert@dns2004: START - running authdns-update
11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1191 (T410589)', diff saved to https://phabricator.wikimedia.org/P90452 and previous config saved to /var/cache/conftool/dbconfig/20260412-115148-ladsgroup.json
11:51 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
11:51 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90451 and previous config saved to /var/cache/conftool/dbconfig/20260412-115124-ladsgroup.json
11:41 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P90450 and previous config saved to /var/cache/conftool/dbconfig/20260412-114116-ladsgroup.json
11:31 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P90449 and previous config saved to /var/cache/conftool/dbconfig/20260412-113108-ladsgroup.json
11:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90448 and previous config saved to /var/cache/conftool/dbconfig/20260412-112100-ladsgroup.json
07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2182 (T410589)', diff saved to https://phabricator.wikimedia.org/P90447 and previous config saved to /var/cache/conftool/dbconfig/20260412-070649-ladsgroup.json
07:06 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
07:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90446 and previous config saved to /var/cache/conftool/dbconfig/20260412-070624-ladsgroup.json
06:56 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90445 and previous config saved to /var/cache/conftool/dbconfig/20260412-065616-ladsgroup.json
06:46 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P90444 and previous config saved to /var/cache/conftool/dbconfig/20260412-064608-ladsgroup.json
06:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90443 and previous config saved to /var/cache/conftool/dbconfig/20260412-063600-ladsgroup.json
02:44 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1174 (T410589)', diff saved to https://phabricator.wikimedia.org/P90442 and previous config saved to /var/cache/conftool/dbconfig/20260412-024415-ladsgroup.json
02:44 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 19s)
02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-11

22:40 cmooney@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16735
22:38 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
22:38 cmooney@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'configure' for AS: 16735
22:37 cmooney@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 16735
18:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90441 and previous config saved to /var/cache/conftool/dbconfig/20260411-185048-fceratto.json
18:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P90440 and previous config saved to /var/cache/conftool/dbconfig/20260411-184000-fceratto.json
18:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P90439 and previous config saved to /var/cache/conftool/dbconfig/20260411-182912-fceratto.json
18:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90438 and previous config saved to /var/cache/conftool/dbconfig/20260411-181823-fceratto.json
17:23 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2168 (T410589)', diff saved to https://phabricator.wikimedia.org/P90437 and previous config saved to /var/cache/conftool/dbconfig/20260411-172321-ladsgroup.json
17:23 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
17:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90436 and previous config saved to /var/cache/conftool/dbconfig/20260411-172257-ladsgroup.json
17:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90435 and previous config saved to /var/cache/conftool/dbconfig/20260411-171248-ladsgroup.json
17:02 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P90434 and previous config saved to /var/cache/conftool/dbconfig/20260411-170240-ladsgroup.json
17:02 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2248 (T419635)', diff saved to https://phabricator.wikimedia.org/P90433 and previous config saved to /var/cache/conftool/dbconfig/20260411-170233-fceratto.json
17:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2248.codfw.wmnet with reason: Maintenance
17:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90432 and previous config saved to /var/cache/conftool/dbconfig/20260411-170138-fceratto.json
16:52 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90431 and previous config saved to /var/cache/conftool/dbconfig/20260411-165232-ladsgroup.json
16:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P90430 and previous config saved to /var/cache/conftool/dbconfig/20260411-165049-fceratto.json
16:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P90429 and previous config saved to /var/cache/conftool/dbconfig/20260411-164000-fceratto.json
16:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90428 and previous config saved to /var/cache/conftool/dbconfig/20260411-162912-fceratto.json
14:40 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2247 (T419635)', diff saved to https://phabricator.wikimedia.org/P90427 and previous config saved to /var/cache/conftool/dbconfig/20260411-144002-fceratto.json
14:39 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2247.codfw.wmnet with reason: Maintenance
14:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90426 and previous config saved to /var/cache/conftool/dbconfig/20260411-143854-fceratto.json
14:28 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P90425 and previous config saved to /var/cache/conftool/dbconfig/20260411-142805-fceratto.json
14:17 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P90424 and previous config saved to /var/cache/conftool/dbconfig/20260411-141717-fceratto.json
14:06 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90423 and previous config saved to /var/cache/conftool/dbconfig/20260411-140628-fceratto.json
12:43 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:42 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90422 and previous config saved to /var/cache/conftool/dbconfig/20260411-124244-ladsgroup.json
12:32 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P90421 and previous config saved to /var/cache/conftool/dbconfig/20260411-123235-ladsgroup.json
12:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P90420 and previous config saved to /var/cache/conftool/dbconfig/20260411-122226-ladsgroup.json
12:14 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2246 (T419635)', diff saved to https://phabricator.wikimedia.org/P90419 and previous config saved to /var/cache/conftool/dbconfig/20260411-121410-fceratto.json
12:13 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2246.codfw.wmnet with reason: Maintenance
12:13 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90418 and previous config saved to /var/cache/conftool/dbconfig/20260411-121302-fceratto.json
12:12 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90417 and previous config saved to /var/cache/conftool/dbconfig/20260411-121218-ladsgroup.json
12:02 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P90416 and previous config saved to /var/cache/conftool/dbconfig/20260411-120214-fceratto.json
11:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P90415 and previous config saved to /var/cache/conftool/dbconfig/20260411-115126-fceratto.json
11:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90414 and previous config saved to /var/cache/conftool/dbconfig/20260411-114037-fceratto.json
09:52 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2245 (T419635)', diff saved to https://phabricator.wikimedia.org/P90413 and previous config saved to /var/cache/conftool/dbconfig/20260411-095220-fceratto.json
09:51 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2245.codfw.wmnet with reason: Maintenance
09:51 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90412 and previous config saved to /var/cache/conftool/dbconfig/20260411-095113-fceratto.json
09:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P90411 and previous config saved to /var/cache/conftool/dbconfig/20260411-094024-fceratto.json
09:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P90410 and previous config saved to /var/cache/conftool/dbconfig/20260411-092936-fceratto.json
09:18 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90409 and previous config saved to /var/cache/conftool/dbconfig/20260411-091847-fceratto.json
07:36 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2240 (T419635)', diff saved to https://phabricator.wikimedia.org/P90408 and previous config saved to /var/cache/conftool/dbconfig/20260411-073627-fceratto.json
07:35 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2240.codfw.wmnet with reason: Maintenance
06:01 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
06:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90407 and previous config saved to /var/cache/conftool/dbconfig/20260411-060126-fceratto.json
05:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P90406 and previous config saved to /var/cache/conftool/dbconfig/20260411-055038-fceratto.json
05:39 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P90405 and previous config saved to /var/cache/conftool/dbconfig/20260411-053950-fceratto.json
05:29 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90404 and previous config saved to /var/cache/conftool/dbconfig/20260411-052901-fceratto.json
03:45 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2237 (T419635)', diff saved to https://phabricator.wikimedia.org/P90403 and previous config saved to /var/cache/conftool/dbconfig/20260411-034549-fceratto.json
03:45 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2237.codfw.wmnet with reason: Maintenance
03:44 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T419635)', diff saved to https://phabricator.wikimedia.org/P90402 and previous config saved to /var/cache/conftool/dbconfig/20260411-034441-fceratto.json
03:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2159 (T410589)', diff saved to https://phabricator.wikimedia.org/P90401 and previous config saved to /var/cache/conftool/dbconfig/20260411-033701-ladsgroup.json
03:36 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
03:36 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90400 and previous config saved to /var/cache/conftool/dbconfig/20260411-033636-ladsgroup.json
03:33 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P90399 and previous config saved to /var/cache/conftool/dbconfig/20260411-033352-fceratto.json
03:26 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90398 and previous config saved to /var/cache/conftool/dbconfig/20260411-032628-ladsgroup.json
03:23 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P90397 and previous config saved to /var/cache/conftool/dbconfig/20260411-032304-fceratto.json
03:16 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P90396 and previous config saved to /var/cache/conftool/dbconfig/20260411-031620-ladsgroup.json
03:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2236 (T419635)', diff saved to https://phabricator.wikimedia.org/P90395 and previous config saved to /var/cache/conftool/dbconfig/20260411-031216-fceratto.json
03:06 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90394 and previous config saved to /var/cache/conftool/dbconfig/20260411-030611-ladsgroup.json
01:31 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2236 (T419635)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-013151-fceratto.json
01:31 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2236.codfw.wmnet with reason: Maintenance
01:30 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90393 and previous config saved to /var/cache/conftool/dbconfig/20260411-013040-fceratto.json
01:19 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260411-011948-fceratto.json
01:09 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20260411-010859-fceratto.json
00:58 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90392 and previous config saved to /var/cache/conftool/dbconfig/20260411-005811-fceratto.json
00:08 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:03 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
00:01 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2026-04-10

23:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:53 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:51 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:49 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1006.eqiad.wmnet with OS bookworm
23:49 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
23:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1006.eqiad.wmnet with reason: host reimage
23:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be1005.eqiad.wmnet with OS bookworm
23:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
23:13 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2219 (T419635)', diff saved to https://phabricator.wikimedia.org/P90391 and previous config saved to /var/cache/conftool/dbconfig/20260410-231337-fceratto.json
23:12 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2219.codfw.wmnet with reason: Maintenance
23:12 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90390 and previous config saved to /var/cache/conftool/dbconfig/20260410-231231-fceratto.json
23:05 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1006.eqiad.wmnet with OS bookworm
23:01 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P90389 and previous config saved to /var/cache/conftool/dbconfig/20260410-230143-fceratto.json
22:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
22:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be1005.eqiad.wmnet with reason: host reimage
22:50 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P90388 and previous config saved to /var/cache/conftool/dbconfig/20260410-225055-fceratto.json
22:40 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90387 and previous config saved to /var/cache/conftool/dbconfig/20260410-224008-fceratto.json
22:33 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
22:31 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host apus-be1005.eqiad.wmnet with OS bookworm
22:30 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host apus-be1005.eqiad.wmnet with OS bookworm
22:28 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host phab1006.eqiad.wmnet with OS trixie
22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1170 (T410589)', diff saved to https://phabricator.wikimedia.org/P90386 and previous config saved to /var/cache/conftool/dbconfig/20260410-222445-ladsgroup.json
22:24 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
22:24 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90385 and previous config saved to /var/cache/conftool/dbconfig/20260410-222421-ladsgroup.json
22:17 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host phab1006.eqiad.wmnet with OS trixie
22:14 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P90384 and previous config saved to /var/cache/conftool/dbconfig/20260410-221414-ladsgroup.json
22:13 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:04 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P90383 and previous config saved to /var/cache/conftool/dbconfig/20260410-220406-ladsgroup.json
22:02 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:00 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:58 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:57 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:54 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:54 ladsgroup@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90382 and previous config saved to /var/cache/conftool/dbconfig/20260410-215358-ladsgroup.json
21:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
20:59 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cloudelastic1012.eqiad.wmnet with reason: still fixing Puppet
20:54 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2210 (T419635)', diff saved to https://phabricator.wikimedia.org/P90381 and previous config saved to /var/cache/conftool/dbconfig/20260410-205420-fceratto.json
20:53 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2210.codfw.wmnet with reason: Maintenance
20:53 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90380 and previous config saved to /var/cache/conftool/dbconfig/20260410-205324-fceratto.json
20:42 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P90378 and previous config saved to /var/cache/conftool/dbconfig/20260410-204236-fceratto.json
20:31 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P90377 and previous config saved to /var/cache/conftool/dbconfig/20260410-203147-fceratto.json
20:21 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90376 and previous config saved to /var/cache/conftool/dbconfig/20260410-202059-fceratto.json
20:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:58 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:57 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:55 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:52 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:48 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
18:34 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2206 (T419635)', diff saved to https://phabricator.wikimedia.org/P90373 and previous config saved to /var/cache/conftool/dbconfig/20260410-183455-fceratto.json
18:34 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2206.codfw.wmnet with reason: Maintenance
18:27 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided) (duration: 00m 56s)
18:26 dancy@deploy1003: Started deploy [releng/jenkins-deploy@46eae53] (releasing): (no justification provided)
17:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:41 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
17:28 dancy@deploy1003: Installation of scap version "4.248.0" completed for 2 hosts
17:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1012.eqiad.wmnet with OS trixie
17:26 dancy@deploy1003: Installing scap version "4.248.0" for 2 host(s)
17:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
17:00 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2199.codfw.wmnet with reason: Maintenance
16:59 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90372 and previous config saved to /var/cache/conftool/dbconfig/20260410-165951-fceratto.json
16:57 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
16:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P90371 and previous config saved to /var/cache/conftool/dbconfig/20260410-164902-fceratto.json
16:39 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: T421398
16:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P90370 and previous config saved to /var/cache/conftool/dbconfig/20260410-163814-fceratto.json
16:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90368 and previous config saved to /var/cache/conftool/dbconfig/20260410-162726-fceratto.json
16:05 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-ntp (exit_code=0) rolling restart_daemons on A:dnsbox
15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:59 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
15:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:41 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1012.eqiad.wmnet with OS trixie
15:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:34 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
15:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1012.eqiad.wmnet with reason: host reimage
15:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon1006.eqiad.wmnet
15:17 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon1006.eqiad.wmnet
15:16 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1012.eqiad.wmnet with OS trixie
14:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:38 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
14:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:30 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:23 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2172 (T419635)', diff saved to https://phabricator.wikimedia.org/P90367 and previous config saved to /var/cache/conftool/dbconfig/20260410-142308-fceratto.json
14:22 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
14:22 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90366 and previous config saved to /var/cache/conftool/dbconfig/20260410-142200-fceratto.json
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:11 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P90365 and previous config saved to /var/cache/conftool/dbconfig/20260410-141112-fceratto.json
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:00 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P90363 and previous config saved to /var/cache/conftool/dbconfig/20260410-140023-fceratto.json
13:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90362 and previous config saved to /var/cache/conftool/dbconfig/20260410-134935-fceratto.json
13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:44 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:28 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:22 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db2150 (T410589)', diff saved to https://phabricator.wikimedia.org/P90358 and previous config saved to /var/cache/conftool/dbconfig/20260410-132215-ladsgroup.json
13:22 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
13:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:21 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depooling db1158 (T410589)', diff saved to https://phabricator.wikimedia.org/P90357 and previous config saved to /var/cache/conftool/dbconfig/20260410-132119-ladsgroup.json
13:21 ladsgroup@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:08 cmooney@dns2005: END - running authdns-update
13:07 cmooney@dns2005: START - running authdns-update
13:06 cmooney@dns2005: START - running authdns-update
13:05 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-ntp rolling restart_daemons on A:dnsbox
13:02 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:02 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
13:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove dns for decom lumen transport cct - cmooney@cumin1003"
12:59 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:57 cmooney@cumin1003: START - Cookbook sre.dns.netbox
12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:57 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:52 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
12:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:47 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
12:46 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:34 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:32 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
12:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:24 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:50 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2155 (T419635)', diff saved to https://phabricator.wikimedia.org/P90351 and previous config saved to /var/cache/conftool/dbconfig/20260410-115015-fceratto.json
11:49 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
11:49 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90350 and previous config saved to /var/cache/conftool/dbconfig/20260410-114919-fceratto.json
11:38 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P90349 and previous config saved to /var/cache/conftool/dbconfig/20260410-113830-fceratto.json
11:27 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P90348 and previous config saved to /var/cache/conftool/dbconfig/20260410-112742-fceratto.json
11:22 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:20 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:16 fceratto@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90347 and previous config saved to /var/cache/conftool/dbconfig/20260410-111654-fceratto.json
11:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T422668
11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
11:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
11:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:30 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:27 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:25 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:20 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:19 vgutierrez: upload haproxy 2.8.20 to thirdparty/haproxy28 for bookworm-wikimedia (apt.wm.o) - T422926
10:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:16 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:07 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:03 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - T422668
09:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:57 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:48 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:39 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:36 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:35 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: sync
09:24 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: sync
09:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/jaeger: sync
09:24 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/jaeger: sync
09:22 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
09:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
09:21 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
09:18 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
09:17 fceratto@cumin2002: dbctl commit (dc=all): 'Depooling db2147 (T419635)', diff saved to https://phabricator.wikimedia.org/P90346 and previous config saved to /var/cache/conftool/dbconfig/20260410-091713-fceratto.json
09:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:16 fceratto@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
09:15 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:02 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:55 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:54 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
08:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
08:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
07:30 jelto@dns1004: END - running authdns-update
07:29 jelto@dns1004: START - running authdns-update
07:09 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
06:26 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddb1019.eqiad.wmnet with OS trixie
05:05 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host clouddb1019.eqiad.wmnet with OS trixie
01:26 rzl@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
01:25 rzl@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
01:23 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
01:23 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
00:57 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
00:57 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
00:54 zabe@deploy1003: Finished scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914) (duration: 05m 51s)
00:50 zabe@deploy1003: zabe: Continuing with sync
00:50 zabe@deploy1003: zabe: Backport for Stop setting specific virtual domain for link tables (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:48 zabe@deploy1003: Started scap sync-world: Backport for Stop setting specific virtual domain for link tables (T421914)
00:46 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file tables on enwiki (T416548) (duration: 06m 11s)
00:43 zabe@deploy1003: zabe: Continuing with sync
00:42 zabe@deploy1003: zabe: Backport for Start reading from new file tables on enwiki (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:40 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables on enwiki (T416548)
00:29 zabe: marked 425 content rows as bad # T393237
00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2005.codfw.wmnet with OS bookworm
00:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:21 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:08 zabe@deploy1003: Finished scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) (duration: 07m 17s)
00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-be2006.codfw.wmnet with OS bookworm
00:07 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:07 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
00:04 zabe@deploy1003: zabe: Continuing with sync
00:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
00:02 zabe@deploy1003: zabe: Backport for Disable query pages on testcommonswiki not compatible with split (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:00 zabe@deploy1003: Started scap sync-world: Backport for Disable query pages on testcommonswiki not compatible with split (T421914)

2026-04-09

23:58 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2005.codfw.wmnet with reason: host reimage
23:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
23:43 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-be2006.codfw.wmnet with reason: host reimage
23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2006.codfw.wmnet with OS bookworm
23:15 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host apus-be2005.codfw.wmnet with OS bookworm
23:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['apus-be2005']
23:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['apus-be2005']
23:13 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
23:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
22:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host apus-be2005.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:25 cscott@deploy1003: Finished scap sync-world: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879) (duration: 06m 52s)
21:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:22 cscott@deploy1003: cscott: Continuing with sync
21:21 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
21:21 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
21:20 cscott@deploy1003: cscott: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:19 cscott@deploy1003: Started scap sync-world: Backport for ParsoidLanguageConverter: Don't convert inside <style> elements (T422879)
21:11 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:06 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host phab2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:50 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host phab2003
20:50 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host phab2003
20:50 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-be2006
20:45 inflatador: reprepro --noskipold --component thirdparty/opensearch2 update trixie-wikimedia T422860
20:45 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-be2006
20:45 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-be2005
20:39 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
20:37 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
20:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-be2005
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be2005-6 and phab2003 to codfw - jhancock@cumin2002"
20:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be2005-6 and phab2003 to codfw - jhancock@cumin2002"
20:28 aude@deploy1003: Finished scap sync-world: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942) (duration: 10m 05s)
20:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
20:25 aude@deploy1003: aude: Continuing with sync
20:20 aude@deploy1003: aude: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:18 aude@deploy1003: Started scap sync-world: Backport for Make onboarding dialog a little less eager beaver 🦫 (T421942)
20:17 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0) rebalance blocks across compactor instances (patch id: 1265429)
20:17 aude@deploy1003: Finished scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524) (duration: 09m 04s)
20:15 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart rebalance blocks across compactor instances (patch id: 1265429)
20:13 aude@deploy1003: cscott, jhsoby, aude: Continuing with sync
20:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon2006-dev.codfw.wmnet
20:09 aude@deploy1003: cscott, jhsoby, aude: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:07 aude@deploy1003: Started scap sync-world: Backport for Opt-in new accounts to ReadingLists beta feature on testwiki (T422833), Add new protection level (edituserprotected) for nowiki (T367943), Turn on Parsoid Read Views for dewiki (T422524)
20:04 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon2006-dev.codfw.wmnet
20:03 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0) cookbook test (patch id: 1260650)
20:00 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart cookbook test (patch id: 1260650)
19:46 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:44 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:30 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:24 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:20 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:18 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:16 inflatador: bking@apt1002 sudo -E reprepro -C thirdparty/opensearch2 copy trixie-wikimedia bookworm-wikimedia opensearch T422860
19:16 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:15 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:11 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1055.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
19:10 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1055
19:09 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1055
19:09 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:06 vriley@cumin1003: START - Cookbook sre.dns.netbox
19:04 inflatador: bking@apt1002 delete old haproxy pkgs P90343
19:02 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:58 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:58 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^5 "Use envoy for swift inside mediawiki" (duration: 05m 53s)
18:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
18:52 ladsgroup@deploy1003: ladsgroup: Backport for Revert^5 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
18:50 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^5 "Use envoy for swift inside mediawiki"
18:49 dancy@deploy1003: Installation of scap version "4.247.0" completed for 2 hosts
18:49 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:49 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:47 dancy@deploy1003: Installing scap version "4.247.0" for 2 host(s)
18:46 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:46 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:40 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:40 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:37 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:37 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:29 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:29 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:23 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephmon2005-dev.codfw.wmnet
18:17 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcephmon2005-dev.codfw.wmnet
18:13 dancy@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.23 refs T420481
18:04 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:04 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:03 dancy@deploy1003: Installation of scap version "4.246.0" completed for 2 hosts
18:02 dancy@deploy1003: Installing scap version "4.246.0" for 2 host(s)
17:52 dzahn@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
17:52 dzahn@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
17:52 dzahn@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
17:51 dzahn@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
17:51 dzahn@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
17:51 dzahn@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
17:43 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^4 "Use envoy for swift inside mediawiki" (duration: 06m 49s)
17:39 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:38 ladsgroup@deploy1003: ladsgroup: Backport for Revert^4 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
17:36 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^4 "Use envoy for swift inside mediawiki"
17:35 cgoubert@deploy1003: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
17:35 cgoubert@deploy1003: helmfile [staging] START helmfile.d/services/shellbox-video: apply
17:25 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
17:24 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
17:24 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
17:23 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
17:10 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T422668
17:09 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:09 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:08 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:07 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
17:03 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^3 "Use envoy for swift inside mediawiki" (duration: 06m 11s)
17:01 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - T422668
16:59 ladsgroup@deploy1003: ladsgroup: Continuing with sync
16:59 ladsgroup@deploy1003: ladsgroup: Backport for Revert^3 "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:57 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^3 "Use envoy for swift inside mediawiki"
16:56 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T422668
16:51 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872)
16:48 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872) (duration: 07m 02s)
16:46 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - T422668
16:44 ladsgroup@deploy1003: ladsgroup: Continuing with sync
16:43 ladsgroup@deploy1003: ladsgroup: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:41 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert^2 "Use envoy for swift inside mediawiki" (T328872)
16:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host moss-be1002.eqiad.wmnet with OS bookworm
15:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
15:46 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart (exit_code=0) rolling restart_daemons on A:dnsbox and (A:dnsbox)
15:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1002.eqiad.wmnet with reason: host reimage
15:44 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
15:43 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
15:43 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
15:42 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
15:36 cgoubert@deploy1003: Finished scap sync-world: swift service proxy configuration changes (duration: 05m 45s)
15:31 cgoubert@deploy1003: Started scap sync-world: swift service proxy configuration changes
15:26 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host moss-be1002
15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host moss-be1002
15:25 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host moss-be1002
15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) moss-be1002.eqiad.wmnet 79.32.64.10.in-addr.arpa 9.7.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:25 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache moss-be1002.eqiad.wmnet 79.32.64.10.in-addr.arpa 9.7.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:25 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host moss-be1002 - mvernon@cumin2002"
15:25 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host moss-be1002 - mvernon@cumin2002"
15:21 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:20 mvernon@cumin2002: START - Cookbook sre.dns.netbox
15:20 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host moss-be1002
15:19 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host moss-be1002.eqiad.wmnet with OS bookworm
15:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:19 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
15:17 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
15:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
14:59 dancy@deploy1003: Installation of scap version "4.245.0" completed for 2 hosts
14:58 dancy@deploy1003: Installing scap version "4.245.0" for 2 host(s)
14:53 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
14:47 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart rolling restart_daemons on A:dnsbox and (A:dnsbox)
14:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
14:44 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
14:39 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
14:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
14:37 Emperor: ceph orch host drain moss-be1002 --zap-osd-devices T421719
14:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:35 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host apus-fe1003.eqiad.wmnet with OS bookworm
14:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:26 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
14:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:25 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:23 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:23 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
14:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
14:13 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
14:09 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apus-fe1003.eqiad.wmnet with reason: host reimage
14:06 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055 (second attempt)
14:04 aude@deploy1003: Finished scap sync-world: Backport for Enable reading list beta feature for pilot wikis (T420878) (duration: 08m 40s)
14:01 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apus-fe1003.eqiad.wmnet with reason: host reimage
14:00 aude@deploy1003: bwang, aude: Continuing with sync
13:57 aude@deploy1003: bwang, aude: Backport for Enable reading list beta feature for pilot wikis (T420878) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:55 aude@deploy1003: Started scap sync-world: Backport for Enable reading list beta feature for pilot wikis (T420878)
13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:52 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
13:52 hashar@deploy1003: Finished scap sync-world: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), gerrit:1269334 Fix BackfillInterwikiRightsLog wrt. cyclic renames (T605
13:48 hashar@deploy1003: mszwarc, hashar: Continuing with sync
13:46 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
13:45 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host apus-fe1003
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host apus-fe1003
13:44 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host apus-fe1003
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) apus-fe1003.eqiad.wmnet 102.32.64.10.in-addr.arpa 2.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:44 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache apus-fe1003.eqiad.wmnet 102.32.64.10.in-addr.arpa 2.0.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:44 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host apus-fe1003 - mvernon@cumin2002"
13:44 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host apus-fe1003 - mvernon@cumin2002"
13:44 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:44 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:40 mvernon@cumin2002: START - Cookbook sre.dns.netbox
13:40 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host apus-fe1003
13:39 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host apus-fe1003.eqiad.wmnet with OS bookworm
13:38 hashar@deploy1003: mszwarc, hashar: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055) sync
13:37 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
13:36 hashar@deploy1003: Started scap sync-world: Backport for fix: adjust to return type changed by upstream, browser-tests: hide Cypress tests from CI (T419574), browser-tests: hide Cypress tests from CI (T419574), Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055), gerrit:1269334 Fix BackfillInterwikiRightsLog wrt. cyclic renames (T6055
13:33 bking@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid: apply
13:33 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on A:thanos-fe-codfw
13:31 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on A:thanos-fe-codfw
13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling restart_daemons on P{thanos-fe[1004-1006].eqiad.wmnet} and (A:thanos-fe or A:thanos-fe-codfw or A:thanos-fe-eqiad)
13:29 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling restart_daemons on P{thanos-fe[1004-1006].eqiad.wmnet} and (A:thanos-fe or A:thanos-fe-codfw or A:thanos-fe-eqiad)
13:28 bking@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid: apply
13:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-fe1007.eqiad.wmnet with OS bullseye
13:19 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
13:17 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
13:15 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
13:14 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:09 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/aux-eqiad: maintenance
13:09 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/aux-eqiad: maintenance
13:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
13:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1007.eqiad.wmnet with reason: host reimage
13:08 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.wipe-cluster (exit_code=0) Wipe the K8s cluster aux-eqiad: Kubernetes upgrade
13:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
13:07 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:07 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: sync
13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:07 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: sync
13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/sophroid: sync
13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/sophroid: sync
13:06 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/redioscope: sync
13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/redioscope: sync
13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/jaeger: sync
13:05 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/jaeger: sync
13:04 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:03 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
13:02 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1007.eqiad.wmnet with reason: host reimage
13:01 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:01 elukey@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
12:59 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
12:59 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
12:59 elukey@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
12:53 hashar: Directly pushed GrowthExperiments wmf/1.46.0-wmf.22 patch https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/1269351 due to a chicken-and-egg issue on that branch
12:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host thanos-fe1007
12:47 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host thanos-fe1007
12:46 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host thanos-fe1007
12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1007.eqiad.wmnet 186.48.64.10.in-addr.arpa 6.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
12:46 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1007.eqiad.wmnet 186.48.64.10.in-addr.arpa 6.8.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:46 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host thanos-fe1007 - mvernon@cumin2002"
12:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host thanos-fe1007 - mvernon@cumin2002"
12:46 elukey@cumin1003: START - Cookbook sre.k8s.wipe-cluster Wipe the K8s cluster aux-eqiad: Kubernetes upgrade
12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host phab1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:42 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:42 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/aux-eqiad: maintenance
12:42 mvernon@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
12:42 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:42 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
12:42 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/aux-eqiad: maintenance
12:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding phab1006 to eqiad - jclark@cumin1003"
12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:38 mvernon@cumin2002: START - Cookbook sre.dns.netbox
12:38 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:35 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host thanos-fe1007
12:34 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1007.eqiad.wmnet with OS bullseye
12:33 jclark@cumin1003: START - Cookbook sre.dns.netbox
12:32 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1019,1021-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
12:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
12:28 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:27 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host apus-be1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:24 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1019,1021-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
12:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1020.eqiad.wmnet with OS bullseye
12:18 moritzm: restarting Postfix on mx-in to pick up OpenSSL updates
12:13 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
12:12 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:09 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
12:08 moritzm: restarting Postfix on mx-out to pick up OpenSSL updates
12:07 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
12:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host apus-be1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:06 jclark@cumin1003: START - Cookbook sre.hosts.provision for host apus-be1005.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
12:05 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:05 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be1005 to eqiad - jclark@cumin1003"
12:05 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding apus-be1005 to eqiad - jclark@cumin1003"
12:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1020.eqiad.wmnet with reason: host reimage
12:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
12:00 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1020.eqiad.wmnet with reason: host reimage
11:51 moritzm: installing nginx security updates
11:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1020
11:44 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1020
11:31 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1020
11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1020.eqiad.wmnet 113.48.64.10.in-addr.arpa 3.1.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
11:31 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1020.eqiad.wmnet 113.48.64.10.in-addr.arpa 3.1.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:31 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1020 - mvernon@cumin2002"
11:30 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1020 - mvernon@cumin2002"
11:29 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
11:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:28 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:27 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
11:27 mvernon@cumin2002: START - Cookbook sre.dns.netbox
11:26 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1020
11:25 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1020.eqiad.wmnet with OS bullseye
11:16 moritzm: installing tiff security updates
11:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:04 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1018,1020-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
11:01 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
10:55 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1018,1020-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
10:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1019.eqiad.wmnet with OS bullseye
10:47 moritzm: installing openssl security updates
10:45 moritzm: upgrade debdeploy-server on cumin2002 to 0.0.99.14-1+deb12u1+exp2 (temporary build with Cumin 6 compat before we have Cumin 6 universally)
10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1019.eqiad.wmnet with reason: host reimage
10:29 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1019.eqiad.wmnet with reason: host reimage
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1019
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1019
10:13 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1019
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1019.eqiad.wmnet 92.32.64.10.in-addr.arpa 2.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:13 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1019.eqiad.wmnet 92.32.64.10.in-addr.arpa 2.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:13 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1019 - mvernon@cumin2002"
10:13 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1019 - mvernon@cumin2002"
10:08 mvernon@cumin2002: START - Cookbook sre.dns.netbox
10:08 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1019
10:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams - 3.2 upgrade (T421402)
10:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1019.eqiad.wmnet with OS bullseye
10:04 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams - 3.2 upgrade (T421402)
09:58 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:43 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: Pooling in
09:33 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1012,1014-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
09:25 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1012,1014-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
09:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1013.eqiad.wmnet with OS bullseye
09:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams - 3.2 upgrade (T421402)
09:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams - 3.2 upgrade (T421402)
09:15 elukey: remove /var/run/confd-template/_var_lib_gdnsd_discovery-k8s-ingress-aux-rw.state.err on affected dns servers and restart confd
09:12 elukey: remove /var/run/confd-template/_var_lib_gdnsd_discovery-k8s-ingress-aux-rw.state.err on dns5004 and restart confd
09:11 fabfur: upgrading esams to haproxy 3.2 (T421402)
09:10 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:07 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
09:06 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
09:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1013.eqiad.wmnet with reason: host reimage
08:59 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad - 3.2 upgrade (T421402)
08:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1013.eqiad.wmnet with reason: host reimage
08:57 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db2157: Pooling in
08:56 oblivian@puppetserver1001: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-aux-rw,name=codfw
08:51 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad - 3.2 upgrade (T421402)
08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1013
08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1013
08:41 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1013
08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1013.eqiad.wmnet 149.48.64.10.in-addr.arpa 9.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
08:41 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1013.eqiad.wmnet 149.48.64.10.in-addr.arpa 9.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:41 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1013 - mvernon@cumin2002"
08:41 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1013 - mvernon@cumin2002"
08:26 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90334 and previous config saved to /var/cache/conftool/dbconfig/20260409-082633-fceratto.json
08:26 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
08:23 elukey@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-cluster (exit_code=93) pool all services in codfw/aux-codfw: maintenance
08:21 mvernon@cumin2002: START - Cookbook sre.dns.netbox
08:21 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1013
08:21 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/aux-codfw: maintenance
08:20 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1013.eqiad.wmnet with OS bullseye
08:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad - 3.2 upgrade (T421402)
08:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad - 3.2 upgrade (T421402)
08:11 fabfur: upgrading eqiad to haproxy 3.2 (T421402)
07:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1016: After reimage
07:17 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
07:17 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
07:17 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1016: After reimage
07:16 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1016.eqiad.wmnet with OS trixie
07:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on pc2016.codfw.wmnet with reason: Maintenance
06:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1016.eqiad.wmnet with reason: host reimage
06:55 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1016.eqiad.wmnet with reason: host reimage
06:39 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1016.eqiad.wmnet with OS trixie
06:38 marostegui@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host pc1016.eqiad.wmnet with OS trixie
06:38 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1016.eqiad.wmnet with OS trixie
05:59 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2016.codfw.wmnet with OS trixie
05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2016.codfw.wmnet with reason: host reimage
05:31 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2016.codfw.wmnet with reason: host reimage
05:13 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2016.codfw.wmnet with OS trixie
05:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on pc2016.codfw.wmnet,pc1016.eqiad.wmnet with reason: Reimage to Debian Trixie
05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool pc1016: Reimage
05:09 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:09 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:09 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool pc1016: Reimage
02:31 kevinbazira@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
02:09 kevinbazira@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 11s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
00:57 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548) (duration: 07m 40s)
00:53 zabe@deploy1003: zabe: Continuing with sync
00:51 zabe@deploy1003: zabe: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:49 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file tables everywhere except enwiki and commons (T416548)
00:22 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1024.eqiad.wmnet
00:22 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1024.eqiad.wmnet

2026-04-08

22:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for Revert "Use envoy for swift inside mediawiki" (duration: 06m 54s)
22:00 ladsgroup@deploy1003: ladsgroup: Continuing with sync
21:59 ladsgroup@deploy1003: ladsgroup: Backport for Revert "Use envoy for swift inside mediawiki" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:57 ladsgroup@deploy1003: Started scap sync-world: Backport for Revert "Use envoy for swift inside mediawiki"
21:46 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:45 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:45 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:27 rzl@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
21:17 rzl@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
21:04 ladsgroup@deploy1003: Finished scap sync-world: Backport for Use envoy for swift inside mediawiki (T328872) (duration: 06m 27s)
21:00 ladsgroup@deploy1003: ladsgroup: Continuing with sync
21:00 ladsgroup@deploy1003: ladsgroup: Backport for Use envoy for swift inside mediawiki (T328872) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:58 ladsgroup@deploy1003: Started scap sync-world: Backport for Use envoy for swift inside mediawiki (T328872)
20:40 jdrewniak@deploy1003: Finished scap sync-world: Backport for Bumping portals to master (T128546) (duration: 06m 14s)
20:36 jdrewniak@deploy1003: jdrewniak: Continuing with sync
20:35 jdrewniak@deploy1003: jdrewniak: Backport for Bumping portals to master (T128546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:34 jdrewniak@deploy1003: Started scap sync-world: Backport for Bumping portals to master (T128546)
20:24 jdrewniak@deploy1003: jdrewniak: Continuing with sync
20:23 jdrewniak@deploy1003: jdrewniak: Backport for Bumping portals to master (T128546) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:21 jdrewniak@deploy1003: Started scap sync-world: Backport for Bumping portals to master (T128546)
20:17 toyofuku@deploy1003: Finished scap sync-world: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548) (duration: 09m 27s)
20:13 toyofuku@deploy1003: jdrewniak, toyofuku: Continuing with sync
20:09 toyofuku@deploy1003: jdrewniak, toyofuku: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:07 toyofuku@deploy1003: Started scap sync-world: Backport for Disable extension:WP25EasterEggs from Wikipedias. (T422548)
19:35 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
19:22 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
19:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1103.eqiad.wmnet with OS bullseye
19:09 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-mcrouter: apply
19:01 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1024.eqiad.wmnet with reason: Bootstrapping — T412830
18:57 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/mw-mcrouter: apply
18:56 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
18:55 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
18:54 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
18:49 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
18:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1103
18:33 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1103
18:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1103.eqiad.wmnet with OS bullseye
18:25 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1103.eqiad.wmnet with OS bullseye
18:08 dancy@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.23 refs T420481
18:04 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
18:00 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1103.eqiad.wmnet with reason: host reimage
17:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1088.eqiad.wmnet with OS bullseye
17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1103
17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1103
17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1103
17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1103.eqiad.wmnet 43.48.64.10.in-addr.arpa 3.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1103.eqiad.wmnet 43.48.64.10.in-addr.arpa 3.4.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1103 - bking@cumin2002"
17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1103 - bking@cumin2002"
17:39 bking@cumin2002: START - Cookbook sre.dns.netbox
17:37 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1103
17:36 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1103.eqiad.wmnet with OS bullseye
17:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cirrussearch1089.eqiad.wmnet with OS bullseye
17:35 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1089.eqiad.wmnet with OS bullseye
17:29 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1088.eqiad.wmnet with reason: host reimage
17:23 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1088.eqiad.wmnet with reason: host reimage
17:08 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2002.codfw.wmnet with OS trixie
17:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1088
17:07 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1088
17:06 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1088
17:06 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1088.eqiad.wmnet 176.32.64.10.in-addr.arpa 6.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:06 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1088.eqiad.wmnet 176.32.64.10.in-addr.arpa 6.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:06 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:06 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1088 - bking@cumin2002"
17:06 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1088 - bking@cumin2002"
17:02 bking@cumin2002: START - Cookbook sre.dns.netbox
17:01 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1088
17:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1088.eqiad.wmnet with OS bullseye
16:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw - 3.2 upgrade (T421402)
16:19 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw - 3.2 upgrade (T421402)
16:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1087.eqiad.wmnet with OS bullseye
16:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1081.eqiad.wmnet with OS bullseye
15:52 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1023.eqiad.wmnet
15:52 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1023.eqiad.wmnet
15:52 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS trixie
15:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage
15:43 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1087.eqiad.wmnet with reason: host reimage
15:42 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw - 3.2 upgrade (T421402)
15:42 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw - 3.2 upgrade (T421402)
15:41 fabfur: upgrading codfw to haproxy 3.2 (T421402)
15:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin - 3.2 upgrade (T421402)
15:35 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage
15:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin - 3.2 upgrade (T421402)
15:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephmon2004-dev.codfw.wmnet
15:28 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1087
15:28 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1087
15:27 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1087
15:27 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1087.eqiad.wmnet 174.32.64.10.in-addr.arpa 4.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:27 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1087.eqiad.wmnet 174.32.64.10.in-addr.arpa 4.7.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:27 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:27 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1087 - bking@cumin2002"
15:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1081.eqiad.wmnet with reason: host reimage
15:26 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1087 - bking@cumin2002"
15:26 andrew@cumin2002: START - Cookbook sre.dns.netbox
15:20 bking@cumin2002: START - Cookbook sre.dns.netbox
15:20 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1087
15:19 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1087.eqiad.wmnet with OS bullseye
15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:19 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:17 sukhe: sukhe@lvs1020:~$ sudo systemctl restart pybal.service
15:17 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:16 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudcephmon2004-dev.codfw.wmnet
15:16 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
15:11 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1081
15:11 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1081
15:10 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1081
15:10 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1081.eqiad.wmnet 166.32.64.10.in-addr.arpa 6.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:10 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1081.eqiad.wmnet 166.32.64.10.in-addr.arpa 6.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
15:10 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:10 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1081 - bking@cumin2002"
15:10 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1081 - bking@cumin2002"
15:06 bking@cumin2002: START - Cookbook sre.dns.netbox
15:05 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1081
15:05 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1081.eqiad.wmnet with OS bullseye
15:00 derick@deploy1003: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=zhwiki --logwiki=metawiki 'Mr Kazi Tuhin' KaziHasanTuhin # T422677
14:58 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
14:58 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:57 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
14:53 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:49 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
14:48 taavi: serve dumps rsync traffic via new LVS service T422040
14:47 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
14:42 taavi@dns1004: END - running authdns-update
14:41 taavi@dns1004: START - running authdns-update
14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host seaborgium.wikimedia.org
14:37 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin - 3.2 upgrade (T421402)
14:37 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin - 3.2 upgrade (T421402)
14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host seaborgium.wikimedia.org
14:32 fabfur: upgrading eqsin to haproxy 3.2 (T421402)
14:19 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:18 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:18 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:17 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:17 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:16 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:11 ecarg@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:11 ecarg@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:11 ecarg@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:10 ecarg@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:09 ecarg@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:08 ecarg@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:07 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:07 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
14:04 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
14:03 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
13:45 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
13:42 Lucas_WMDE: UTC afternoon backport+config window done
13:41 phuedx@deploy1003: Finished scap sync-world: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112) (duration: 07m 58s)
13:37 phuedx@deploy1003: phuedx: Continuing with sync
13:35 phuedx@deploy1003: phuedx: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:33 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=eqiad
13:33 phuedx@deploy1003: Started scap sync-world: Backport for PHP SDK: Measure known experiments correctly (T422112), PHP SDK: Measure known experiments correctly (T422112)
13:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
13:28 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for cswiki: lift IP cap for workshop (T422520) (duration: 06m 22s)
13:26 moritzm: upgrade debdeploy-server on cumin2002 to 0.0.99.14-1+deb12u1+exp1 (temporary build with Cumin 6 compat before we have Cumin 6 universally)
13:25 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Continuing with sync
13:24 lucaswerkmeister-wmde@deploy1003: anzx, lucaswerkmeister-wmde: Backport for cswiki: lift IP cap for workshop (T422520) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:22 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for cswiki: lift IP cap for workshop (T422520)
13:15 cscott@deploy1003: Finished scap sync-world: Backport for Turn on Parsoid Read Views for eswiki (T422524) (duration: 07m 06s)
13:11 cscott@deploy1003: cscott: Continuing with sync
13:10 cscott@deploy1003: cscott: Backport for Turn on Parsoid Read Views for eswiki (T422524) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
13:08 cscott@deploy1003: Started scap sync-world: Backport for Turn on Parsoid Read Views for eswiki (T422524)
13:04 taavi: restarting pybal on lvs1018
12:49 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:43 taavi: restarting pybal on lvs1020
12:40 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
12:32 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:32 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
12:32 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
12:32 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
12:31 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
12:29 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
12:28 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
12:28 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
12:27 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp2001.codfw.wmnet
12:27 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
12:27 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp2001.codfw.wmnet
12:15 mszwarc@deploy1003: mwscript-k8s job started: foreachwikiindblist all backfillInterwikiRightsLog.php --remote-wiki metawiki 20260311190000 # T6055
12:13 jiji@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM wikikube-worker-exp1001.eqiad.wmnet
12:07 jiji@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM wikikube-worker-exp1001.eqiad.wmnet
11:53 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:44 kart_: machinetranslation: Remove networkpolicies for people* (T335491)
11:43 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
11:43 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
11:42 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
11:42 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
11:42 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-drmrs - 3.2.15 upgrade (T421402)
11:41 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
11:41 kartik@deploy1003: helmfile [staging] START helmfile.d/services/machinetranslation: apply
11:38 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-magru - 3.2.15 upgrade (T421402)
11:35 gkyziridis@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
11:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-ulsfo - 3.2.15 upgrade (T421402)
11:15 moritzm: installing dpkg security updates
11:11 moritzm: installing Tomcat security updates
11:11 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:10 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:01 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P{ms-fe[1009-1010,1012-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
10:52 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P{ms-fe[1009-1010,1012-1024].eqiad.wmnet} and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
10:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1011.eqiad.wmnet with OS bullseye
10:48 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
10:42 blake@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
10:42 blake@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-web: apply
10:42 blake@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
10:41 blake@deploy1003: helmfile [codfw] START helmfile.d/services/mw-web: apply
10:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1011.eqiad.wmnet with reason: host reimage
10:25 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe1011.eqiad.wmnet with reason: host reimage
10:15 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:14 hnowlan@deploy1003: Finished deploy [restbase/deploy@dcc15be]: Add urwikisource T415975 (duration: 01m 31s)
10:12 hnowlan@deploy1003: Started deploy [restbase/deploy@dcc15be]: Add urwikisource T415975
10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
10:11 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-fe1011
10:11 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe1011
10:08 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe1011
10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-fe1011.eqiad.wmnet 182.32.64.10.in-addr.arpa 2.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:08 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-fe1011.eqiad.wmnet 182.32.64.10.in-addr.arpa 2.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:08 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1011 - mvernon@cumin2002"
10:08 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-fe1011 - mvernon@cumin2002"
10:03 mvernon@cumin2002: START - Cookbook sre.dns.netbox
10:02 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-fe1011
10:02 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-fe1011.eqiad.wmnet with OS bullseye
09:58 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-ulsfo - 3.2.15 upgrade (T421402)
09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-drmrs - 3.2.15 upgrade (T421402)
09:55 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-magru - 3.2.15 upgrade (T421402)
09:54 fabfur: upgrading haproxy to version 3.2.15 on magru,drmrs,ulsfo (T421402)
09:41 ayounsi@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
09:00 taavi: remove unused cloud-vrf clouddumps cr firewall rule https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1268516
08:53 taavi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddumps1001.wikimedia.org
08:53 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
08:52 ayounsi@dns1004: END - running authdns-update
08:51 ayounsi@dns1004: START - running authdns-update
08:47 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/aux-codfw: maintenance
08:47 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/aux-codfw: maintenance
08:41 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.wipe-cluster (exit_code=0) Wipe the K8s cluster aux-codfw: Kubernetes upgrade
08:40 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: sync
08:39 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: sync
08:33 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/sophroid: sync
08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/sophroid: sync
08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/redioscope: sync
08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/redioscope: sync
08:32 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: sync
08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/jaeger: sync
08:31 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/jaeger: sync
08:24 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
08:22 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
08:20 elukey@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'sync'.
08:19 elukey@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'sync'.
08:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
08:04 elukey@cumin1003: START - Cookbook sre.k8s.wipe-cluster Wipe the K8s cluster aux-codfw: Kubernetes upgrade
08:03 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
08:02 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
07:48 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
07:48 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
07:46 krinkle@deploy1003: Finished scap sync-world: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338) (duration: 09m 34s)
07:41 krinkle@deploy1003: krinkle: Continuing with sync
07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin routed ganeti IPs - ayounsi@cumin1003"
07:40 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: eqsin routed ganeti IPs - ayounsi@cumin1003"
07:38 krinkle@deploy1003: krinkle: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:36 krinkle@deploy1003: Started scap sync-world: Backport for Enable wgTrackMediaRequestProvenance on most group1 wikis (T414338)
07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
07:33 wmde-fisch@deploy1003: Finished scap sync-world: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770) (duration: 06m 54s)
07:29 wmde-fisch@deploy1003: wmde-fisch, anzx: Continuing with sync
07:28 elukey@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/aux-codfw: maintenance
07:28 elukey@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/aux-codfw: maintenance
07:28 wmde-fisch@deploy1003: wmde-fisch, anzx: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:26 wmde-fisch@deploy1003: Started scap sync-world: Backport for wikimaniawiki: add editsemiprotected userright to extendedconfirmed usergroup (T421770)
07:19 moritzm: installing openssl security updates
07:15 wmde-fisch@deploy1003: Finished scap sync-world: Backport for Enable sub-references on Czech and Italian wiki (T420938) (duration: 08m 44s)
07:11 wmde-fisch@deploy1003: wmde-fisch: Continuing with sync
07:08 wmde-fisch@deploy1003: wmde-fisch: Backport for Enable sub-references on Czech and Italian wiki (T420938) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:06 wmde-fisch@deploy1003: Started scap sync-world: Backport for Enable sub-references on Czech and Italian wiki (T420938)
05:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1152: After reimage
05:58 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:57 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:57 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1152: After reimage
05:50 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1152.eqiad.wmnet with OS trixie
05:33 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1152.eqiad.wmnet with reason: host reimage
05:29 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1152.eqiad.wmnet with reason: host reimage
05:15 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db1152.eqiad.wmnet with OS trixie
05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Reimage
05:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:15 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Reimage
05:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Maintenance
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-07

22:01 cscott@deploy1003: Finished scap sync-world: Backport for Actually enable parsoid postproc for all wikis (except enwiki) (duration: 08m 05s)
21:57 cscott@deploy1003: cscott, ihurbain: Continuing with sync
21:55 cscott@deploy1003: cscott, ihurbain: Backport for Actually enable parsoid postproc for all wikis (except enwiki) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:53 cscott@deploy1003: Started scap sync-world: Backport for Actually enable parsoid postproc for all wikis (except enwiki)
21:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1083.eqiad.wmnet with OS bullseye
21:50 cscott@deploy1003: Finished scap sync-world: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183) (duration: 07m 40s)
21:46 cscott@deploy1003: ihurbain, cscott: Continuing with sync
21:45 cscott@deploy1003: ihurbain, cscott: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:43 cscott@deploy1003: Started scap sync-world: Backport for ParserMigration: transition to new configuration variables (T422543), Enable legacy post-processing cache for DiscussionTools (T376183)
21:39 cscott@deploy1003: Finished scap sync-world: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[gerrit:1268
21:35 cscott@deploy1003: matmarex, sfaci, cscott, kgraessle: Continuing with sync
21:33 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1083.eqiad.wmnet with reason: host reimage
21:33 cscott@deploy1003: matmarex, sfaci, cscott, kgraessle: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[g
21:31 cscott@deploy1003: Started scap sync-world: Backport for PHP SDK: Handle experiment config missing or malformed (T422112), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), Bump wikimedia/parsoid to 0.23.0-a26 (T422394), [[gerrit:12686
21:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1083.eqiad.wmnet with reason: host reimage
21:17 cscott@deploy1003: Finished scap sync-world: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config missing or
21:13 cscott@deploy1003: cscott, kgraessle, sfaci, matmarex: Continuing with sync
21:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1083
21:12 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1083
21:11 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1083
21:11 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1083.eqiad.wmnet 168.32.64.10.in-addr.arpa 8.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
21:11 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1083.eqiad.wmnet 168.32.64.10.in-addr.arpa 8.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
21:11 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:11 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1083 - bking@cumin2002"
21:11 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1083 - bking@cumin2002"
21:10 cscott@deploy1003: cscott, kgraessle, sfaci, matmarex: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config
21:09 cscott@deploy1003: Started scap sync-world: Backport for Ensure RevisionOutputCache uses post-processing options where appropriate (T421629), Remove Navigation Menu Link Instrumentation on Personal Dashboard (T422512), ForeignWikiRequest: Pass session to internal 'centralauthtoken' request (T422218), [[gerrit:1268651|PHP SDK: Handle experiment config missing or
21:08 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:05 ryankemper: [WDQS] codfw is getting slammed hard enough that hosts are falling immediately back into deadlock post-restart and largely failing to report metrics. not much we can do atm, there will be some noise
21:02 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
21:01 bking@cumin2002: START - Cookbook sre.dns.netbox
21:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1083
21:00 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1083.eqiad.wmnet with OS bullseye
20:57 cscott@deploy1003: Finished scap sync-world: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294) (duration: 13m 27s)
20:51 cscott@deploy1003: cscott, pppery, kineticpelagic: Continuing with sync
20:48 cscott@deploy1003: cscott, pppery, kineticpelagic: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:47 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:44 cscott@deploy1003: Started scap sync-world: Backport for Bump jawiki to 100% Parsoid Read Views (from 10%) (T420273), REST: Publish ReadingLists v0 module in REST Sandbox (T419619), Move createwithcontentmodel to autoconfirmed (T248294)
20:42 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:40 reedy@deploy1003: Finished scap sync-world: Backport for Undeploy Extension:StopForumSpam (T422185) (duration: 31m 17s)
20:40 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
20:33 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1023.eqiad.wmnet with reason: Bootstrapping — T412830
20:30 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:30 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs hosts - eevans@cumin1003"
20:30 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs hosts - eevans@cumin1003"
20:28 reedy@deploy1003: reedy: Continuing with sync
20:28 reedy@deploy1003: reedy: Backport for Undeploy Extension:StopForumSpam (T422185) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:26 eevans@cumin1003: START - Cookbook sre.dns.netbox
20:19 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:19 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1024 - eevans@cumin1003"
20:19 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1024 - eevans@cumin1003"
20:14 eevans@cumin1003: START - Cookbook sre.dns.netbox
20:09 reedy@deploy1003: Started scap sync-world: Backport for Undeploy Extension:StopForumSpam (T422185)
20:07 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aqs1023.eqiad.wmnet
20:02 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1082.eqiad.wmnet with OS bullseye
20:02 eevans@cumin1003: START - Cookbook sre.hosts.reboot-single for host aqs1023.eqiad.wmnet
19:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1082.eqiad.wmnet with reason: host reimage
19:38 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1082.eqiad.wmnet with reason: host reimage
19:32 eevans@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:32 eevans@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1023 - eevans@cumin1003"
19:32 eevans@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Additional IPs for aqs1023 - eevans@cumin1003"
19:27 eevans@cumin1003: START - Cookbook sre.dns.netbox
19:26 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch1080.eqiad.wmnet with OS bullseye
19:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1082
19:22 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1082
19:17 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1082
19:17 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1082.eqiad.wmnet 167.32.64.10.in-addr.arpa 7.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
19:17 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1082.eqiad.wmnet 167.32.64.10.in-addr.arpa 7.6.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
19:16 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:16 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1082 - bking@cumin2002"
19:16 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1082 - bking@cumin2002"
19:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage
19:08 bking@cumin2002: START - Cookbook sre.dns.netbox
19:08 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1082
19:07 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1082.eqiad.wmnet with OS bullseye
19:04 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch1080.eqiad.wmnet with reason: host reimage
18:49 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch1080
18:49 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch1080
18:47 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch1080
18:47 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cirrussearch1080.eqiad.wmnet 29.32.64.10.in-addr.arpa 9.2.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
18:47 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cirrussearch1080.eqiad.wmnet 29.32.64.10.in-addr.arpa 9.2.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
18:47 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:47 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1080 - bking@cumin2002"
18:47 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cirrussearch1080 - bking@cumin2002"
18:44 bking@cumin2002: START - Cookbook sre.dns.netbox
18:44 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cirrussearch1080
18:43 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cirrussearch1080.eqiad.wmnet with OS bullseye
18:24 dancy@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.23 refs T420481
18:13 dreamyjazz@deploy1003: Finished scap sync-world: Backport for ClientHints: Don't collect header only on null edit (T418989) (duration: 12m 14s)
18:07 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
18:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:05 dreamyjazz@deploy1003: dreamyjazz: Backport for ClientHints: Don't collect header only on null edit (T418989) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
18:01 dreamyjazz@deploy1003: Started scap sync-world: Backport for ClientHints: Don't collect header only on null edit (T418989)
16:41 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
16:39 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
16:31 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:22 Lucas_WMDE: UTC afternoon backport+config window (belatedly) done
16:20 dreamyjazz@deploy1003: Finished scap sync-world: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), [[gerrit:1268585|GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T4222
16:07 dreamyjazz@deploy1003: stran, dreamyjazz: Continuing with sync
16:03 dreamyjazz@deploy1003: stran, dreamyjazz: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T422220),
15:56 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:45 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=1) Renumbering for host wikikube-worker1273.eqiad.wmnet
15:45 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1273.eqiad.wmnet
15:45 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1273.eqiad.wmnet
15:44 dreamyjazz@deploy1003: Started scap sync-world: Backport for Set $wgGlobalBlockingWikisWhereGlobalBlocksDoNotApply (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), GlobalBlockLocalStatusLookup: Remove unused constructor param (T422220), [[gerrit:1268585|GlobalBlockLocalStatusLookup: Support wikis that don't apply blocks (T42222
15:44 sukhe@dns1004: END - running authdns-update
15:42 sukhe@dns1004: START - running authdns-update
15:31 aikochou@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
15:31 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: cluster=dnsbox,dc=esams [reason: esams maintenance over]
15:30 moritzm: installing postgresql-15 security updates
15:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool esams [reason: network maintenance over, T416450]
15:28 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool esams [reason: network maintenance over, T416450]
15:25 claime: homer lsw1-d1-eqiad* commit
15:24 claime: homer cr*eqiad* commit
15:24 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1273.eqiad.wmnet with OS bookworm
15:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:20 Emperor: restart swift object/container replication services on ms-be1069
15:20 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:14 XioNoX: cr1-esams - re-enabling external peers
15:12 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:10 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
15:04 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
14:57 cgoubert@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1273.eqiad.wmnet with reason: host reimage
14:57 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
14:55 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1347.eqiad.wmnet
14:54 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1347.eqiad.wmnet
14:36 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host testvm2006.codfw.wmnet with OS trixie
14:34 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1273
14:34 cgoubert@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1273
14:30 XioNoX: re0.cr1-esams> request chassis routing-engine master switch
14:30 cgoubert@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1273
14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1273.eqiad.wmnet 128.48.64.10.in-addr.arpa 8.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:30 cgoubert@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1273.eqiad.wmnet 128.48.64.10.in-addr.arpa 8.2.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:30 cgoubert@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1273 - cgoubert@cumin1003"
14:30 cgoubert@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1273 - cgoubert@cumin1003"
14:25 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
14:25 cgoubert@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1273
14:24 cgoubert@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1273.eqiad.wmnet with OS bookworm
14:24 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1273.eqiad.wmnet
14:23 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1273.eqiad.wmnet
14:23 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
14:16 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
14:03 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1273.eqiad.wmnet
14:03 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
14:01 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.renumber-node (exit_code=99) Renumbering for host wikikube-worker1273.eqiad.wmnet
14:01 cgoubert@cumin1003: START - Cookbook sre.k8s.renumber-node Renumbering for host wikikube-worker1273.eqiad.wmnet
13:58 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
13:58 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore2*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
13:58 jmm@dns1004: END - running authdns-update
13:57 jmm@dns1004: START - running authdns-update
13:56 jmm@dns1004: END - running authdns-update
13:54 jmm@dns1004: START - running authdns-update
13:53 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
13:53 jmm@dns1004: END - running authdns-update
13:51 jmm@dns1004: START - running authdns-update
13:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS trixie
13:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:41 volans: installed cumin v6.0.0 on cumin2002
13:40 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2*: Upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
13:39 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1056.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:31 XioNoX: reboot cr1-esams
13:30 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti5007.eqsin.wmnet with OS bookworm
13:19 taavi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddumps1002.wikimedia.org
13:19 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddumps1001.wikimedia.org
13:19 taavi@cumin1003: conftool action : set/weight=100; selector: cluster=dumps
13:11 XioNoX: re0.cr1-esams> request chassis routing-engine master switch
13:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti5007.eqsin.wmnet with reason: host reimage
12:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti5007.eqsin.wmnet with reason: host reimage
12:39 XioNoX: re1.cr1-esams> request chassis routing-engine master switch - that will cause router's short unavailability - T416450
12:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti5007.eqsin.wmnet with OS bookworm
12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti5007.eqsin.wmnet
12:15 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti5007.eqsin.wmnet
12:11 XioNoX: re0.cr1-esams> request chassis routing-engine master switch - that will cause router's short unavailability - T416450
12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti5007.eqsin.wmnet
12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
12:04 XioNoX: reboot re1.cr1-esams (backup RE) for upgrade - T416450
12:03 ayounsi@cumin1003: conftool action : set/pooled=no; selector: cluster=dnsbox,dc=esams [reason: esams network maintenance]
12:01 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr1-esams,cr1-esams IPv6,re0.cr1-esams.mgmt with reason: router upgrade
11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
11:36 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool esams [reason: network maintenance, T416450]
11:36 XioNoX: depool esams for network maintenance - T416450
11:36 ayounsi@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool esams [reason: network maintenance, T416450]
11:31 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti5007.eqsin.wmnet
10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
10:11 topranks: shift inter-site traffic from existing 10G to new 100G transport circuit between eqiad<->codfw T395878
08:52 Amir1: tightening the rate limit for non-standard thumbnails (T402792 T414805)
08:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
08:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
08:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti5007.eqsin.wmnet
08:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
08:25 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 42
08:22 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 42
08:18 XioNoX: update pfw1-eqiad NAT - T422380
08:05 hashar: Moved Debian Glue jobs to Jenkins agents running Bookworm (integration-agent-pkgbuilder-1005 and integration-agent-pkgbuilder-1006) | T421114
08:00 marostegui: Upgrade clouddb1017 to mariadb 10.11.16 (v3) T420177
07:59 XioNoX: push pfw policies - T422204
07:59 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Maintenance
07:54 hashar: Moved `operations-puppet-tests-bullseye` job from a Jenkins agent running Bullseye to one running Bookworm. The image is still on Bullseye! | T421114
07:44 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2157: after upgrade
07:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1159: after upgrade
06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2142: Upgrade package
06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
06:32 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:32 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2142: Upgrade package
06:32 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2248: Upgrade package
06:29 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2157: after upgrade
06:28 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1159: after upgrade
06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2248: Upgrade package
06:16 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2249: Upgrade package
06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool pc1011: after upgrade
06:02 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
06:02 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
06:02 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool pc1011: after upgrade
06:01 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db2249: Upgrade package
06:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1169: Upgrade package
05:45 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1169: Upgrade package
05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: Upgrade package
05:37 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: Upgrade package
05:37 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2142: Upgrade package
05:36 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:36 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
05:36 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2142: Upgrade package
05:35 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2248: Upgrade package
05:35 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2248: Upgrade package
05:33 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2248: Upgrade package
05:33 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2248: Upgrade package
05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249.codfw.wmnet: Upgrade package
05:31 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249.codfw.wmnet: Upgrade package
05:31 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249: Upgrade package
05:31 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249: Upgrade package
05:30 marostegui@cumin1003: END (FAIL) - Cookbook sre.mysql.depool (exit_code=99) depool db2249: Upgrade package
05:30 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db2249: Upgrade package
05:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2142,2248-2249].codfw.wmnet,db1169.eqiad.wmnet with reason: Upgrade to 10.11.16.v3
05:29 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2142,2248-2249].codfw.wmnet with reason: Upgrade to 10.11.16.v3
05:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2011.codfw.wmnet,pc1011.eqiad.wmnet with reason: Upgrade to 10.11.16.v3
05:20 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
05:20 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
04:02 mwpresync@deploy1003: Pruned MediaWiki: 1.46.0-wmf.20 (duration: 02m 27s)
03:39 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.46.0-wmf.23 refs T420481 (duration: 35m 55s)
03:03 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.46.0-wmf.23 refs T420481
00:10 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from the new file tables on more large wikis (T416548) (duration: 06m 22s)
00:05 zabe@deploy1003: zabe: Continuing with sync
00:05 zabe@deploy1003: zabe: Backport for Start reading from the new file tables on more large wikis (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:03 zabe@deploy1003: Started scap sync-world: Backport for Start reading from the new file tables on more large wikis (T416548)
00:02 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner1003.eqiad.wmnet with OS bookworm

2026-04-06

23:43 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage
23:40 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner1003.eqiad.wmnet with reason: host reimage
23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner1003
23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner1003
23:25 mutante: gitlab: reimaging trusted runners with --move-vlan parameter, which changed their IPs - verified they were showing up as online after the change and using the new IPs (T421717)
23:25 dzahn@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner1003
23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner1003.eqiad.wmnet 184.32.64.10.in-addr.arpa 4.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
23:25 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache gitlab-runner1003.eqiad.wmnet 184.32.64.10.in-addr.arpa 4.8.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:25 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1003 - dzahn@cumin2002"
23:24 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1003 - dzahn@cumin2002"
23:18 dzahn@cumin2002: START - Cookbook sre.dns.netbox
23:12 dzahn@cumin2002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner1003
23:12 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab-runner1003.eqiad.wmnet with OS bookworm
22:56 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
22:18 sbassett@deploy1003: Finished scap sync-world: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320) (duration: 06m 18s)
22:14 sbassett@deploy1003: sbassett: Continuing with sync
22:13 sbassett@deploy1003: sbassett: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:12 sbassett@deploy1003: Started scap sync-world: Backport for Check if $res->message is null within ApiAuthManagerHelper (T422320)
21:26 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
21:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:aqs-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
21:15 dancy@deploy1003: Installation of scap version "4.244.0" completed for 2 hosts
21:13 dancy@deploy1003: Installing scap version "4.244.0" for 2 host(s)
21:06 urbanecm: Unlocking mw-experimental@eqiad
21:00 urbanecm: Locking mw-experimental@eqiad
20:55 urbanecm@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
20:54 urbanecm@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
20:54 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
20:53 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
20:50 urbanecm@deploy1003: Finished scap sync-world: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297) (duration: 06m 30s)
20:46 urbanecm@deploy1003: urbanecm: Continuing with sync
20:45 urbanecm@deploy1003: urbanecm: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:44 urbanecm@deploy1003: Started scap sync-world: Backport for Respect the echo-read-notifications right in user interface (T420154), Grant new 'echo-read-notifications' right to all users (T422297)
20:28 kemayo@deploy1003: Finished scap sync-world: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123) (duration: 07m 07s)
20:24 kemayo@deploy1003: kemayo: Continuing with sync
20:23 kemayo@deploy1003: kemayo: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:21 kemayo@deploy1003: Started scap sync-world: Backport for VisualEditorSuggestionFeedback: undo the addition of Talk to the URL (T420123)
20:18 kemayo@deploy1003: Finished scap sync-world: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275) (duration: 10m 56s)
20:11 kemayo@deploy1003: kemayo, aude: Continuing with sync
20:08 kemayo@deploy1003: kemayo, aude: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:07 kemayo@deploy1003: Started scap sync-world: Backport for VisualEditor editcheck suggestion feedback is always consolidated (T420123), Set $wgReadingListsEnableBetaQuickSurvey to true for beta cluster (T422275)
20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:aqs-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
19:52 ryankemper: [wdqs] Restarted `wmf_auto_restart_prometheus-blazegraph-exporter-wdqs-blazegraph.service` on `wdqs1012` to clear systemdunitfailed alert
19:32 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5004.eqsin.wmnet} and A:liberica
19:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner1004.eqiad.wmnet with OS bookworm
19:28 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5004.eqsin.wmnet} and A:liberica
19:20 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
19:19 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
19:17 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
19:16 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
19:10 bking@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
19:09 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner1004.eqiad.wmnet with reason: host reimage
19:09 bking@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
19:05 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab-runner1004.eqiad.wmnet with reason: host reimage
18:58 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5005.eqsin.wmnet} and A:liberica
18:55 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5005.eqsin.wmnet} and A:liberica
18:50 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host gitlab-runner1004
18:50 dzahn@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host gitlab-runner1004
18:49 dzahn@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host gitlab-runner1004
18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) gitlab-runner1004.eqiad.wmnet 141.48.64.10.in-addr.arpa 1.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
18:49 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache gitlab-runner1004.eqiad.wmnet 141.48.64.10.in-addr.arpa 1.4.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:49 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1004 - dzahn@cumin2002"
18:49 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host gitlab-runner1004 - dzahn@cumin2002"
18:45 dzahn@cumin2002: START - Cookbook sre.dns.netbox
18:40 dzahn@cumin2002: START - Cookbook sre.hosts.move-vlan for host gitlab-runner1004
18:40 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab-runner1004.eqiad.wmnet with OS bookworm
18:39 mutante: gitlab-runner1004 - reimaging with --move-vlan T421717
18:37 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
18:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
18:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1240.eqiad.wmnet with reason: Maintenance
18:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90287 and previous config saved to /var/cache/conftool/dbconfig/20260406-180118-fceratto.json
17:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P90286 and previous config saved to /var/cache/conftool/dbconfig/20260406-175111-fceratto.json
17:42 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp7009.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P90285 and previous config saved to /var/cache/conftool/dbconfig/20260406-174104-fceratto.json
17:37 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp7009.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:34 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on P{cp7001.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:30 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90284 and previous config saved to /var/cache/conftool/dbconfig/20260406-173056-fceratto.json
17:29 brett@cumin2002: START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on P{cp7001.magru.wmnet} and A:cp - 9.2.13 Upgrade ()
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1223 (T419635)', diff saved to https://phabricator.wikimedia.org/P90283 and previous config saved to /var/cache/conftool/dbconfig/20260406-172055-fceratto.json
17:20 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1223.eqiad.wmnet with reason: Maintenance
17:20 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90282 and previous config saved to /var/cache/conftool/dbconfig/20260406-172030-fceratto.json
17:16 brett: import trafficserver 9.2.13-1wm1 into trixie-wikimedia - T422328
17:10 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P90281 and previous config saved to /var/cache/conftool/dbconfig/20260406-171021-fceratto.json
17:00 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P90280 and previous config saved to /var/cache/conftool/dbconfig/20260406-170013-fceratto.json
16:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90279 and previous config saved to /var/cache/conftool/dbconfig/20260406-165005-fceratto.json
16:43 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1212 (T419635)', diff saved to https://phabricator.wikimedia.org/P90278 and previous config saved to /var/cache/conftool/dbconfig/20260406-164323-fceratto.json
16:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Maintenance
16:43 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1212.eqiad.wmnet with reason: Maintenance
16:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90277 and previous config saved to /var/cache/conftool/dbconfig/20260406-164257-fceratto.json
16:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P90276 and previous config saved to /var/cache/conftool/dbconfig/20260406-163249-fceratto.json
16:32 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
16:31 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
16:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P90275 and previous config saved to /var/cache/conftool/dbconfig/20260406-162241-fceratto.json
16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90274 and previous config saved to /var/cache/conftool/dbconfig/20260406-161232-fceratto.json
16:06 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1198 (T419635)', diff saved to https://phabricator.wikimedia.org/P90273 and previous config saved to /var/cache/conftool/dbconfig/20260406-160615-fceratto.json
16:06 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
16:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90272 and previous config saved to /var/cache/conftool/dbconfig/20260406-160551-fceratto.json
15:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P90271 and previous config saved to /var/cache/conftool/dbconfig/20260406-155542-fceratto.json
15:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P90270 and previous config saved to /var/cache/conftool/dbconfig/20260406-154534-fceratto.json
15:35 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90269 and previous config saved to /var/cache/conftool/dbconfig/20260406-153526-fceratto.json
15:29 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1175 (T419635)', diff saved to https://phabricator.wikimedia.org/P90268 and previous config saved to /var/cache/conftool/dbconfig/20260406-152908-fceratto.json
15:29 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
15:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90267 and previous config saved to /var/cache/conftool/dbconfig/20260406-152409-fceratto.json
15:14 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P90266 and previous config saved to /var/cache/conftool/dbconfig/20260406-151401-fceratto.json
15:03 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P90265 and previous config saved to /var/cache/conftool/dbconfig/20260406-150353-fceratto.json
14:53 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90264 and previous config saved to /var/cache/conftool/dbconfig/20260406-145344-fceratto.json
14:53 taavi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:53 taavi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate lvs vip for dumps-lb.eqiad - taavi@cumin1003"
14:53 taavi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: allocate lvs vip for dumps-lb.eqiad - taavi@cumin1003"
14:49 taavi@cumin1003: START - Cookbook sre.dns.netbox
14:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1157 (T419635)', diff saved to https://phabricator.wikimedia.org/P90263 and previous config saved to /var/cache/conftool/dbconfig/20260406-144734-fceratto.json
14:47 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
14:40 vgutierrez@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P{cp[6001,6009].*} and A:cp - 3.2.15 upgrade (T421402)
14:28 vgutierrez@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P{cp[6001,6009].*} and A:cp - 3.2.15 upgrade (T421402)
14:27 vgutierrez: fetch haproxy 3.2.15 on thirdparty/haproxy32 (trixie-wikimedia) - T421402
14:26 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Actually upgrade Cassandra to 4.1.11 — T418417 - eevans@cumin1003
14:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Maintenance
13:06 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
12:13 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
12:13 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
12:11 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc2015.codfw.wmnet with OS trixie
12:06 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1015.eqiad.wmnet with OS trixie
11:53 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
11:49 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
11:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2015.codfw.wmnet with reason: host reimage
11:43 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on pc1015.eqiad.wmnet with reason: host reimage
11:29 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc2015.codfw.wmnet with OS trixie
11:28 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host pc1015.eqiad.wmnet with OS trixie
11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet,pc1015.eqiad.wmnet with reason: Maintenance
11:26 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on pc2015.codfw.wmnet with reason: Maintenance
11:24 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
11:24 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
11:14 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
10:15 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
10:15 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
10:12 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2142.codfw.wmnet with OS trixie
09:54 marostegui@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
09:48 marostegui@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2142.codfw.wmnet with reason: host reimage
09:30 marostegui@cumin1003: START - Cookbook sre.hosts.reimage for host db2142.codfw.wmnet with OS trixie
09:27 urbanecm@deploy1003: Finished scap sync-world: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154) (duration: 31m 47s)
09:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2142.codfw.wmnet,db1152.eqiad.wmnet with reason: Upgrade
09:15 urbanecm@deploy1003: urbanecm: Continuing with sync
09:15 urbanecm@deploy1003: urbanecm: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154) synced to the testservers (see https://wikitech.wikimedia
09:10 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:10 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1152: Upgrade
09:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.parsercache (exit_code=0)
09:01 marostegui@cumin1003: START - Cookbook sre.mysql.parsercache
09:01 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1152: Upgrade
09:00 marostegui@cumin1003: dbctl commit (dc=all): 'Pool db1169', diff saved to https://phabricator.wikimedia.org/P90257 and previous config saved to /var/cache/conftool/dbconfig/20260406-090040-marostegui.json
09:00 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1169: test
08:59 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1169: test
08:55 urbanecm@deploy1003: Started scap sync-world: Backport for SECURITY: Protect ApiEchoNotifications with a new user right (T420154), [i18n] Correct the action message (T420154), refactor: Use a trait to check for reading permissions (T420154), Create a new grant for the echo-read-notifications (T420154)
08:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Maintenance
08:46 urbanecm@deploy1003: Finished scap sync-world: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599) (duration: 10m 50s)
08:40 urbanecm@deploy1003: urbanecm: Continuing with sync
08:37 urbanecm@deploy1003: urbanecm: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
08:35 urbanecm@deploy1003: Started scap sync-world: Backport for [Growth] Decrease user impact limits back to the defaults (T422288 T341599)
08:15 kgraessle@deploy1003: Finished scap sync-world: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415) (duration: 31m 54s)
08:02 kgraessle@deploy1003: kgraessle: Continuing with sync
07:59 kgraessle@deploy1003: kgraessle: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:43 kgraessle@deploy1003: Started scap sync-world: Backport for Set live configuration for Extension:PersonalDashboard on English Wikipedia (T421415)
05:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Patch clouddb1022
02:06 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 15s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-05

02:01 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 00m 52s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-04

18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:10 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 12s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2026-04-03

23:49 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul2002.codfw.wmnet with reason: T421398
23:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: T421398
20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:13 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
18:56 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon1003.eqiad.wmnet with OS trixie
18:39 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:38 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:32 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
18:19 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon1003.eqiad.wmnet with OS trixie
18:16 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafkamon2003.codfw.wmnet with OS trixie
16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2009
15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2009
15:32 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2008
15:32 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2008
15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
15:31 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:31 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2008 to codfw - jhancock@cumin2002"
15:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
15:04 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
14:58 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon2003.codfw.wmnet with reason: host reimage
14:52 sbassett: Deployed security mitigation for T422244
14:40 herron@cumin1003: START - Cookbook sre.hosts.reimage for host kafkamon2003.codfw.wmnet with OS trixie
13:43 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:39 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:38 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:38 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:34 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:27 vriley@cumin1003: START - Cookbook sre.hosts.provision for host ganeti1058.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
13:26 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti1058
13:26 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ganeti1058
13:25 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:22 vriley@cumin1003: START - Cookbook sre.dns.netbox
13:08 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:08 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
13:08 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt [ganeti1055] - vriley@cumin1003"
13:03 vriley@cumin1003: START - Cookbook sre.dns.netbox
12:22 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:22 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
12:22 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
12:16 jclark@cumin1003: START - Cookbook sre.dns.netbox
10:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS trixie
10:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
10:32 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
10:30 urbanecm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
10:29 urbanecm@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
10:19 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1010.eqiad.wmnet with OS trixie
10:17 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS trixie
09:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
09:54 brouberol@dns1004: END - running authdns-update
09:52 brouberol@dns1004: START - running authdns-update
09:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
09:36 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1009.eqiad.wmnet with OS trixie
09:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS trixie
09:14 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
09:07 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
08:51 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1008.eqiad.wmnet with OS trixie
08:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS trixie
08:29 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
08:22 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
08:06 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1007.eqiad.wmnet with OS trixie
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 21s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
00:59 zabe: zabe@deploy1003:~$ mwscript updateSpecialPages.php testcommonswiki # T422062
00:58 zabe@deploy1003: Finished scap sync-world: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062) (duration: 06m 50s)
00:53 zabe@deploy1003: zabe: Continuing with sync
00:53 zabe@deploy1003: zabe: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:51 zabe@deploy1003: Started scap sync-world: Backport for Disable QueryPage updates for Special:Unusedtemplates on testcommonswiki (T422062)

2026-04-02

23:52 zabe@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
23:51 zabe@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
23:41 zabe@deploy1003: Finished scap sync-world: Backport for Start reading from new file table in dewiki and fawiki (T416548) (duration: 06m 10s)
23:37 zabe@deploy1003: zabe: Continuing with sync
23:37 zabe@deploy1003: zabe: Backport for Start reading from new file table in dewiki and fawiki (T416548) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
23:35 zabe@deploy1003: Started scap sync-world: Backport for Start reading from new file table in dewiki and fawiki (T416548)
23:06 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
23:06 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
22:20 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
22:20 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
22:05 jdlrobson@deploy1003: Finished scap sync-world: Backport for Fix section heading spacing on mobile (T414882) (duration: 07m 33s)
22:01 jdlrobson@deploy1003: jdlrobson: Continuing with sync
22:00 jdlrobson@deploy1003: jdlrobson: Backport for Fix section heading spacing on mobile (T414882) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:58 jdlrobson@deploy1003: Started scap sync-world: Backport for Fix section heading spacing on mobile (T414882)
21:32 kemayo@deploy1003: Finished scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise (duration: 06m 18s)
21:28 kemayo@deploy1003: kemayo: Continuing with sync
21:28 kemayo@deploy1003: kemayo: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:26 kemayo@deploy1003: Started scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise
21:18 kemayo@deploy1003: kemayo: Continuing with sync
21:17 kemayo@deploy1003: kemayo: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:15 kemayo@deploy1003: Started scap sync-world: Backport for SuggestedLinkEditCheck: fetchSuggestions return a jQuery.Promise
21:03 kemayo@deploy1003: Finished scap sync-world: Backport for Add logged-in reader retention instrument (T420490) (duration: 11m 40s)
20:59 kemayo@deploy1003: annet, kemayo: Continuing with sync
20:53 kemayo@deploy1003: annet, kemayo: Backport for Add logged-in reader retention instrument (T420490) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:52 kemayo@deploy1003: Started scap sync-world: Backport for Add logged-in reader retention instrument (T420490)
20:37 kemayo@deploy1003: Finished scap sync-world: Backport for zhwikinews: 20th anniversary logo change (T420165) (duration: 11m 46s)
20:31 kemayo@deploy1003: 1f616emo, kemayo: Continuing with sync
20:29 kemayo@deploy1003: 1f616emo, kemayo: Backport for zhwikinews: 20th anniversary logo change (T420165) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:25 kemayo@deploy1003: Started scap sync-world: Backport for zhwikinews: 20th anniversary logo change (T420165)
20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:09 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:03 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
20:02 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:57 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:24 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:09 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: T418109
19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
19:01 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
18:56 cmooney@dns2005: END - running authdns-update
18:55 cmooney@dns2005: START - running authdns-update
18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:52 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
18:51 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add dns names for new lumen 100g transport - cmooney@cumin1003"
18:45 cmooney@cumin1003: START - Cookbook sre.dns.netbox
18:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs5006.eqsin.wmnet} and A:liberica
18:24 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs5006.eqsin.wmnet} and A:liberica
18:00 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
18:00 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
18:00 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
17:59 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
17:18 swfrench@dns1004: END - running authdns-update
17:16 swfrench@dns1004: START - running authdns-update
17:13 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
17:12 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
17:12 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
17:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
17:11 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
17:11 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
17:10 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
17:10 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
17:08 swfrench@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
17:07 swfrench@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
17:05 swfrench@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
17:05 swfrench@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
17:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3008.esams.wmnet} and A:liberica
17:02 swfrench@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
16:59 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3008.esams.wmnet} and A:liberica
16:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3009.esams.wmnet} and A:liberica
16:53 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3009.esams.wmnet} and A:liberica
16:13 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:09 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2239.codfw.wmnet with reason: Maintenance
16:09 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90247 and previous config saved to /var/cache/conftool/dbconfig/20260402-160942-fceratto.json
16:03 Lucas_WMDE: UTC afternoon backport+config window (very belatedly) done ^^
16:02 swfrench@deploy1003: Finished scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143 (duration: 29m 56s)
15:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90246 and previous config saved to /var/cache/conftool/dbconfig/20260402-155934-fceratto.json
15:51 swfrench@deploy1003: swfrench: Continuing with sync
15:50 swfrench@deploy1003: swfrench: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P90245 and previous config saved to /var/cache/conftool/dbconfig/20260402-154925-fceratto.json
15:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90244 and previous config saved to /var/cache/conftool/dbconfig/20260402-153918-fceratto.json
15:33 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 (attempt 2) - T422143
15:32 moritzm: installing freetype security updates
15:31 swfrench-wmf: restarted docker-registry-ml.service on registry200[45] - T422166
15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:26 swfrench@deploy1003: sync-world aborted: Manual full-rebuild sync-world to pick up 1267062, 1266985 - T422143 (duration: 26m 48s)
15:26 swfrench-wmf: restarted docker-registry-restricted.service on registry200[45] - T422166
15:23 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
15:23 papaul: maintenance complete on mr1-eqiad
15:22 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:12 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:11 moritzm: installing apache2 security updates
15:07 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
15:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2227 (T419635)', diff saved to https://phabricator.wikimedia.org/P90242 and previous config saved to /var/cache/conftool/dbconfig/20260402-150542-fceratto.json
15:05 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2227.codfw.wmnet with reason: Maintenance
15:05 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90241 and previous config saved to /var/cache/conftool/dbconfig/20260402-150517-fceratto.json
15:03 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
15:03 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
14:59 swfrench@deploy1003: Started scap sync-world: Manual full-rebuild sync-world to pick up 1267062, 1266985 - T422143
14:59 papaul: ongoing maintenance on mr1-eqiad
14:57 swfrench@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
14:56 pt1979@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mr1-eqiad,mr1-eqiad IPv6 with reason: switching from OSPF to BGP
14:56 swfrench@deploy1003: Started scap sync-world: Manual sync-world to pick up 1267062, 1266985 - T422143
14:55 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90239 and previous config saved to /var/cache/conftool/dbconfig/20260402-145508-fceratto.json
14:53 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
14:53 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
14:52 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
14:52 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
14:51 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
14:51 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
14:45 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205', diff saved to https://phabricator.wikimedia.org/P90237 and previous config saved to /var/cache/conftool/dbconfig/20260402-144500-fceratto.json
14:42 moritzm: installing libxml-parser-perl security updates
14:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90236 and previous config saved to /var/cache/conftool/dbconfig/20260402-143452-fceratto.json
14:28 moritzm: installing pyasn1 security updates
14:18 ladsgroup@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted
14:17 ladsgroup@deploy1003: Started scap sync-world: Backport for Bump maxConnCount
14:10 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
14:09 esanders@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
14:03 hashar: Jenkins CI: reloading configuration from disk to poll new nodes # T421114
14:00 esanders@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmnet/restricted/
14:00 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2205 (T419635)', diff saved to https://phabricator.wikimedia.org/P90233 and previous config saved to /var/cache/conftool/dbconfig/20260402-140004-fceratto.json
13:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2205.codfw.wmnet with reason: Maintenance
13:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90232 and previous config saved to /var/cache/conftool/dbconfig/20260402-135939-fceratto.json
13:58 esanders@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
13:55 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
13:54 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
13:53 lucaswerkmeister-wmde@deploy1003: sync-world failed: <CalledProcessError> Command 'sudo -u mwbuilder /srv/mwbuilder/release/make-container-image/build-images.py --http-proxy http://webproxy:8080 --https-proxy http://webproxy:8080 /srv/mediawiki-staging/scap/image-build --staging-dir /srv/mediawiki-staging --mediawiki-versions 1.46.0-wmf.21,1.46.0-wmf.22,next --multiversion-image-basename docker-registry.discovery.wmne
13:51 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Fix suggestion mode availability check (T422143)
13:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90231 and previous config saved to /var/cache/conftool/dbconfig/20260402-134931-fceratto.json
13:42 eevans@deploy1003: helmfile [staging] DONE helmfile.d/services/linked-artifacts: apply
13:41 jasmine@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: maintenance - T414486
13:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P90230 and previous config saved to /var/cache/conftool/dbconfig/20260402-133923-fceratto.json
13:37 eevans@deploy1003: helmfile [staging] START helmfile.d/services/linked-artifacts: apply
13:29 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90229 and previous config saved to /var/cache/conftool/dbconfig/20260402-132914-fceratto.json
13:18 jasmine@cumin1003: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: maintenance - T414486
13:17 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool codfw [reason: no reason specified, T414486]
13:17 jasmine@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool codfw [reason: no reason specified, T414486]
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2194 (T419635)', diff saved to https://phabricator.wikimedia.org/P90228 and previous config saved to /var/cache/conftool/dbconfig/20260402-125732-fceratto.json
12:57 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2194.codfw.wmnet with reason: Maintenance
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90227 and previous config saved to /var/cache/conftool/dbconfig/20260402-125707-fceratto.json
12:56 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:50 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:49 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS trixie
12:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool es1042: Restoring section
12:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90225 and previous config saved to /var/cache/conftool/dbconfig/20260402-124659-fceratto.json
12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
12:45 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS trixie
12:45 klausman@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
12:44 klausman@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
12:41 klausman@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
12:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P90224 and previous config saved to /var/cache/conftool/dbconfig/20260402-123650-fceratto.json
12:33 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042: Restoring section
12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
12:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
12:32 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
12:32 fceratto@cumin1003: END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) pool es1042.eqiad.wmnet: Restoring section
12:31 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:30 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool es1042.eqiad.wmnet: Restoring section
12:29 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
12:28 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for es1042.eqiad.wmnet
12:28 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for es1042.eqiad.wmnet
12:26 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90222 and previous config saved to /var/cache/conftool/dbconfig/20260402-122642-fceratto.json
12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
12:22 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
12:13 volans@dns1004: END - running authdns-update
12:11 volans@dns1004: START - running authdns-update
12:11 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1374
12:10 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1374
12:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1373
12:09 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1373
12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS trixie
12:09 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS trixie
11:55 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2190 (T419635)', diff saved to https://phabricator.wikimedia.org/P90221 and previous config saved to /var/cache/conftool/dbconfig/20260402-115511-fceratto.json
11:55 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2190.codfw.wmnet with reason: Maintenance
11:54 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90220 and previous config saved to /var/cache/conftool/dbconfig/20260402-115446-fceratto.json
11:52 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:44 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90219 and previous config saved to /var/cache/conftool/dbconfig/20260402-114437-fceratto.json
11:34 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P90218 and previous config saved to /var/cache/conftool/dbconfig/20260402-113429-fceratto.json
11:24 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90217 and previous config saved to /var/cache/conftool/dbconfig/20260402-112421-fceratto.json
10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2177 (T419635)', diff saved to https://phabricator.wikimedia.org/P90216 and previous config saved to /var/cache/conftool/dbconfig/20260402-105142-fceratto.json
10:51 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90215 and previous config saved to /var/cache/conftool/dbconfig/20260402-105129-fceratto.json
10:45 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:45 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:43 jmm@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
10:41 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90214 and previous config saved to /var/cache/conftool/dbconfig/20260402-104121-fceratto.json
10:40 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:37 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:32 jmm@cumin1003: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
10:31 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:31 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P90213 and previous config saved to /var/cache/conftool/dbconfig/20260402-103113-fceratto.json
10:31 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90212 and previous config saved to /var/cache/conftool/dbconfig/20260402-102105-fceratto.json
10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:19 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:19 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
10:19 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker-exp2001.codfw.wmnet
10:19 moritzm: installing freetype security updates
10:19 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
10:19 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
10:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
10:18 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:18 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
10:18 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
10:18 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
10:18 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
10:18 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
10:17 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:17 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
10:17 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
10:16 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
10:16 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
10:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker-exp2001.codfw.wmnet
10:14 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
10:12 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
10:10 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
10:05 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
10:05 jmm@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
09:48 javiermonton@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync
09:48 javiermonton@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: sync
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90211 and previous config saved to /var/cache/conftool/dbconfig/20260402-094834-fceratto.json
09:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
09:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90210 and previous config saved to /var/cache/conftool/dbconfig/20260402-094808-fceratto.json
09:48 javiermonton@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
09:47 javiermonton@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
09:45 javiermonton@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync
09:45 javiermonton@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: sync
09:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90209 and previous config saved to /var/cache/conftool/dbconfig/20260402-093759-fceratto.json
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.restart-reboot-config-master (exit_code=0) rolling reboot on A:config-master-codfw
09:29 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) config-master.discovery.wmnet. on all recursors
09:29 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache config-master.discovery.wmnet. on all recursors
09:29 jmm@cumin2002: START - Cookbook sre.misc-clusters.restart-reboot-config-master rolling reboot on A:config-master-codfw
09:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P90208 and previous config saved to /var/cache/conftool/dbconfig/20260402-092751-fceratto.json
09:27 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:27 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
09:19 moritzm: upgrading Envoy on the config-master servers to 1.35.9 T419637 T410975
09:17 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90207 and previous config saved to /var/cache/conftool/dbconfig/20260402-091743-fceratto.json
08:55 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
08:55 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
08:49 moritzm: added Atsuko to the cn=ops LDAP group T421860
08:46 dpogorzelski@deploy1003: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
08:45 dpogorzelski@deploy1003: helmfile [eqiad] START helmfile.d/services/changeprop: sync
08:45 dpogorzelski@deploy1003: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
08:44 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db2149 (T419635)', diff saved to https://phabricator.wikimedia.org/P90206 and previous config saved to /var/cache/conftool/dbconfig/20260402-084452-fceratto.json
08:44 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:44 dpogorzelski@deploy1003: helmfile [codfw] START helmfile.d/services/changeprop: sync
08:42 XioNoX: reboot mr1-esams - T416450
08:30 volans: briefly disabling puppet on P:installserver::proxy to deploy g/1266885
08:17 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:10 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.22 refs T420480
08:00 mszwarc@deploy1003: Finished scap sync-world: Backport for Disable external link analysis (T419837) (duration: 10m 13s)
07:56 mszwarc@deploy1003: mszwarc: Continuing with sync
07:55 jmm@dns1004: END - running authdns-update
07:54 jmm@dns1004: START - running authdns-update
07:52 mszwarc@deploy1003: mszwarc: Backport for Disable external link analysis (T419837) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:50 mszwarc@deploy1003: Started scap sync-world: Backport for Disable external link analysis (T419837)
07:47 jnuche@deploy1003: Finished scap sync-world: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027) (duration: 06m 39s)
07:47 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (, T421714) xfer wdqs-all from wdqs2016.codfw.wmnet -> wdqs1027.eqiad.wmnet, repooling both afterwards
07:43 jnuche@deploy1003: jnuche: Continuing with sync
07:43 jnuche@deploy1003: jnuche: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:41 jnuche@deploy1003: Started scap sync-world: Backport for ApiAuthManagerHelper: Accept fields with undefined label (T422027)
07:38 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
07:38 ryankemper@deploy1003: Finished deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host (duration: 00m 05s)
07:38 ryankemper@deploy1003: Started deploy [wdqs/wdqs@fea7794]: deploy to freshly reimaged wdqs host
07:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 64049
07:30 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 64049
07:25 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
07:24 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
07:12 gkyziridis@deploy1003: Finished scap sync-world: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892) (duration: 07m 00s)
07:08 gkyziridis@deploy1003: gkyziridis: Continuing with sync
07:07 gkyziridis@deploy1003: gkyziridis: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
07:05 gkyziridis@deploy1003: Started scap sync-world: Backport for EventStreamConfig: Add rr-multilingual prediction_change stream (T415892)
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:41 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:30 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:30 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:19 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:18 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:08 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
01:06 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
00:56 jasmine@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART

2026-04-01

23:03 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1027.eqiad.wmnet with OS bullseye
22:58 swfrench-wmf: removed unused image-suggestion service in eqiad - T368096
22:48 swfrench-wmf: removed unused image-suggestion service in codfw - T368096
22:48 jdlrobson@deploy1003: Finished scap sync-world: Backport for Legal Footer Link Deploys (T420348) (duration: 08m 25s)
22:43 jdlrobson@deploy1003: lmora, jdlrobson: Continuing with sync
22:41 jdlrobson@deploy1003: lmora, jdlrobson: Backport for Legal Footer Link Deploys (T420348) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:39 jdlrobson@deploy1003: Started scap sync-world: Backport for Legal Footer Link Deploys (T420348)
22:38 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
22:34 ladsgroup@deploy1003: Finished scap sync-world: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709) (duration: 06m 37s)
22:33 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1027.eqiad.wmnet with reason: host reimage
22:29 ladsgroup@deploy1003: ladsgroup: Continuing with sync
22:29 ladsgroup@deploy1003: ladsgroup: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
22:27 ladsgroup@deploy1003: Started scap sync-world: Backport for Deferred: Fix function to get virtual domain (T421914 T398709), Deferred: Fix function to get virtual domain (T421914 T398709)
22:07 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-internal-scholarly,name=codfw
22:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs1027
22:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs1027
22:01 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs1027
22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
22:01 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs1027.eqiad.wmnet 98.32.64.10.in-addr.arpa 8.9.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
22:01 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:01 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
22:01 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs1027 - bking@cumin2002"
21:42 swfrench@deploy1003: Finished scap sync-world: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074) (duration: 07m 15s)
21:38 swfrench@deploy1003: swfrench: Continuing with sync
21:36 swfrench@deploy1003: swfrench: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
21:35 swfrench@deploy1003: Started scap sync-world: Backport for Only set the thumb step when width is given (T422074), Only set the thumb step when width is given (T422074)
21:34 bking@cumin2002: START - Cookbook sre.dns.netbox
21:34 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs1027
21:33 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs1027.eqiad.wmnet with OS bullseye
21:14 brett: Reboot lvs1013, lvs1014, lvs1015, and lvs1017 for kernel updates
21:07 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1010.eqiad.wmnet with OS bullseye
20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:53 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:52 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:50 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
20:47 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs3010.esams.wmnet} and A:liberica
20:43 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P{lvs3010.esams.wmnet} and A:liberica
20:23 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
20:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1010.eqiad.wmnet with reason: host reimage
20:13 cjming@deploy1003: Finished scap sync-world: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366) (duration: 08m 47s)
20:09 cjming@deploy1003: mmartorana, cjming: Continuing with sync
20:06 cjming@deploy1003: mmartorana, cjming: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
20:04 cjming@deploy1003: Started scap sync-world: Backport for config: Enable EmailConfirmationBanner on mediawikiwiki (T421366)
20:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1010
20:00 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1010
19:58 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1010
19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
19:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1010.eqiad.wmnet 24.48.64.10.in-addr.arpa 4.2.0.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
19:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:58 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1010 - bking@cumin2002"
19:53 bking@cumin2002: START - Cookbook sre.dns.netbox
19:53 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1010
19:52 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1010.eqiad.wmnet with OS bullseye
18:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1009.eqiad.wmnet with OS bullseye
18:11 ebysans@deploy1003: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
18:10 ebysans@deploy1003: helmfile [codfw] START helmfile.d/services/media-analytics: apply
18:10 ebysans@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
18:10 ebysans@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
18:10 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
18:04 aokoth@cumin1003: END (PASS) - Cookbook sre.vrts.upgrade (exit_code=0) on VRTS host vrts1003.eqiad.wmnet
18:03 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1009.eqiad.wmnet with reason: host reimage
18:02 aokoth@cumin1003: START - Cookbook sre.vrts.upgrade on VRTS host vrts1003.eqiad.wmnet
18:01 SandraEbele_: Deployed refinery using scap, then deployed onto hdfs
17:56 ladsgroup@deploy1003: Finished scap sync-world: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805) (duration: 08m 18s)
17:52 ladsgroup@deploy1003: ladsgroup: Continuing with sync
17:50 ladsgroup@deploy1003: ladsgroup: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
17:48 ladsgroup@deploy1003: Started scap sync-world: Backport for Refix thumb steps for the poster image of videos (T414805), Refix thumb steps for the poster image of videos (T414805)
17:43 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cloudelastic1009
17:43 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1009
17:42 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1009
17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:42 bking@cumin2002: START - Cookbook sre.dns.wipe-cache cloudelastic1009.eqiad.wmnet 30.32.64.10.in-addr.arpa 0.3.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
17:42 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:42 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
17:42 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host cloudelastic1009 - bking@cumin2002"
17:39 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83] (duration: 01m 53s)
17:38 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (thin): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83]
17:37 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83] (duration: 04m 15s)
17:33 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8]: Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 [analytics/refinery@fa28ad83]
17:33 bking@cumin2002: START - Cookbook sre.dns.netbox
17:32 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host cloudelastic1009
17:32 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1009.eqiad.wmnet with OS bullseye
17:32 ebysans@deploy1003: Finished deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 TEST [analytics/refinery@fa28ad83] (duration: 01m 52s)
17:30 ebysans@deploy1003: Started deploy [analytics/refinery@fa28ad8] (hadoop-test): Extend mediarequest Cassandra loads with poster/plays for video-requests API T415202 TEST [analytics/refinery@fa28ad83]
17:21 swfrench@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
17:21 swfrench@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
17:21 swfrench@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
17:21 swfrench@deploy1003: helmfile [codfw] START helmfile.d/services/mw-cron: apply
17:18 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - T368096 (duration: 07m 25s)
17:12 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to remove unused image-suggestion listener - T368096
17:09 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-restart-haproxy (exit_code=0) rolling restart of HAProxy on A:cp-drmrs - New configuration/test (T421402)
16:53 dreamyjazz@deploy1003: Finished scap sync-world: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678) (duration: 11m 30s)
16:49 dreamyjazz@deploy1003: dreamyjazz: Continuing with sync
16:44 dreamyjazz@deploy1003: dreamyjazz: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:42 dreamyjazz@deploy1003: Started scap sync-world: Backport for hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678), hCaptcha: Add log and counter when all SiteVerify attempts fail (T421678)
16:36 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
16:33 fabfur@cumin1003: START - Cookbook sre.cdn.roll-restart-haproxy rolling restart of HAProxy on A:cp-drmrs - New configuration/test (T421402)
16:29 urbanecm@deploy1003: Finished scap sync-world: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147) (duration: 09m 31s)
16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
16:24 urbanecm@deploy1003: urbanecm: Continuing with sync
16:21 urbanecm@deploy1003: urbanecm: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
16:19 urbanecm@deploy1003: Started scap sync-world: Backport for Set the default for UserEmailConfirmationUseHTML to true (T411147), cleanup: Remove UserEmailConfirmationUseHTML (defaults to true) (T411147)
16:15 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
16:14 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host conf2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
16:14 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host conf2007
16:14 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host conf2007
16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:13 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
16:13 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding conf2007 to codfw - jhancock@cumin2002"
16:09 jhancock@cumin2002: START - Cookbook sre.dns.netbox
15:32 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade (T421402)
15:26 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade (T421402)
15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
15:24 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:23 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:23 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer commons from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:22 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T421714, prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T421714, prepare newly-reimaged host) xfer wikidata from wcqs1001.eqiad.wmnet -> wcqs1003.eqiad.wmnet, repooling both afterwards
15:13 jforrester@deploy1003: Finished scap sync-world: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807) (duration: 12m 53s)
15:12 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
15:11 lucaswerkmeister-wmde@deploy1003: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
15:11 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
15:10 lucaswerkmeister-wmde@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
15:09 jforrester@deploy1003: jforrester: Continuing with sync
15:03 jforrester@deploy1003: jforrester: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
15:01 jforrester@deploy1003: Started scap sync-world: Backport for Wikifunctions: Switch cache from mcrouter-wikifunctions to special access (T411807)
15:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db1208.eqiad.wmnet
14:59 taavi@dns1004: END - running authdns-update
14:59 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wcqs1003.eqiad.wmnet with OS bullseye
14:57 taavi@dns1004: START - running authdns-update
14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2005.codfw.wmnet
14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
14:53 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:52 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90188 and previous config saved to /var/cache/conftool/dbconfig/20260401-145256-fceratto.json
14:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34968
14:52 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 34968
14:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2005.codfw.wmnet
14:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker2004.codfw.wmnet
14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host db1208.eqiad.wmnet
14:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker2004.codfw.wmnet
14:45 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo - 3.2 upgrade (T421402)
14:44 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo - 3.2 upgrade (T421402)
14:44 fabfur: upgrading ulsfo to haproxy 3.2 (T421402)
14:42 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90187 and previous config saved to /var/cache/conftool/dbconfig/20260401-144247-fceratto.json
14:32 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P90186 and previous config saved to /var/cache/conftool/dbconfig/20260401-143239-fceratto.json
14:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
14:28 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:28 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:27 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:26 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:26 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:25 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:22 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90184 and previous config saved to /var/cache/conftool/dbconfig/20260401-142231-fceratto.json
14:22 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: host reimage
14:21 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:21 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:20 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:19 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:19 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:18 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:16 jforrester@deploy1003: Finished scap sync-world: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581) (duration: 08m 14s)
14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-worker1148.eqiad.wmnet
14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:13 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
14:13 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-worker1148.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
14:12 jforrester@deploy1003: jforrester: Continuing with sync
14:11 volans: uploaded cumin_6.0.0 to apt.wikimedia.org bookworm-wikimedia,trixie-wikimedia
14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade (T421402)
14:11 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade (T421402)
14:10 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:10 jforrester@deploy1003: jforrester: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
14:09 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:09 brouberol@cumin1003: START - Cookbook sre.dns.netbox
14:08 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:08 jforrester@deploy1003: Started scap sync-world: Backport for MemcachedWrapper: Hash key when longer than 250 characters, Extend queue processing times for abstract fragments (T421581)
14:07 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:07 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:07 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:07 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1259 (T419635)', diff saved to https://phabricator.wikimedia.org/P90182 and previous config saved to /var/cache/conftool/dbconfig/20260401-140707-fceratto.json
14:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1259.eqiad.wmnet with reason: Maintenance
14:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90181 and previous config saved to /var/cache/conftool/dbconfig/20260401-140654-fceratto.json
14:05 jforrester@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
14:05 jforrester@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
14:05 jforrester@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
14:05 bking@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wcqs1003
14:05 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wcqs1003
14:04 jforrester@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
14:04 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wcqs1003
14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:04 bking@cumin2002: START - Cookbook sre.dns.wipe-cache wcqs1003.eqiad.wmnet 9.32.64.10.in-addr.arpa 9.0.0.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
14:04 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:04 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
14:04 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wcqs1003 - bking@cumin2002"
14:03 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-worker1148.eqiad.wmnet
14:03 jforrester@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
14:02 jforrester@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
14:00 bking@cumin2002: START - Cookbook sre.dns.netbox
14:00 bking@cumin2002: START - Cookbook sre.hosts.move-vlan for host wcqs1003
13:59 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wcqs1003.eqiad.wmnet with OS bullseye
13:56 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90179 and previous config saved to /var/cache/conftool/dbconfig/20260401-135646-fceratto.json
13:51 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-lab1002.eqiad.wmnet
13:51 klausman@cumin1003: START - Cookbook sre.hosts.remove-downtime for ml-lab1002.eqiad.wmnet
13:50 klausman@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-lab1002.eqiad.wmnet
13:46 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
13:46 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P90178 and previous config saved to /var/cache/conftool/dbconfig/20260401-134638-fceratto.json
13:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
13:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
13:41 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
13:41 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
13:36 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90177 and previous config saved to /var/cache/conftool/dbconfig/20260401-133629-fceratto.json
13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:33 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:30 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:29 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
13:22 moritzm: purge prometheus-nginx-exporter from url downloaders, remnants of early hCaptcha rollout
13:21 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1254 (T419635)', diff saved to https://phabricator.wikimedia.org/P90176 and previous config saved to /var/cache/conftool/dbconfig/20260401-132149-fceratto.json
13:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1254.eqiad.wmnet with reason: Maintenance
13:21 fabfur: upgrading magru to haproxy 3.2 (T421402)
13:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru - 3.2 upgrade (T421402)
13:19 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru - 3.2 upgrade (T421402)
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader2004.wikimedia.org
13:07 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
13:07 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90174 and previous config saved to /var/cache/conftool/dbconfig/20260401-130753-fceratto.json
13:06 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.22 refs T420480
13:05 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader2004.wikimedia.org
13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host urldownloader1004.wikimedia.org
12:57 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90173 and previous config saved to /var/cache/conftool/dbconfig/20260401-125744-fceratto.json
12:56 kharlan@deploy1003: Finished scap sync-world: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678) (duration: 09m 21s)
12:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host urldownloader1004.wikimedia.org
12:52 kharlan@deploy1003: kharlan: Continuing with sync
12:49 kharlan@deploy1003: kharlan: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:47 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P90171 and previous config saved to /var/cache/conftool/dbconfig/20260401-124736-fceratto.json
12:47 kharlan@deploy1003: Started scap sync-world: Backport for hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678), hCaptcha: Retry SiteVerify API on HTTP error and adjust timeout (T421678)
12:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90170 and previous config saved to /var/cache/conftool/dbconfig/20260401-123728-fceratto.json
12:33 kharlan@deploy1003: Finished scap sync-world: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062) (duration: 07m 34s)
12:29 kharlan@deploy1003: kharlan: Continuing with sync
12:28 kharlan@deploy1003: kharlan: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
12:26 kharlan@deploy1003: Started scap sync-world: Backport for Revert "SuggestedInvestigations: Import session into signal matching job" (T421062), Revert "SuggestedInvestigations: Import session into signal matching job" (T421062)
12:22 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1233 (T419635)', diff saved to https://phabricator.wikimedia.org/P90169 and previous config saved to /var/cache/conftool/dbconfig/20260401-122203-fceratto.json
12:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
12:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90168 and previous config saved to /var/cache/conftool/dbconfig/20260401-122138-fceratto.json
12:17 kart_: Updated cxserver to 2026-03-25-072715-production
12:17 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-lab1002.eqiad.wmnet
12:13 kartik@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
12:12 kartik@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
12:12 kartik@deploy1003: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
12:11 kartik@deploy1003: helmfile [codfw] START helmfile.d/services/cxserver: apply
12:11 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-lab1002.eqiad.wmnet
12:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90167 and previous config saved to /var/cache/conftool/dbconfig/20260401-121130-fceratto.json
12:02 kartik@deploy1003: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:02 kartik@deploy1003: helmfile [staging] START helmfile.d/services/cxserver: apply
12:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P90166 and previous config saved to /var/cache/conftool/dbconfig/20260401-120122-fceratto.json
11:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90164 and previous config saved to /var/cache/conftool/dbconfig/20260401-115114-fceratto.json
11:48 moritzm: upgrading Envoy on the idp-test servers to 1.35.9 T419637 T410975
11:35 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1229 (T419635)', diff saved to https://phabricator.wikimedia.org/P90162 and previous config saved to /var/cache/conftool/dbconfig/20260401-113556-fceratto.json
11:35 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
11:33 moritzm: installing tomcat10 security updates
11:27 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
11:27 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
11:21 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
11:21 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90161 and previous config saved to /var/cache/conftool/dbconfig/20260401-112125-fceratto.json
11:18 mvolz@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
11:17 mvolz@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
11:17 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
11:16 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
11:12 mvolz@deploy1003: helmfile [codfw] DONE helmfile.d/services/citoid: apply
11:11 mvolz@deploy1003: helmfile [codfw] START helmfile.d/services/citoid: apply
11:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90160 and previous config saved to /var/cache/conftool/dbconfig/20260401-111117-fceratto.json
11:10 mvolz@deploy1003: helmfile [staging] DONE helmfile.d/services/citoid: apply
11:09 mvolz@deploy1003: helmfile [staging] START helmfile.d/services/citoid: apply
11:06 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
11:05 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
11:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P90159 and previous config saved to /var/cache/conftool/dbconfig/20260401-110109-fceratto.json
10:58 daniel@deploy1003: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
10:58 daniel@deploy1003: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
10:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1370.eqiad.wmnet with OS trixie
10:51 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90158 and previous config saved to /var/cache/conftool/dbconfig/20260401-105059-fceratto.json
10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1197 (T419635)', diff saved to https://phabricator.wikimedia.org/P90157 and previous config saved to /var/cache/conftool/dbconfig/20260401-104847-fceratto.json
10:48 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
10:48 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90156 and previous config saved to /var/cache/conftool/dbconfig/20260401-104823-fceratto.json
10:48 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:47 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:47 moritzm: installing libpng1.6 security updates on Trixie/Bookworm
10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1004.eqiad.wmnet with OS bookworm
10:47 daniel@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
10:46 daniel@deploy1003: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
10:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host irc1003.wikimedia.org
10:41 daniel@deploy1003: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
10:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host irc1003.wikimedia.org
10:40 daniel@deploy1003: helmfile [staging] START helmfile.d/services/rest-gateway: apply
10:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
10:38 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90154 and previous config saved to /var/cache/conftool/dbconfig/20260401-103816-fceratto.json
10:36 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1370.eqiad.wmnet with reason: host reimage
10:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P90152 and previous config saved to /var/cache/conftool/dbconfig/20260401-102807-fceratto.json
10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1370
10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1370
10:24 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1370
10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:24 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1370.eqiad.wmnet 204.48.64.10.in-addr.arpa 4.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
10:24 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1370 - ayounsi@cumin1003"
10:19 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1370
10:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
10:19 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1370.eqiad.wmnet with OS trixie
10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:18 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:18 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90151 and previous config saved to /var/cache/conftool/dbconfig/20260401-101758-fceratto.json
10:13 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1004.eqiad.wmnet with reason: host reimage
10:13 jmm@dns1004: END - running authdns-update
10:12 jmm@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
10:11 jmm@dns1004: START - running authdns-update
10:09 jmm@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:08 jmm@deploy1003: helmfile [codfw] DONE helmfile.d/services/proton: apply
10:08 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:06 jmm@deploy1003: helmfile [codfw] START helmfile.d/services/proton: apply
10:06 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1369.eqiad.wmnet with OS trixie
10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:06 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
10:04 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1188 (T419635)', diff saved to https://phabricator.wikimedia.org/P90150 and previous config saved to /var/cache/conftool/dbconfig/20260401-095943-fceratto.json
09:59 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
09:59 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90149 and previous config saved to /var/cache/conftool/dbconfig/20260401-095920-fceratto.json
09:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1004
09:58 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1004
09:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1004.eqiad.wmnet with OS bookworm
09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:55 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:54 jmm@deploy1003: helmfile [staging] DONE helmfile.d/services/proton: apply
09:53 jmm@deploy1003: helmfile [staging] START helmfile.d/services/proton: apply
09:50 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
09:49 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90148 and previous config saved to /var/cache/conftool/dbconfig/20260401-094912-fceratto.json
09:44 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1369.eqiad.wmnet with reason: host reimage
09:39 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P90147 and previous config saved to /var/cache/conftool/dbconfig/20260401-093903-fceratto.json
09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1369
09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1369
09:32 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1369
09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:32 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1369.eqiad.wmnet 203.48.64.10.in-addr.arpa 3.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
09:32 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1369 - ayounsi@cumin1003"
09:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-druid1003.eqiad.wmnet with OS bookworm
09:28 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90146 and previous config saved to /var/cache/conftool/dbconfig/20260401-092855-fceratto.json
09:27 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1369
09:27 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1369.eqiad.wmnet with OS trixie
09:26 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1368.eqiad.wmnet with OS trixie
09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:21 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1182 (T419635)', diff saved to https://phabricator.wikimedia.org/P90143 and previous config saved to /var/cache/conftool/dbconfig/20260401-091134-fceratto.json
09:11 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
09:11 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90142 and previous config saved to /var/cache/conftool/dbconfig/20260401-091109-fceratto.json
09:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:05 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
09:05 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1368.eqiad.wmnet with reason: host reimage
09:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
09:01 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90141 and previous config saved to /var/cache/conftool/dbconfig/20260401-090101-fceratto.json
09:00 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-druid1003.eqiad.wmnet with reason: host reimage
08:57 Amir1: mwscript-k8s --dblist=all -- purgeUserOptions.php --login-age 5 skin (T406724)
08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:54 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1368
08:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1368
08:52 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1368
08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
08:52 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1368.eqiad.wmnet 202.48.64.10.in-addr.arpa 2.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
08:52 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1368 - ayounsi@cumin1003"
08:50 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P90139 and previous config saved to /var/cache/conftool/dbconfig/20260401-085053-fceratto.json
08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:45 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:44 moritzm: installing Apache security updates
08:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host an-druid1003
08:44 btullis@cumin1003: START - Cookbook sre.hosts.move-vlan for host an-druid1003
08:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-druid1003.eqiad.wmnet with OS bookworm
08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:42 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
08:40 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90138 and previous config saved to /var/cache/conftool/dbconfig/20260401-084047-fceratto.json
08:38 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1162 (T419635)', diff saved to https://phabricator.wikimedia.org/P90137 and previous config saved to /var/cache/conftool/dbconfig/20260401-083733-fceratto.json
08:37 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
08:37 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90136 and previous config saved to /var/cache/conftool/dbconfig/20260401-083709-fceratto.json
08:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1014.eqiad.wmnet
08:27 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90135 and previous config saved to /var/cache/conftool/dbconfig/20260401-082701-fceratto.json
08:25 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:24 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1368
08:23 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1368.eqiad.wmnet with OS trixie
08:21 jnuche@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.22 refs T420480
08:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1014.eqiad.wmnet
08:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1013.eqiad.wmnet
08:16 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P90134 and previous config saved to /var/cache/conftool/dbconfig/20260401-081652-fceratto.json
08:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1367.eqiad.wmnet with OS trixie
08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1013.eqiad.wmnet
08:07 moritzm: upgrading Envoy on the Puppet servers to 1.35.9 T419637 T410975
08:06 fceratto@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90132 and previous config saved to /var/cache/conftool/dbconfig/20260401-080644-fceratto.json
08:03 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
07:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
07:52 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1367.eqiad.wmnet with reason: host reimage
07:47 fceratto@cumin1003: dbctl commit (dc=all): 'Depooling db1156 (T419635)', diff saved to https://phabricator.wikimedia.org/P90130 and previous config saved to /var/cache/conftool/dbconfig/20260401-074704-fceratto.json
07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:46 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1367
07:40 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1367
07:39 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1367
07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:39 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1367.eqiad.wmnet 201.48.64.10.in-addr.arpa 1.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
07:39 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1367 - ayounsi@cumin1003"
07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:36 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:35 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
07:35 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1367
07:34 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1367.eqiad.wmnet with OS trixie
07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1012.eqiad.wmnet
07:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1012.eqiad.wmnet
07:27 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:26 moritzm: installing postgresql security updates
07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:26 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:24 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1011.eqiad.wmnet
07:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1011.eqiad.wmnet
07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:14 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:10 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1366.eqiad.wmnet with OS trixie
07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
07:09 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
06:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
06:50 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1366.eqiad.wmnet with reason: host reimage
06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1366
06:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1366
06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
06:37 javiermonton@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
06:35 ayounsi@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1366
06:35 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
06:34 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1366.eqiad.wmnet 200.48.64.10.in-addr.arpa 0.0.2.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
06:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
06:34 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1366 - ayounsi@cumin1003"
06:30 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
06:30 ayounsi@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1366
06:29 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1366.eqiad.wmnet with OS trixie
05:56 marostegui: Drop empty tables cusi_case, cusi_user, and cusi_signal on wikis not listed at checkuser-suggested-investigations.dblist T421353
05:33 marostegui: Drop empty ores_classification and ores_model on closed wikis T420093
05:26 marostegui: Drop global_block_whitelist on closed wikis T420525
02:24 ebysans@deploy1003: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
02:14 ebysans@deploy1003: helmfile [staging] START helmfile.d/services/media-analytics: apply
02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 23s)
02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
01:51 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
01:12 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
01:02 ladsgroup@deploy1003: Finished scap sync-world: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) (duration: 08m 35s)
00:57 ladsgroup@deploy1003: ladsgroup: Continuing with sync
00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
00:56 otto@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
00:55 ladsgroup@deploy1003: ladsgroup: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:53 ladsgroup@deploy1003: Started scap sync-world: Backport for Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)
00:40 ladsgroup@deploy1003: Finished scap sync-world: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) (duration: 12m 40s)
00:35 ladsgroup@deploy1003: ladsgroup: Continuing with sync
00:29 ladsgroup@deploy1003: ladsgroup: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589) synced to the testservers (see https://wikitech.wiki
00:27 ladsgroup@deploy1003: Started scap sync-world: Backport for util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), util.js: Allow passing isVectorized to adjustThumbWidthForSteps (T414805 T411013 T421589), Pass whether the image is svg to adjustThumbWidthForSteps (T414805 T411013 T421589)
00:07 ladsgroup@deploy1003: Finished scap sync-world: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914) (duration: 06m 50s)
00:03 ladsgroup@deploy1003: ladsgroup: Continuing with sync
00:03 ladsgroup@deploy1003: ladsgroup: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
00:00 ladsgroup@deploy1003: Started scap sync-world: Backport for LinksUpdate: Consolidate links virtual domains (T421914), LinksUpdate: Consolidate links virtual domains (T421914)

Other archives

See Server Admin Log/Archives.
