Upgrade Linux kernel on all hosts
Closed, Resolved, Public

Event Timeline

Unknown Object (User) added a comment. Jun 11 2022, 15:40

I think Reception123 already did this yesterday, so it only needs a reboot on the servers that haven't been rebooted yet. Some major ones had to be rebooted because of the outage: db*, mon111, phab121, a single mw server (by me) and test101. So I think all of those are already done.
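To make "upgraded but not yet rebooted" checkable per host, here is a minimal sketch that compares the running kernel release with the newest installed one. It assumes Debian-style release strings such as `5.10.0-14-amd64`; the function names and the example versions are hypothetical and not taken from our hosts.

```python
import re

# Assumption: releases look like "5.10.0-14-amd64" (major.minor.patch-abi-flavour).
_RELEASE = re.compile(r"(\d+)\.(\d+)\.(\d+)-(\d+)")


def kernel_key(release):
    """Turn a release string into a tuple of ints so versions compare numerically."""
    match = _RELEASE.match(release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def needs_reboot(running, installed):
    """Return True when a newer kernel is installed than the one currently running."""
    newest = max(installed, key=kernel_key)
    return kernel_key(running) < kernel_key(newest)
```

On a real host, `running` could come from `platform.release()` and `installed` from the `vmlinuz-*` file names in `/boot`; that wiring is left out here because it depends on the host layout.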

But I would like to note, for the record, that upgrades should never be done the way they were this time. They should be done with unattended-upgrades, and never with apt upgrade on all servers at once, at least not without scheduling downtime. Because of that run, though, those servers will already have the upgraded kernel installed. I can do the rest of the MediaWiki servers today (mw*, mwtask111, and jobchron121).
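As an illustration of "not all servers at once", a small sketch of a staged rollout plan: one canary host first, then the remaining hosts in small batches. The canary step, the batch size and the function name are assumptions for illustration, not an agreed procedure.

```python
def plan_rollout(hosts, canary, batch_size=2):
    """Return a list of batches: the canary alone first, then the rest in small groups."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if canary not in hosts:
        raise ValueError(f"canary {canary!r} is not in the host list")
    # Sort the remaining hosts so the plan is deterministic between runs.
    rest = sorted(h for h in set(hosts) if h != canary)
    batches = [rest[i:i + batch_size] for i in range(0, len(rest), batch_size)]
    return [[canary]] + batches
```

For example, `plan_rollout(["mwtask111", "jobchron121", "test101", "mon111"], canary="test101")` puts test101 in its own first batch, so a bad upgrade shows up there before any other host is touched.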

I filed the task because Icinga alerted again. It was only released this morning.

And yes, there were numerous issues with how the upgrades were done early this morning.

Unknown Object (User) added a comment. Jun 11 2022, 15:43

I filed the task because Icinga alerted again. It was only released this morning.

Oh. I will do mw*, mwtask111, test101, and jobchron121 later then.

Unknown Object (User) added a comment. Jun 11 2022, 18:00

mw*, mwtask111, test101, and jobchron121 are now done.

Upgraded matomo101, prometheus101, mon111 and puppet111.

Upgraded phab121, ldap111, bast101, bast121 and mail121.

It feels like hosts should have been grouped by cloud server rather than done individually, as we need to reboot the physical hosts as well.
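A rough sketch of what grouping by cloud server could look like, so each physical host and its VMs are handled in one go. The VM-to-cloud-server mapping below is made up for illustration (`cloud-a` and `cloud-b` are hypothetical names); the real placement would have to come from Proxmox or our inventory.

```python
from collections import defaultdict


def group_by_cloud_server(placement):
    """Invert a {vm: cloud_server} mapping into {cloud_server: [vms]}, sorted for stable output."""
    groups = defaultdict(list)
    for vm, cloud_server in placement.items():
        groups[cloud_server].append(vm)
    return {cloud: sorted(vms) for cloud, vms in sorted(groups.items())}


# Hypothetical placement; the host names are from this task, the cloud servers are not.
example_placement = {
    "mon111": "cloud-a",
    "puppet111": "cloud-a",
    "phab121": "cloud-b",
    "mail121": "cloud-b",
}
```

With that mapping, `group_by_cloud_server(example_placement)` gives one entry per cloud server, which is the unit that actually has to be rebooted.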

Cloud servers can't be done without downtime can they?

The best bet, in my opinion, is to reboot them during the MW upgrade, as users already expect things to be broken.

Cloud servers can't be done without downtime can they?

Nope, we don’t have a fully redundant setup so there will be downtime.

The best bet, in my opinion, is to reboot them during the MW upgrade, as users already expect things to be broken.

@Paladox is this something you can facilitate?

Sure. Although note that Proxmox uses a different (customised) kernel.

We could also use that window to upgrade Proxmox? (https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_7.2)

Each VM would need to be stopped first to make sure there is no corruption. They would also have to be started manually after the cloud server is rebooted.
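The stop, reboot, start order described above can be sketched as a command plan. This only builds the command lists and runs nothing; it assumes the Proxmox `qm shutdown` and `qm start` subcommands plus `systemctl reboot` on the cloud server, and the VM IDs in the example are hypothetical.

```python
def reboot_plan(vmids):
    """Build the commands to run before and after rebooting one cloud server."""
    ordered = sorted(vmids)
    return {
        # Cleanly shut down every VM first so no guest is killed mid-write.
        "before_reboot": [["qm", "shutdown", str(vmid)] for vmid in ordered],
        "reboot": ["systemctl", "reboot"],
        # After the cloud server is back, VMs are started by hand in the same order.
        "after_reboot": [["qm", "start", str(vmid)] for vmid in ordered],
    }
```

For example, `reboot_plan([101, 100])["before_reboot"]` lists the shutdown commands for VMs 100 and 101. Keeping it as data rather than executing it means the plan can be reviewed before the window.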

It also now crosses my mind that Proxmox is installed on the HDDs of the server. Maybe not too relevant here, but we might want to look at moving it over to the SSDs.

In T9366#189952, @John wrote:

It also now crosses my mind that Proxmox is installed on the HDDs of the server. Maybe not too relevant here, but we might want to look at moving it over to the SSDs.

If you want to, could you do that during the window? I'm unsure how to do it, so I would like your help with that please :)

Unknown Object (User) closed this task as Resolved. Jun 16 2022, 05:17
Unknown Object (User) assigned this task to Paladox.
Unknown Object (User) moved this task from Incoming to Short Term on the Infrastructure (SRE) board.
Unknown Object (User) changed the visibility from "Custom Policy" to "Public (No Login Required)".
Unknown Object (User) changed the edit policy from "Custom Policy" to "All Users".