K8s kubespray cluster upgrade from Ubuntu 20.04 to 22.04
Introduction
It was time to upgrade my Kubernetes cluster, which I created with Kubespray years ago. The cluster was originally deployed on machines running Ubuntu 20.04 LTS.
Ubuntu 22.04 LTS was released recently, and I was wondering how to upgrade all my Ubuntu 20.04 LTS nodes without losing the current Kubespray Kubernetes deployment.
Depending on how your cluster nodes are configured, this guide may not cover all your needs, but you can still take the general idea from it. This is the scenario I covered:
5 Kubernetes cluster nodes with Ubuntu 20.04:
2 control-plane nodes (master) + etcd
1 worker node + etcd
2 worker nodes (no etcd)
Let's start, backup and disclaimer
First of all, one important thing: Kubespray is fragile. I repeat, Kubespray is extremely fragile; it can ruin the whole cluster deployment if some small detail collides with something hidden that you didn't expect. You need previous experience managing Kubespray and familiarity with Kubernetes to do this upgrade.
So, first, make a full backup of ALL OS disks, node by node, using Clonezilla.
If you don't back up all your nodes, you follow the steps below at your own risk of losing your cluster; upgrading the OS can fully destroy it. It worked for me, but that doesn't mean it will work for you. Please read the whole post before starting any step.
Before the cluster OS upgrade, all your nodes should be up and running correctly. If any of them isn't working well, the upgrade can break your cluster. First switch them all on, and if any of them doesn't join the cluster, fix that before anything else.
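One way to check this from a machine with kubectl access is to filter the node list for anything whose status is not exactly Ready. This is only a minimal sketch: the helper name not_ready_nodes is made up for illustration, and it assumes the default column layout of `kubectl get nodes --no-headers` (name in column 1, status in column 2).

```shell
# Hypothetical helper: print the names of nodes whose STATUS is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` output on stdin
# (assumed columns: NAME STATUS ROLES AGE VERSION).
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}

# Typical use on a machine with kubectl access:
#   kubectl get nodes --no-headers | not_ready_nodes
# Empty output means every node reports Ready.
```

A cordoned node (status "Ready,SchedulingDisabled") is also listed by this filter, which seems the safer behaviour before starting an OS upgrade.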
Cluster OS upgrade from Ubuntu 20.04 to Ubuntu 22.04
1) Update all your nodes to the latest Ubuntu 20.04 packages with this command:
sudo apt update -y && sudo apt upgrade -y && sudo apt autoremove && sudo reboot
2) Upgrade each of your nodes by following steps 3) to 11) below, in this specific order:
- First upgrade your worker nodes with no etcd.
- Second upgrade your worker nodes with etcd.
- Third upgrade your control-plane master nodes.
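The ordering above can be written down as a small planning sketch. The helper name upgrade_order, the role labels (worker, worker-etcd, control-plane) and the node names in the usage comment are all invented for illustration; you would label your own nodes by hand.

```shell
# Hypothetical helper: sort "name kind" lines into the upgrade order described above.
# kind is one of: worker, worker-etcd, control-plane (labels invented for this sketch).
upgrade_order() {
  awk '
    $2 == "worker"        { print 1, $1 }
    $2 == "worker-etcd"   { print 2, $1 }
    $2 == "control-plane" { print 3, $1 }
  ' | sort -k1,1n -k2,2 | awk '{ print $2 }'
}

# Example with made-up node names:
#   printf 'cp1 control-plane\nw1 worker\nwe1 worker-etcd\nw2 worker\ncp2 control-plane\n' | upgrade_order
```

Nodes with the same role are listed alphabetically, so the plan comes out the same on every run.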
IMPORTANT NOTES:
(A) In your Kubespray cluster inventory file hosts.yaml: when it is the turn of a control-plane master node to be removed, another master must be at the top of the list under the path below. For instance, if you are going to remove a node named nuc1 and nuc1 is first in the list, change the order and put another master, such as nuc2, at the top:
children:->kube-master->hosts:
(B) In your Kubespray cluster inventory file hosts.yaml, under:
children:->etcd:->hosts:
the number of etcd nodes must always be odd; adjust it when removing nodes to upgrade their OS.
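The reason for keeping the etcd count odd comes from quorum arithmetic: a cluster of N members needs a majority of N/2 + 1 (integer division) to keep working, so it tolerates (N - 1)/2 failures. The sketch below only illustrates that arithmetic; the function names are made up and nothing here talks to a real etcd cluster.

```shell
# Hypothetical helpers illustrating etcd quorum arithmetic for N members.
etcd_quorum() { echo $(( $1 / 2 + 1 )); }             # votes needed for a majority
etcd_fault_tolerance() { echo $(( ($1 - 1) / 2 )); }  # members that may fail

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(etcd_quorum "$n") tolerated_failures=$(etcd_fault_tolerance "$n")"
done
```

Going from 3 to 4 members raises the quorum from 2 to 3 without tolerating any additional failure, which is why an even count brings no benefit.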
3) On the node you want to upgrade, list and upgrade any remaining packages that weren't upgraded before:
sudo apt list --upgradable
In my case only docker-ce-cli was listed as not upgraded. Force its upgrade by running:
sudo apt upgrade docker-ce-cli
NOTE: if you have the RabbitMQ server package installed and running, it caused problems for me, so it is better to stop it and disable it at boot temporarily:
sudo -u rabbitmq rabbitmqctl stop
sudo update-rc.d -f rabbitmq-server remove
4) Back up these files from your Ubuntu 20.04 node; they can be useful later:
/etc/systemd/resolved.conf
/etc/sysctl.conf
/etc/sudoers
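As a minimal sketch, this backup could be scripted as follows. The helper name backup_files and the destination directory in the usage comment are assumptions, not part of the original procedure. It expects absolute file paths.

```shell
# Hypothetical helper: copy each given file (absolute path) into a backup
# directory, recreating its full path and preserving mode and timestamps.
backup_files() {
  dest="$1"; shift
  for f in "$@"; do
    mkdir -p "$dest$(dirname "$f")" && cp -p "$f" "$dest$f" || return 1
  done
}

# From a root shell on the node (/etc/sudoers is not world-readable);
# /root/pre-upgrade-backup is an assumed location:
#   backup_files /root/pre-upgrade-backup /etc/systemd/resolved.conf /etc/sysctl.conf /etc/sudoers
```

Copying the resulting directory off the node as well means the files remain available even if the upgrade leaves the node unbootable.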
5) Running the upgrade from an SSH terminal is not recommended (do it at your own risk): if something goes wrong or the upgrade gets stuck, it can be difficult to regain control of the node upgrade. The upgrader shows a warning message if you try it.
Now start upgrading your Ubuntu node from 20.04 to 22.04. Run this command from the console:
sudo do-release-upgrade
After that, you will probably see a prompt asking whether to continue. Answer 'y'.
During the Ubuntu upgrade, you will be asked whether or not to overwrite some config files. I answered ‘Y’ to all of them, and after the release upgrade I only needed to recover one line from my /etc/sudoers backup. Your case may differ, but you can always compare the files against your backup after the upgrade and recover any missing lines.
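To find lines that were lost when a config file was overwritten, the backup copy can be compared against the current file. The sketch below uses plain grep; the helper name lines_lost_after_upgrade and the backup path in the usage comment are assumptions for illustration.

```shell
# Hypothetical helper: print the lines present in the backup copy but missing
# from the current file (candidates to restore by hand).
lines_lost_after_upgrade() {
  backup="$1"; current="$2"
  # -v invert, -x whole-line match, -F fixed strings, -f read patterns from file
  grep -vxFf "$current" "$backup"
}

# From a root shell on the upgraded node (the backup path is an assumed location):
#   lines_lost_after_upgrade /root/pre-upgrade-backup/etc/sudoers /etc/sudoers
# `diff -u <backup> <current>` gives a fuller picture, including added lines.
# Restore sudoers lines with `visudo`, which checks the syntax before saving.
```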
6) When the node upgrade is reported as complete, shut the node down by running:
sudo halt -p
7) Now we are going to delete the node from Kubernetes. If you upgraded a master node or an etcd node, check the important notes in step 2) and adjust hosts.yaml if needed.
Go to the root directory of the Kubespray checkout on your Kubespray control machine.
In my case, the hosts definition file is inventory/mycluster/hosts.yaml and my node name is beelink1; change them in the next command if yours differ.
Run this command to temporarily remove the node from the existing cluster:
ansible-playbook -i inventory/mycluster/hosts.yaml -b remove-node.yml -e node=beelink1 -e reset_nodes=false -e allow_ungraceful_removal=true
It will ask you to confirm the node deletion from the cluster; answer yes.
8) Once the node has been removed from your cluster with Kubespray, power it on again.
9) If you are upgrading a master node or an etcd node, check the important notes in step 2) and adjust hosts.yaml if needed. Run scale to add your node back as it was defined before you removed it:
ansible-playbook -i inventory/mycluster/hosts.yaml -b scale.yml -v
10) Your node should now appear re-joined to your cluster and upgraded to Ubuntu 22.04. You can check it with:
kubectl get nodes
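To confirm the OS release as well as the Ready status, the node's reported OS image can be read from its status. This is a minimal sketch: the helper name is_ubuntu_2204 is made up, and it assumes the node reports an OS image string starting with "Ubuntu 22.04".

```shell
# Hypothetical helper: succeed if an OS image string reports Ubuntu 22.04.
is_ubuntu_2204() {
  case "$1" in
    "Ubuntu 22.04"*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a machine with kubectl access (beelink1 is the node name used in this post):
#   os="$(kubectl get node beelink1 -o jsonpath='{.status.nodeInfo.osImage}')"
#   is_ubuntu_2204 "$os" && echo "beelink1 is on $os"
# `kubectl get nodes -o wide` shows the same value in the OS-IMAGE column.
```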
11) As mentioned in step 2), repeat the previous steps for the next node, if any still remain to be upgraded.