Guide: Kubernetes Cluster Install Streamlined via Kubeadm

Kubernetes cluster installation guide

Last updated on September 3, 2024

The guide below streamlines the installation of the latest Kubernetes version at the time of writing, 1.30.3 (via kubeadm), by aggregating all the steps needed to get a cluster up and running. Links to the official documentation are provided so you can conveniently take a deeper dive into each installation step.

Helpful Tips Before You Start

  • You will need a minimum of 2 VMs (nodes) to follow this guide. I am using jammy-server-cloudimg-amd64.ova from Ubuntu cloud images. Please note, it is best practice to use a “non-node” machine to manage kubernetes clusters (i.e. running kubectl commands).
  • Keep in mind, all kubernetes node VMs must have unique MAC addresses, hostnames and product_uuids (see the quick check after this list). Therefore, using clones of VMs will not work without manual manipulation of the VM. Templating a VM is a better option.
  • If you are planning to provision several nodes, it will save time to use some type of multi-execution terminal such as MobaXterm or similar.
  • If you plan to create a cluster with one control-plane node AND want to expand the number of control-planes in the future, you need to create an endpoint for these VMs ahead of time (an IP address and/or DNS name). This endpoint must be passed as an argument when creating the cluster via kubeadm; kubeadm does not support adding a control-plane endpoint AFTER cluster creation. I suggest creating an A record if you have access to a DNS server and/or using a shareable virtual IP address (i.e. keepalived). You could also go all-out and use a load balancer with a shareable IP and a DNS name.
  • If you have any questions or feedback feel free to sign in and shoot me a message or leave a comment below. I would be more than happy to help.
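
To verify the uniqueness requirements mentioned above, you can compare the MAC address, product_uuid and hostname on each node (these checks come from the official kubeadm prerequisites):

# list network interfaces and their MAC addresses
ip link

# print this VM's product_uuid
sudo cat /sys/class/dmi/id/product_uuid

# print this VM's hostname
hostname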

Preparing Nodes

Source: Installing kubeadm | Kubernetes

Let’s start by preparing all the nodes of the cluster. We need to apply specific configurations for kubernetes to work with our VMs. I use MobaXterm to speed up this process using its multi-execution feature. It allows me to execute terminal commands to multiple terminals in one pass.

We need to set up a root password for root access on all nodes:

sudo passwd root

Update and upgrade all nodes:

sudo apt update
sudo apt upgrade

Disable swap permanently by commenting out (adding #) the swap line if one exists. REBOOT if you make any changes:

# open the file that controls swap mounts
sudo vi /etc/fstab
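
If you prefer to do this from the command line, here is a minimal sketch, assuming a standard Ubuntu cloud image with a single swap entry in /etc/fstab (verify the file afterwards):

# turn swap off for the current session
sudo swapoff -a

# comment out any uncommented swap entries in /etc/fstab so the change persists across reboots
sudo sed -i '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab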

Open Control Plane Ports (ufw)

In this step we open ports in our firewall to allow the components of kubernetes to communicate. Since our VM template deploys VMs with the firewall turned off, there are steps below to enable the firewall and check its status. Your firewall may already be enabled by default; I still encourage you to follow the guide step-by-step, as running the commands below will only ensure your firewall is up and running.

Open the following ports on all control-plane nodes (the etcd ports assume the cluster hosts its own etcd; port 22 is opened for SSH):

Protocol   Direction   Port Range    Purpose                   Used By
TCP        Inbound     6443          Kubernetes API server     All
TCP        Inbound     2379-2380     etcd server client API    kube-apiserver, etcd
TCP        Inbound     10250         Kubelet API               Self, Control plane
TCP        Inbound     10259         kube-scheduler            Self
TCP        Inbound     10257         kube-controller-manager   Self

Source: https://kubernetes.io/docs/reference/networking/ports-and-protocols/

sudo ufw allow 22,6443,10250,10259,10257,2379:2380/tcp

Enable ufw firewall service (Note: Our VM template’s firewall is disabled by default)

sudo ufw enable

Confirm ports are open:

sudo ufw status verbose

Output similar to:

To                                         Action      From
--                                         ------      ----
2379:2380,6443,10250,10257,10259/tcp       ALLOW IN    Anywhere
2379:2380,6443,10250,10257,10259/tcp (v6)  ALLOW IN    Anywhere (v6)

Enable and restart ufw.service (Note: Our VM template’s firewall is disabled by default):

sudo systemctl enable ufw
sudo systemctl restart ufw

Check that ufw.service is enabled and active:

sudo systemctl status ufw

Output similar to:

● ufw.service - Uncomplicated firewall
     Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
     Active: active (exited) since Thu 2024-07-18 06:59:54 UTC; 7h ago
       Docs: man:ufw(8)
    Process: 617 ExecStart=/lib/ufw/ufw-init start quiet (code=exited, status=0/SUCCESS)
   Main PID: 617 (code=exited, status=0/SUCCESS)
        CPU: 27ms

Then reboot the VM:

sudo reboot

Open worker-node Ports (ufw)

Open the following ports on all worker-node(s). Any planned custom ports should be opened as well (Note: port 22 added for ssh):

Protocol   Direction   Port Range    Purpose             Used By
TCP        Inbound     10250         Kubelet API         Self, Control plane
TCP        Inbound     10256         kube-proxy          Self, Load balancers
TCP        Inbound     30000-32767   NodePort Services   All

Source: https://kubernetes.io/docs/reference/networking/ports-and-protocols/

sudo ufw allow 22,10250,10256,30000:32767/tcp

Enable ufw firewall service (Note: Our VM template’s firewall is disabled by default):

sudo ufw enable

Confirm ports are open:

sudo ufw status verbose

Output should look similar to:

To                                Action      From
--                                ------      ----
10250,10256,30000:32767/tcp       ALLOW IN    Anywhere
10250,10256,30000:32767/tcp (v6)  ALLOW IN    Anywhere (v6)

Enable and restart ufw.service (Note: Our VM template’s firewall is disabled by default):

sudo systemctl enable ufw
sudo systemctl restart ufw

Check that ufw.service is enabled and active:

sudo systemctl status ufw

Output similar to:

● ufw.service - Uncomplicated firewall
     Loaded: loaded (/lib/systemd/system/ufw.service; enabled; vendor preset: enabled)
     Active: active (exited) since Thu 2024-07-18 06:59:54 UTC; 7h ago
       Docs: man:ufw(8)
    Process: 617 ExecStart=/lib/ufw/ufw-init start quiet (code=exited, status=0/SUCCESS)
   Main PID: 617 (code=exited, status=0/SUCCESS)
        CPU: 27ms

Then reboot the VM:

sudo reboot

Note: More ports may need to be opened on all nodes depending on the CNI used for the pod network (e.g. Flannel, Cilium, Calico). We will refer to the specific CNI documentation later in this guide.

Install & Configure Container Runtime (Containerd)

Source: Container Runtimes | Kubernetes

In Kubernetes, a container runtime is a foundational technology responsible for executing containers across a cluster. It handles tasks like image management, container execution, networking, and storage. Essentially, it provides the necessary isolation and resource management required by containers. Kubernetes supports various container runtimes, including containerd, CRI-O, and other implementations of the Container Runtime Interface (CRI).

I am installing containerd as the container runtime. Containerd must be installed on ALL NODES.

Prerequisites for Container Runtime Install

The parameters below ensure proper networking and IP forwarding behavior, which is essential for container runtimes in Kubernetes. Run these commands on all nodes.

# load overlay & br_netfilter at boot (persists across reboots)
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

# start overlay & br_netfilter for current session 
sudo modprobe overlay
sudo modprobe br_netfilter

# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system

Verify sysctl parameters are configured:

sysctl net.bridge.bridge-nf-call-iptables
sysctl net.bridge.bridge-nf-call-ip6tables
sysctl net.ipv4.ip_forward

Install Containerd Container Runtime

Containerd must be installed on all nodes. We start by adding Docker's official GPG key and repository (containerd.io is distributed through Docker's apt repository):

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the latest containerd container runtime:

sudo apt install containerd.io

Enable containerd & check it is enabled and active (running):

sudo systemctl daemon-reload
sudo systemctl enable --now containerd
systemctl status containerd

Output similar to:

● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-07-18 15:51:28 UTC; 2min 41s ago
       Docs: https://containerd.io
    Process: 13358 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
   Main PID: 13359 (containerd)
      Tasks: 8
     Memory: 12.9M
        CPU: 83ms
     CGroup: /system.slice/containerd.service
             └─13359 /usr/bin/containerd

Jul 18 15:51:28 kadm-wk-2-b containerd[13359]: time="2024-07-18T15:51:28.829114045Z" level=info msg="s>
...
Jul 18 15:51:28 kadm-wk-2-b systemd[1]: Started containerd container runtime.

Create containerd config.toml with default configuration:

containerd config default | sudo tee /etc/containerd/config.toml

Next, we need to configure the file /etc/containerd/config.toml by changing SystemdCgroup = false to SystemdCgroup = true:

sudo vi /etc/containerd/config.toml

Find the runc options section and set SystemdCgroup = true:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

Update sandbox_image (defaults to "registry.k8s.io/pause:3.8")

We need to change the pause image to at least 3.9 for kubernetes 1.30.x. The pause image in containerd serves as the base image for all containers. It’s a minimal, lightweight image that provides essential functionality for container initialization and orchestration.

At the time of this post, the latest pause image version is 3.10; however, the containerd apt package still defaults this image to 3.8. Find the sandbox_image line and set it to at least version 3.9 (the version kubeadm 1.30 expects):

# open config.toml in a text editor
sudo vi /etc/containerd/config.toml
...
sandbox_image = "registry.k8s.io/pause:3.9"

Restart containerd service:

sudo systemctl daemon-reload
sudo systemctl restart containerd

Installing kubeadm, kubectl and kubelet

Source: Installing kubeadm | Kubernetes

We are installing the latest versions (as of 07/16/2024) for kubernetes 1.30.3. All packages must be installed on ALL nodes (control-plane(s) and worker-node(s)). You may omit kubectl on the worker-nodes, but it does no harm to install it everywhere.

Version skew can create problems within kubernetes clusters. Be aware that package versions are designed to be backward compatible only by a small margin, and sometimes not at all. For example, kube-apiserver must always be at least as new as kubelet; a kubelet that is newer than kube-apiserver is not supported. We don't have to worry about this here since we are installing the latest versions of all packages. For more information regarding version skew: kubernetes version skew, kubeadm version skew.

Update the apt package index and install the packages needed to use the kubernetes apt repository:

sudo apt-get update
# apt-transport-https may be a dummy package; if so, you can skip that package
sudo apt-get install -y apt-transport-https ca-certificates curl gpg

Download public signing keys for kubernetes repos:

# If the directory /etc/apt/keyrings does not exist, create it before running the curl command:
# sudo mkdir -p -m 755 /etc/apt/keyrings

curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

Add kubernetes apt repository 1.30:

# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

Update the apt package index, install kubeadm, kubectl and kubelet, and hold their versions so future apt upgrades do not change them:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

Check the package versions; they should all be 1.30.x (or the latest version):

kubeadm version
kubectl version --client
kubelet --version

Enable kubelet service:

sudo systemctl enable --now kubelet
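
Note: until kubeadm init (or kubeadm join) gives it a configuration, the kubelet will restart every few seconds in a crash loop. That is expected at this point, and you can confirm the service is enabled with:

systemctl status kubelet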

Initializing kubernetes Cluster via Kubeadm

Source: Creating a cluster with kubeadm | Kubernetes

Before initializing the cluster, check whether any host machine has more than one default gateway configured:

ip route show # Look at "default via x.x.x.x"

If you have more than one default gateway, please follow the official documentation here and then continue on with this guide.

Initializing the control-plane Node for High Availability

Below is the configuration for creating a single control-plane node that can be expanded with more control-planes at any time. Passing --control-plane-endpoint during kubeadm init is what makes this possible. In other words, if you want the option to expand the number of control-planes in your cluster, you need to pass an assignable IP address or DNS name to the --control-plane-endpoint argument. Note: kubeadm does not support adding --control-plane-endpoint after a cluster is created.

I am using a highly available load balancer configuration with two HAProxy servers and a keepalived shared IP address. Another option is to pass a DNS name as the --control-plane-endpoint, allowing you to re-point the DNS name for high availability later.

Choose a Container Network Interface (CNI) based Pod Network Add-On

We need to choose a CNI pod network add-on to use with our cluster. CNIs are third-party add-ons and may have specific parameters that need to be passed during kubeadm init. We need to check out the CNI add-on documentation before running the kubeadm init command. I am using Calico in this setup which requires us to pass --pod-network-cidr=x.x.x.x/16 during kubeadm init (Calico Quickstart).

REMINDER: Check the CNI documentation for any additional ports that need to be opened.

Our calico deployment requires us to open up ports 179/TCP for BGP support, 4789/UDP for VXLAN, 5473/TCP for Typha communications, 9090/TCP for calico-node, 9081/TCP for Prometheus metrics, 9900/TCP for Prometheus BGP metrics and 9093/TCP Prometheus Alertmanager on ALL nodes. Let’s open them now:

# open ports
sudo ufw allow 179,5473,9090,9081,9900,9093/tcp
sudo ufw allow 4789/udp

# check ports open
ufw status

output: (this is from a control-plane; worker-nodes will show different ports open)
To                                        Action      From
--                                        ------      ----
2379:2380,6443,10250,10257,10259/tcp      ALLOW       Anywhere
22/tcp                                    ALLOW       Anywhere
179/tcp                                   ALLOW       Anywhere
179,9081,9090,9093,9900/tcp               ALLOW       Anywhere
4789/udp                                  ALLOW       Anywhere
2379:2380,6443,10250,10257,10259/tcp (v6) ALLOW       Anywhere (v6)
22/tcp (v6)                               ALLOW       Anywhere (v6)
179/tcp (v6)                              ALLOW       Anywhere (v6)
179,9081,9090,9093,9900/tcp (v6)          ALLOW       Anywhere (v6)
4789/udp (v6)                             ALLOW       Anywhere (v6)

# repeat steps for ALL nodes in the cluster

Initializing control-plane via Kubeadm

Now that we have our parameters for kubeadm init, it is time to initialize the cluster with the first control-plane.

kubeadm init --pod-network-cidr=172.244.0.0/16 --apiserver-advertise-address=10.10.0.63 --control-plane-endpoint=192.168.50.100

Where --pod-network-cidr is the IP address range assigned to pod/container networking, --apiserver-advertise-address is the network interface IP address of the control-plane node, and --control-plane-endpoint is the shareable IP address for future expansion of our cluster's control-planes (which currently points to our sole control-plane node).

The --control-plane-endpoint IP address (192.168.50.100) or DNS name should currently point to 10.10.0.63, aka the --apiserver-advertise-address. Later, you can point it to your load balancer's IP address after you add more control-planes.
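
As a side note, the same flags can also be expressed declaratively in a kubeadm configuration file and passed with kubeadm init --config. A minimal sketch, assuming the kubeadm.k8s.io/v1beta3 config API used by kubeadm 1.30 (adjust the addresses to your environment):

# kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.0.63                  # same as --apiserver-advertise-address
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.50.100:6443"     # same as --control-plane-endpoint
networking:
  podSubnet: "172.244.0.0/16"                   # same as --pod-network-cidr

# then initialize with:
kubeadm init --config kubeadm-config.yaml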

The output of a successful kubeadm init will look very similar to the following:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.50.100:6443 --token mq16ug.tja19txxxxxxxxxx \
        --discovery-token-ca-cert-hash sha256:7e21974c817367f1dc13894339243d0dc486f207d8c0xxxxxxxxxxxxxxxxxxxx \
        --control-plane

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.50.100:6443 --token mq16ug.tja19txxxxxxxxxx \
        --discovery-token-ca-cert-hash sha256:7e21974c817367f1dc13894339243d0dc486f207d8c0xxxxxxxxxxxxxxxxxxxx

Copy the kubeadm init output to a file and put it somewhere safe. For practice clusters, I put the file in a folder on the control-plane:

mkdir -p /$HOME/kubeadm_init
vi /$HOME/kubeadm_init/kubeadm_info
# paste the full output of kubeadm init in this file for later use.
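
If you ever lose this output, or the join token expires (tokens are valid for 24 hours by default), you can generate a fresh worker join command from the control-plane at any time:

kubeadm token create --print-join-command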

Set Up kubectl Access to the Cluster

The kubeadm init output provides the commands to set up the KUBECONFIG for kubectl access:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You can now access your kubernetes cluster using kubectl. Let’s test it out.

kubectl get nodes

output:
NAME              STATUS     ROLES           AGE     VERSION
control-plane-1   NotReady   control-plane   3h35m   v1.30.3

kubectl get pods -n kube-system

output:
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-7db6d8ff4d-gq7lb             0/1     Pending   0          3m
coredns-7db6d8ff4d-tn4b7             0/1     Pending   0          3m
etcd-kadm-cp-b1                      1/1     Running   1          3m
kube-apiserver-kadm-cp-b1            1/1     Running   1          3m
kube-controller-manager-kadm-cp-b1   1/1     Running   1          3m
kube-proxy-d5mlq                     1/1     Running   0          3m
kube-scheduler-kadm-cp-b1            1/1     Running   1          3m

Notice the output above shows the coredns pods in “STATUS=Pending”. This is because we still need to install the Container Network Interface (CNI) add-on that handles pod networking.

Container Network Interface (CNI): calico vs cilium vs flannel

For this guide we are using calico as the pod network add-on. The choice here may or may not be important depending on the needs of your cluster. I think calico is a good all-around CNI thanks to its modularity. It allows you to deploy various configurations: lightweight, heavyweight or anywhere in between.

You could go with a CNI behemoth like cilium. It has a ton of features for visibility, security, metrics and performance. However, with a ton of features comes complexity and more points of failure.

Flannel is on the other end of the spectrum: it is simple and lightweight. It works with most clusters with only one or two CLI commands and no further configuration after deployment. Flannel does lack many of the features mentioned above, which also makes it a good place to start when learning kubernetes.

Calico falls somewhere in between these two CNIs thanks to its modularity. It has a lot of features, but you can choose to exclude almost all of them for a lightweight deployment. Plus, its github repository has many more contributors than the others, which I think is a huge benefit. Whichever CNI you decide to use is up to you; for this guide I am using calico.

Install calico Pod Network Add-On

Source: Quickstart for Calico on Kubernetes | Calico Documentation (tigera.io)

Install the Tigera calico operator and custom resource definitions (CRDs):

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.1/manifests/tigera-operator.yaml

output:
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
...
clusterrole.rbac.authorization.k8s.io/tigera-operator created
clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created
deployment.apps/tigera-operator created

We need to create the necessary custom resource to get calico pod networking up and running. Before we do that, we need to make sure the pod CIDR we passed to kubeadm init (--pod-network-cidr=172.244.0.0/16) matches the CIDR defined in calico's deployment manifest. So let's download the manifest and check it out:

# make a directory for the calico manifest download
mkdir -p /$HOME/calico

# download the custom-resources.yaml manifest and save as yaml file
curl https://raw.githubusercontent.com/projectcalico/calico/v3.28.1/manifests/custom-resources.yaml > /$HOME/calico/custom-resources.yaml

Great! Now that we have the manifest (file) we can check its contents for the CIDR key:

vi /$HOME/calico/custom-resources.yaml

output:
# This section includes base Calico installation configuration.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    ipPools:
    - name: default-ipv4-ippool
      blockSize: 26
      cidr: 192.168.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()

---

# This section configures the Calico API server.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
  name: default
spec: {}

As you can see above, the CIDR does not match the pod CIDR we passed with kubeadm init. Had it matched, that would have been a crazy coincidence. We need to change the CIDR to match our passed argument:

# change the CIDR value from 192.168.0.0/16 to 172.244.0.0/16 (the pod CIDR we passed to kubeadm init)
cidr: 172.244.0.0/16
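
This edit can also be scripted; a small sketch assuming the default custom-resources.yaml downloaded above (re-open the file afterwards to confirm the change):

sed -i 's|cidr: 192.168.0.0/16|cidr: 172.244.0.0/16|' /$HOME/calico/custom-resources.yaml
grep cidr /$HOME/calico/custom-resources.yaml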

Everything else looks good; time to apply the manifest to our cluster. This manifest will create the necessary objects to complete our pod networking stack:

kubectl create -f /$HOME/calico/custom-resources.yaml

Let's take a look at what the calico manifests have created so far:

kubectl get all -n calico-system

output:
NAME                                           READY   STATUS    RESTARTS   AGE
pod/calico-kube-controllers-664cbd5567-sq9x4   1/1     Running   0          1h
pod/calico-node-s6mcz                          1/1     Running   0          31m
pod/calico-node-t7gqv                          1/1     Running   0          31m
pod/calico-node-vtw4r                          1/1     Running   0          31m
pod/calico-node-wqhj7                          1/1     Running   0          31m
pod/calico-typha-5d84568fcf-8tncs              1/1     Running   0          1h
pod/calico-typha-5d84568fcf-sl8q6              1/1     Running   0          1h
pod/csi-node-driver-2mjnm                      2/2     Running   0          1h
pod/csi-node-driver-67t8l                      2/2     Running   0          1h
pod/csi-node-driver-tq97d                      2/2     Running   0          1h
pod/csi-node-driver-v29l8                      2/2     Running   0          1h

NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/calico-kube-controllers-metrics   ClusterIP   None             <none>        9094/TCP   17h
service/calico-typha                      ClusterIP   10.104.254.103   <none>        5473/TCP   17h

NAME                             DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
daemonset.apps/calico-node       4         4         4       4            4           kubernetes.io/os=linux   17h
daemonset.apps/csi-node-driver   4         4         4       4            4           kubernetes.io/os=linux   17h

NAME                                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/calico-kube-controllers   1/1     1            1           1h
deployment.apps/calico-typha              2/2     2            2           1h

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/calico-kube-controllers-664cbd5567   1         1         1       1h
replicaset.apps/calico-typha-5d84568fcf              2         2         2       1h

Calico created many new objects in our cluster, including some not displayed in the output above, such as cluster roles, cluster role bindings, service accounts, etc. Please note it can take over 10 minutes for the calico objects to get up and running. This is particularly true for the csi-node-driver-xxxx pods. Wait for these pods to reach “STATUS=Running”, then continue.

Remove Any Lingering Taints

Before moving forward, let's check our control-plane for taints and remove any we find (removing the control-plane taint allows regular pods to be scheduled on it):

# check for taints
kubectl describe nodes | grep -i taints

output:
Taints:             node-role.kubernetes.io/control-plane
# remove taints with command below
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Install calicoctl

Source: Install calicoctl | Calico Documentation (tigera.io)

As mentioned previously, calico has a lot of features and modules that can be used to enhance our kubernetes environment. To manage and perform administrative functions for calico, we need to install calicoctl. Let’s do that before joining our worker-nodes to the cluster.

There are three ways to deploy calicoctl into our environment. For this guide, we will deploy it as a binary on a single host: the control-plane. Run the next few commands on the control-plane only.

First, list the PATH directories on the host:

echo $PATH

output:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin

We are going to place the calicoctl binary in one of the PATH locations. Navigate to that location in the terminal:

cd /usr/local/bin

Let’s download the calicoctl binary to this location:

curl -L https://github.com/projectcalico/calico/releases/download/v3.28.1/calicoctl-linux-amd64 -o calicoctl

output:
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 64.4M  100 64.4M    0     0  21.2M      0  0:00:03  0:00:03 --:--:-- 40.2M

Lastly, let's make the file executable and confirm our terminal can find calicoctl:

chmod +x ./calicoctl

calicoctl version

output:
root@kadm-cp-b1:~# calicoctl version
Client Version:    v3.28.1
Git commit:        413e6f559
Cluster Version:   v3.28.1
Cluster Type:      typha,kdd,k8s,operator,bgp,kubeadm

Great, calicoctl is working as expected. If you want to learn more about calico you can visit their website; they have extensive documentation, tutorials and AI to help you understand and deploy a robust pod network. For now, let's finish creating our kubernetes cluster by adding our 3 worker-nodes.
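
One usage note: calicoctl needs to be told which datastore to talk to. We are using the Kubernetes API datastore (the "kdd" cluster type shown in the version output), so one way to run commands, sketched here with an assumed kubeconfig path, is via environment variables:

# point calicoctl at the kubernetes API datastore using your kubeconfig
DATASTORE_TYPE=kubernetes KUBECONFIG=/$HOME/.kube/config calicoctl get nodes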

Join worker-nodes to the Cluster

Previously, we saved our output from kubeadm init to /$HOME/kubeadm_init/kubeadm_info. We need to access that file to get the kubeadm join command for worker-nodes. Note, there is a separate kubeadm join command for control-plane nodes:

cat /$HOME/kubeadm_init/kubeadm_info

output:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
...
Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.50.100:6443 --token mq16ug.tja19tstju4os173 \
        --discovery-token-ca-cert-hash sha256:7e21974c817367f1dc13894339243d0dc486f207d8c0xxxxxxxxxxxxxxxxxxxx

Great! We now have the join command for our worker-nodes. Let's ssh into (or open a terminal on) each worker-node and run the kubeadm join command. First, we need to log in as root, as root privileges are required to join worker-nodes to the cluster:

su
Password:
root@worker-node-1:/home/ubuntu#

While logged in as the root user, run the kubeadm join command we copied from our file:

kubeadm join 192.168.50.100:6443 --token mq16ug.tja19tstju4os173 \
--discovery-token-ca-cert-hash sha256:7e21974c817367f1dc13894339243d0dc486f207d8c0xxxxxxxxxxxxxxxxxxxx

Joining a worker-node generally takes only a few seconds. Let's check how it went. Go back to your control-plane terminal and run:

kubectl get nodes

output:
NAME               STATUS   ROLES           AGE   VERSION
control-plane-1    Ready    control-plane   1h    v1.30.3
worker-node-1      Ready    <none>          68s   v1.30.3

Awesome, the worker-node has joined the cluster and is “STATUS=Ready”. Repeat the steps above for the rest of the worker-nodes. Remember, you need to log in as the root user when running the kubeadm join commands, then go back to your kubernetes admin terminal (in our case the control-plane) to run kubectl get nodes.

If for some reason one of your nodes is “STATUS=NotReady”, ssh into that node and run:

systemctl daemon-reload
systemctl restart kubelet
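
If the node still shows NotReady after that, the kubelet logs on the node usually point to the cause (a common one is the CNI pods not yet running on that node):

# show the last 50 kubelet log lines on the affected node
sudo journalctl -u kubelet --no-pager -n 50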

Now go back to your kubernetes admin-terminal (control-plane terminal) and check your nodes. Ensure they are all “STATUS=Ready”.

kubectl get nodes

output:
NAME               STATUS   ROLES           AGE     VERSION
control-plane-1    Ready    control-plane   1h      v1.30.3
worker-node-1      Ready    <none>          2m23s   v1.30.3
worker-node-2      Ready    <none>          84s     v1.30.3
worker-node-3      Ready    <none>          71s     v1.30.3

Reminder: Best practice for managing the cluster is to use a kubernetes admin terminal that is not part of the cluster. You can deploy another VM that is solely used to administer the cluster. Access to the cluster is achieved by copying /$HOME/.kube/config from the control-plane host and placing it in the kubernetes admin terminal's /$HOME/.kube/ directory. Then install kubectl on the admin terminal and you should be good to go. Check out the official documentation for more details on setting KUBECONFIG and cluster authentication.
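
A quick sketch of that copy step (the user and hostname below are placeholders for your environment):

# run on the admin machine; replace ubuntu@control-plane-1 with your control-plane's user and address
mkdir -p $HOME/.kube
scp ubuntu@control-plane-1:/home/ubuntu/.kube/config $HOME/.kube/config

# verify access from the admin machine
kubectl get nodes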

Conclusion

That concludes the kubernetes cluster setup guide. I hope this guide was useful to you. Should you have any questions or feedback, please leave it in the comments below or shoot me a message.
