Introduction
In the articles below, I built a cluster of three Raspberry Pis that contained both the control plane and the workers.
I also summarized HA (High Availability) configurations in the article below.
As the next step, I am building an HA (High Availability) home Kubernetes cluster whose control plane consists of three Raspberry Pis.
For the worker, I use a single x86 (amd64) PC. Building a cluster on three Raspberry Pis was very useful for understanding how Kubernetes clusters are put together, but once you start actually deploying applications, an x86 worker is more practical simply because far more published Docker images are available for that architecture. (Of course, if your goal is specifically to run Kubernetes on Raspberry Pi/ARM, the Raspberry Pis are perfectly fine.)
Strictly speaking, a proper HA setup would also want at least three worker nodes, but this build uses only one, so I call it a semi-HA configuration. Even so, worker nodes can be added to it later as-is, so the build procedure is the same as for a full HA configuration. I hope to add more workers eventually.
Note that an HA configuration needs a load balancer (LB), because requests have to be directed to whichever control plane node is available.
Main references
I mainly referred to the following pages while doing this work.
Overall configuration
The setup built here looks like the following. (It is modeled on the diagram from the official page above; a rough text sketch follows the list below.)
Users access the load balancer (192.168.1.100:6443), i.e. kubectl is run against it.
The cluster uses three Raspberry Pis, one PC, and one router running OpenWrt.
- 192.168.1.100 OpenWrt (BUFFALO WZR-1750DHP)
- 192.168.1.201 k8s-ctrl1 (RaspberryPi 4B 8GB RAM)
- 192.168.1.202 k8s-ctrl2 (RaspberryPi 4B 8GB RAM)
- 192.168.1.203 k8s-ctrl3 (RaspberryPi 4B 8GB RAM)
- 192.168.1.211 k8s-worker1 (ASUS Chromebox3 i7-8550U 16GB RAM, 256GB NVMe SSD)
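A rough text sketch of the topology (the original figure is based on the official kubeadm HA diagram; this only restates the addresses listed above):

kubectl (user)
      |
      v
 load balancer  192.168.1.100:6443  (OpenWrt + haproxy)
      |
      +--> k8s-ctrl1  192.168.1.201:6443  (control plane / etcd)
      +--> k8s-ctrl2  192.168.1.202:6443  (control plane / etcd)
      +--> k8s-ctrl3  192.168.1.203:6443  (control plane / etcd)

 k8s-worker1  192.168.1.211  (worker, x86)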
Software versions
- Ubuntu 22.04
- Kubernetes v1.29 (v1.29.4-2.1)
- Calico v3.27.0
Load balancer
The OpenWrt package feed includes haproxy, so I installed OpenWrt on a small Buffalo router and use it as the load balancer. The load balancer's IP address is 192.168.1.100. As described in the article below, I installed OpenWrt on the router and then installed and configured haproxy with opkg.
Several of the references build the load balancer from haproxy combined with keepalived, but I did not choose that and went with the simpler haproxy-only setup.
As I understand it, keepalived exists to make the load balancer itself redundant. As the official page also notes, how the load balancer is built is not directly part of Kubernetes itself. Since my goal here is mainly to catch up on and understand Kubernetes, I judged that a plain haproxy-only setup is sufficient. And with this small router plus OpenWrt I expect it to run fairly reliably anyway.
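For reference, a minimal haproxy configuration for this role could look like the sketch below. It is not the exact file I used: the /etc/haproxy.cfg path and the timeout values are assumptions, and only the parts relevant to forwarding the API server port are shown. The essential idea is a plain TCP pass-through of port 6443 to the three kube-apiservers, with health checks.

# on the OpenWrt router
opkg update && opkg install haproxy

# /etc/haproxy.cfg (sketch)
defaults
    mode    tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

frontend k8s-api
    bind *:6443
    default_backend k8s-api-servers

backend k8s-api-servers
    balance roundrobin
    server k8s-ctrl1 192.168.1.201:6443 check
    server k8s-ctrl2 192.168.1.202:6443 check
    server k8s-ctrl3 192.168.1.203:6443 check

Because haproxy runs in TCP mode it does not terminate TLS; the apiserver certificates (which later include 192.168.1.100 thanks to --control-plane-endpoint) are presented to clients unchanged.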
Control plane
- Hardware setup
- OS installation
- Basic OS configuration
- Installing containerd and Kubernetes
Set up the machines in the same way as in the articles below.
hostnamectl set-hostname k8s-ctrl1   # on the first node
hostnamectl set-hostname k8s-ctrl2   # on the second node
hostnamectl set-hostname k8s-ctrl3   # on the third node
The /etc/hosts file was set up as follows.
cat << _EOF_ | sudo tee -a /etc/hosts
192.168.1.201 k8s-ctrl1
192.168.1.202 k8s-ctrl2
192.168.1.203 k8s-ctrl3
192.168.1.211 k8s-worker1
_EOF_
Worker
Set up the worker machine as follows.
- Hardware setup
- OS installation
- Basic OS configuration
- Installing containerd and Kubernetes
The choice of machine and the concrete setup steps are described in the articles below.
kubeadm init (for the first control plane node)
Running kubeadm init
Run it on k8s-ctrl1. For --control-plane-endpoint, specify the load balancer's IP address.
$ sudo kubeadm init --control-plane-endpoint "192.168.1.100:6443" --upload-certs
Output of kubeadm init
The output of kubeadm init tells you what to do next, so be sure to save it. It covers:
- the settings needed to start using the cluster
- deploying the pod network (CNI)
- how to join the remaining control plane nodes
- what to do when the uploaded certificates (upload-certs) have expired
- how to join worker nodes
wurly@k8s-ctrl1:~$ sudo kubeadm init --control-plane-endpoint "192.168.1.100:6443" --upload-certs I0506 22:26:35.713265 1438 version.go:256] remote version is much newer: v1.30.0; falling back to: stable-1.29 [init] Using Kubernetes version: v1.29.4 [preflight] Running pre-flight checks [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action in beforehand using 'kubeadm config images pull' W0506 22:26:36.858967 1438 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "re gistry.k8s.io/pause:3.9" as the CRI sandbox image. [certs] Using certificateDir folder "/etc/kubernetes/pki" [certs] Generating "ca" certificate and key [certs] Generating "apiserver" certificate and key [certs] apiserver serving cert is signed for DNS names [k8s-ctrl1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.201 192.168.1.100] [certs] Generating "apiserver-kubelet-client" certificate and key [certs] Generating "front-proxy-ca" certificate and key [certs] Generating "front-proxy-client" certificate and key [certs] Generating "etcd/ca" certificate and key [certs] Generating "etcd/server" certificate and key [certs] etcd/server serving cert is signed for DNS names [k8s-ctrl1 localhost] and IPs [192.168.1.201 127.0.0.1 ::1] [certs] Generating "etcd/peer" certificate and key [certs] etcd/peer serving cert is signed for DNS names [k8s-ctrl1 localhost] and IPs [192.168.1.201 127.0.0.1 ::1] [certs] Generating "etcd/healthcheck-client" certificate and key [certs] Generating "apiserver-etcd-client" certificate and key [certs] Generating "sa" key and public key [kubeconfig] Using kubeconfig folder "/etc/kubernetes" [kubeconfig] Writing "admin.conf" kubeconfig file [kubeconfig] Writing "super-admin.conf" kubeconfig file [kubeconfig] Writing "kubelet.conf" kubeconfig file [kubeconfig] Writing "controller-manager.conf" kubeconfig file [kubeconfig] Writing "scheduler.conf" kubeconfig file [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" [control-plane] Using manifest folder "/etc/kubernetes/manifests" [control-plane] Creating static Pod manifest for "kube-apiserver" [control-plane] Creating static Pod manifest for "kube-controller-manager" [control-plane] Creating static Pod manifest for "kube-scheduler" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Starting the kubelet [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". 
This can take up to 4m0s [apiclient] All control plane components are healthy after 18.544814 seconds [upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace [kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace [upload-certs] Using certificate key: cf6b62a26809ce3e4126c782badb0853e02d97dab46f90d7e895dd96ac1b3a1d [mark-control-plane] Marking the node k8s-ctrl1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] [mark-control-plane] Marking the node k8s-ctrl1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule] [bootstrap-token] Using token: hdie35.u9airq6ychkt8amq [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes [bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials [bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token [bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster [bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace [kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key [addons] Applied essential addon: CoreDNS [addons] Applied essential addon: kube-proxy Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Alternatively, if you are the root user, you can run: export KUBECONFIG=/etc/kubernetes/admin.conf You should now deploy a pod network to the cluster. Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of the control-plane node running the following command on each as root: kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a \ --control-plane --certificate-key cf6b62a26809ce3e4126c782badb0853e02d97dab46f90d7e895dd96ac1b3a1d Please note that the certificate-key gives access to cluster sensitive data, keep it secret! As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use "kubeadm init phase upload-certs --upload-certs" to reload certs afterward. Then you can join any number of worker nodes by running the following on each as root: kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a
Settings to start using the cluster
First, run the following on k8s-ctrl1, as instructed.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Confirm that kubectl works. At this point the coredns pods are still Pending.
wurly@k8s-ctrl1:~$ kubectl get pod -n kube-system -w
NAME                                READY   STATUS    RESTARTS   AGE
coredns-76f75df574-mzng7            0/1     Pending   0          4m9s
coredns-76f75df574-t265x            0/1     Pending   0          4m9s
etcd-k8s-ctrl1                      1/1     Running   0          4m13s
kube-apiserver-k8s-ctrl1            1/1     Running   0          4m18s
kube-controller-manager-k8s-ctrl1   1/1     Running   0          4m13s
kube-proxy-d64kt                    1/1     Running   0          4m9s
kube-scheduler-k8s-ctrl1            1/1     Running   0          4m13s
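It is also worth confirming that kubectl is really going through the load balancer rather than a single node. Because --control-plane-endpoint was set to the load balancer, the server URL in the copied admin.conf should point at 192.168.1.100:6443; a quick check:

# both should report the API endpoint as https://192.168.1.100:6443
kubectl cluster-info
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'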
Installing the CNI (Calico)
To match Kubernetes v1.29, I checked the statement below and use Calico v3.27.
We test Calico v3.27 against the following Kubernetes versions. Other versions may work, but we are not actively testing them.
- v1.27
- v1.28
- v1.29
It can be deployed with the following command.
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
The output is shown below.
wurly@k8s-ctrl1:~$ kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml poddisruptionbudget.policy/calico-kube-controllers created serviceaccount/calico-kube-controllers created serviceaccount/calico-node created serviceaccount/calico-cni-plugin created configmap/calico-config created customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created clusterrole.rbac.authorization.k8s.io/calico-node created clusterrole.rbac.authorization.k8s.io/calico-cni-plugin created clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created clusterrolebinding.rbac.authorization.k8s.io/calico-node created clusterrolebinding.rbac.authorization.k8s.io/calico-cni-plugin created daemonset.apps/calico-node created deployment.apps/calico-kube-controllers created
The calico and coredns containers start to be created.
wurly@k8s-ctrl1:~$ kubectl get pod -n kube-system
NAME                                       READY   STATUS              RESTARTS   AGE
calico-kube-controllers-5fc7d6cf67-qn5xn   0/1     ContainerCreating   0          61s
calico-node-zz6c7                          0/1     Init:2/3            0          61s
coredns-76f75df574-mzng7                   0/1     ContainerCreating   0          10m
coredns-76f75df574-t265x                   0/1     ContainerCreating   0          10m
etcd-k8s-ctrl1                             1/1     Running             0          10m
kube-apiserver-k8s-ctrl1                   1/1     Running             0          10m
kube-controller-manager-k8s-ctrl1          1/1     Running             0          10m
kube-proxy-d64kt                           1/1     Running             0          10m
kube-scheduler-k8s-ctrl1                   1/1     Running             0          10m
calico-kube-controllers and coredns stuck in ContainerCreating
However, calico-kube-controllers and coredns stayed in ContainerCreating no matter how long I waited.
wurly@k8s-ctrl1:~$ kubectl get pod -n kube-system
NAME                                       READY   STATUS              RESTARTS        AGE
calico-kube-controllers-5fc7d6cf67-qn5xn   0/1     ContainerCreating   0               6m26s
calico-node-zz6c7                          1/1     Running             0               6m26s
coredns-76f75df574-mzng7                   0/1     ContainerCreating   0               15m
coredns-76f75df574-t265x                   0/1     ContainerCreating   0               15m
etcd-k8s-ctrl1                             1/1     Running             0               15m
kube-apiserver-k8s-ctrl1                   1/1     Running             0               15m
kube-controller-manager-k8s-ctrl1          1/1     Running             1 (4m42s ago)   15m
kube-proxy-d64kt                           1/1     Running             0               15m
kube-scheduler-k8s-ctrl1                   1/1     Running             1 (4m40s ago)   15m
$ kubectl describe pod calico-kube-controllers-5fc7d6cf67-t24hh
(略) Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 16m default-scheduler Successfully assigned kube-system/cali Warning FailedCreatePodSandBox 16m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 16m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 16m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 16m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 16m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 15m kubelet Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported Warning FailedCreatePodSandBox 98s (x58 over 14m) kubelet (combined from similar events): Failed8a69": plugin type="calico" failed (add): failed to create host netlink handle: protocol not supported
The pods apparently cannot be created for the following reason.
Failed to create pod sandbox: rpc erroed (add): failed to create host netlink handle: protocol not supported
Fixing Calico not coming up (installing linux-modules-extra-raspi)
Searching for the message above turned up the following report, which looked like a match.
I have been chasing down this issue on my 7 node stack. Not sure if you got the same problem but i never got any containers up. Found out that in Ubuntu 21.10 i had to install sudo apt install linux-modules-extra-raspi after stop and start it came up and working! 🙂
I tried installing linux-modules-extra-raspi, and...
wurly@k8s-ctrl1:~$ sudo apt install linux-modules-extra-raspi
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
linux-modules-extra-5.15.0-1053-raspi
The following NEW packages will be installed:
linux-modules-extra-5.15.0-1053-raspi linux-modules-extra-raspi
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 19.7 MB of archives.
After this operation, 98.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 linux-modules-extra-5.15.0-1053-raspi arm64 5.15.0-1053.56 [19.7 MB]
Get:2 http://ports.ubuntu.com/ubuntu-ports jammy-updates/main arm64 linux-modules-extra-raspi arm64 5.15.0.1053.50 [2390 B]
Fetched 19.7 MB in 9s (2091 kB/s)
Selecting previously unselected package linux-modules-extra-5.15.0-1053-raspi.
(Reading database ... 102389 files and directories currently installed.)
Preparing to unpack .../linux-modules-extra-5.15.0-1053-raspi_5.15.0-1053.56_arm64.deb ...
Unpacking linux-modules-extra-5.15.0-1053-raspi (5.15.0-1053.56) ...
Selecting previously unselected package linux-modules-extra-raspi.
Preparing to unpack .../linux-modules-extra-raspi_5.15.0.1053.50_arm64.deb ...
Unpacking linux-modules-extra-raspi (5.15.0.1053.50) ...
Setting up linux-modules-extra-5.15.0-1053-raspi (5.15.0-1053.56) ...
Setting up linux-modules-extra-raspi (5.15.0.1053.50) ...
Processing triggers for linux-image-5.15.0-1053-raspi (5.15.0-1053.56) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-5.15.0-1053-raspi
Using DTB: bcm2711-rpi-4-b.dtb
Installing /lib/firmware/5.15.0-1053-raspi/device-tree/broadcom/bcm2711-rpi-4-b.dtb into /boot/dtbs/5.15.0-1053-raspi/./bcm2711-rpi-4-b.dtb
Taking backup of bcm2711-rpi-4-b.dtb.
Installing new bcm2711-rpi-4-b.dtb.
flash-kernel: deferring update (trigger activated)
/etc/kernel/postinst.d/zz-flash-kernel:
Using DTB: bcm2711-rpi-4-b.dtb
Installing /lib/firmware/5.15.0-1053-raspi/device-tree/broadcom/bcm2711-rpi-4-b.dtb into /boot/dtbs/5.15.0-1053-raspi/./bcm2711-rpi-4-b.dtb
Taking backup of bcm2711-rpi-4-b.dtb.
Installing new bcm2711-rpi-4-b.dtb.
flash-kernel: deferring update (trigger activated)
Processing triggers for flash-kernel (3.104ubuntu20) ...
Using DTB: bcm2711-rpi-4-b.dtb
Installing /lib/firmware/5.15.0-1053-raspi/device-tree/broadcom/bcm2711-rpi-4-b.dtb into /boot/dtbs/5.15.0-1053-raspi/./bcm2711-rpi-4-b.dtb
Taking backup of bcm2711-rpi-4-b.dtb.
Installing new bcm2711-rpi-4-b.dtb.
flash-kernel: installing version 5.15.0-1053-raspi
Taking backup of vmlinuz.
Installing new vmlinuz.
Taking backup of initrd.img.
Installing new initrd.img.
Taking backup of uboot_rpi_arm64.bin.
Installing new uboot_rpi_arm64.bin.
Taking backup of uboot_rpi_4.bin.
Installing new uboot_rpi_4.bin.
Taking backup of uboot_rpi_3.bin.
Installing new uboot_rpi_3.bin.
Generating boot script u-boot image... done.
Taking backup of boot.scr.
Installing new boot.scr.
Taking backup of start4.elf.
Installing new start4.elf.
Taking backup of fixup4db.dat.
Installing new fixup4db.dat.
Taking backup of start.elf.
Installing new pca953x.dtbo.
(snip)
Taking backup of iqaudio-dacplus.dtbo.
Installing new iqaudio-dacplus.dtbo.
Taking backup of hifiberry-dac.dtbo.
Installing new hifiberry-dac.dtbo.
Taking backup of spi-rtc.dtbo.
Installing new spi-rtc.dtbo.
Taking backup of spi2-1cs.dtbo.
Installing new spi2-1cs.dtbo.
Taking backup of cap1106.dtbo.
Installing new cap1106.dtbo.
Taking backup of w5500.dtbo.
Installing new w5500.dtbo.
Taking backup of minipitft13.dtbo.
Installing new minipitft13.dtbo.
Taking backup of README.
Installing new README.
Scanning processes... Scanning processor microcode... Scanning linux images...
Running kernel seems to be up-to-date.
Failed to check for processor microcode upgrades.
No services need to be restarted.
No containers need to be restarted.
No user sessions are running outdated binaries.
No VM guests are running outdated hypervisor (qemu) binaries on this host.
While the installation was still running, all pods went to Running. Excellent!
$ kubectl get pod -n kube-system
NAME                                       READY   STATUS    RESTARTS      AGE
calico-kube-controllers-5fc7d6cf67-t24hh   1/1     Running   0             32m
calico-node-dq6xq                          1/1     Running   0             27m
coredns-76f75df574-mzng7                   1/1     Running   0             58m
coredns-76f75df574-t265x                   1/1     Running   0             58m
etcd-k8s-ctrl1                             1/1     Running   0             58m
kube-apiserver-k8s-ctrl1                   1/1     Running   0             58m
kube-controller-manager-k8s-ctrl1          1/1     Running   1 (48m ago)   58m
kube-proxy-d64kt                           1/1     Running   0             58m
kube-scheduler-k8s-ctrl1                   1/1     Running   1 (47m ago)   58m
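I did not dig into exactly which kernel module had been missing. If you want to see what the extra package provides for the running kernel, and that Calico is now healthy, something along these lines works (the package name is derived from uname -r, matching the package installed above):

# module files shipped for the running kernel by the extra package
dpkg -L linux-modules-extra-$(uname -r) | grep '\.ko' | wc -l

# Calico pods should now be Running
kubectl get pod -n kube-system -l k8s-app=calico-node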
Later, while searching for linux-modules-extra-raspi, I also came across this article.
kubeadm join (for the remaining control plane nodes)
Running kubeadm join --control-plane (k8s-ctrl2)
Run it on k8s-ctrl2.
$ sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \
    --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a \
    --control-plane --certificate-key cf6b62a26809ce3e4126c782badb0853e02d97dab46f90d7e895dd96ac1b3a1d
When the certificates cannot be found (have been deleted)
When I ran it, I got the message below. This happens when some time passes between running kubeadm init and running kubeadm join.
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace Secret "kubeadm-certs" was not found in the "kube-system" Namespace. This Secret might have expired. Please, run `kubeadm init phase upload-certs --upload-certs` on a control plane to generate a new one
Re-uploading the certificates (upload-certs)
Following the instructions, re-upload the certificates on k8s-ctrl1.
wurly@k8s-ctrl1:~$ sudo kubeadm init phase upload-certs --upload-certs
[sudo] password for wurly:
I0507 08:03:31.056618 32741 version.go:256] remote version is much newer: v1.30.0; falling back to: stable-1.29
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
0626e7a06e87e81e158ac7d9d1bb1f4f8adbcd29b0ff34659de706ec533cf105
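Note that only the certificate key changes here; the --discovery-token-ca-cert-hash value stays the same because it is derived from the cluster CA. If that hash ever gets lost too, it can be recomputed on a control plane node with the command from the kubeadm documentation (for the default RSA CA key):

openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex | sed 's/^.* //'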
Re-running kubeadm join --control-plane (k8s-ctrl2)
Run kubeadm join again on k8s-ctrl2.
$ sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \
    --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a \
    --control-plane --certificate-key 0626e7a06e87e81e158ac7d9d1bb1f4f8adbcd29b0ff34659de706ec533cf105
This time the node joined successfully.
wurly@k8s-ctrl2:~$ sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a \ --control-plane --certificate-key 0626e7a06e87e81e158ac7d9d1bb1f4f8adbcd29b0ff34659de706ec533cf105 [preflight] Running pre-flight checks [preflight] Reading configuration from the cluster... [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' [preflight] Running pre-flight checks before initializing the new control plane instance [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action in beforehand using 'kubeadm config images pull' W0507 08:05:48.572380 17476 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image. [download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace [download-certs] Saving the certificates to the folder: "/etc/kubernetes/pki" [certs] Using certificateDir folder "/etc/kubernetes/pki" [certs] Generating "etcd/server" certificate and key [certs] etcd/server serving cert is signed for DNS names [k8s-ctrl2 localhost] and IPs [192.168.1.202 127.0.0.1 ::1] [certs] Generating "etcd/peer" certificate and key [certs] etcd/peer serving cert is signed for DNS names [k8s-ctrl2 localhost] and IPs [192.168.1.202 127.0.0.1 ::1] [certs] Generating "etcd/healthcheck-client" certificate and key [certs] Generating "apiserver-etcd-client" certificate and key [certs] Generating "apiserver" certificate and key [certs] apiserver serving cert is signed for DNS names [k8s-ctrl2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.202 192.168.1.100] [certs] Generating "apiserver-kubelet-client" certificate and key [certs] Generating "front-proxy-client" certificate and key [certs] Valid certificates and keys now exist in "/etc/kubernetes/pki" [certs] Using the existing "sa" key [kubeconfig] Generating kubeconfig files [kubeconfig] Using kubeconfig folder "/etc/kubernetes" [kubeconfig] Writing "admin.conf" kubeconfig file [kubeconfig] Writing "controller-manager.conf" kubeconfig file [kubeconfig] Writing "scheduler.conf" kubeconfig file [control-plane] Using manifest folder "/etc/kubernetes/manifests" [control-plane] Creating static Pod manifest for "kube-apiserver" [control-plane] Creating static Pod manifest for "kube-controller-manager" [control-plane] Creating static Pod manifest for "kube-scheduler" [check-etcd] Checking that the etcd cluster is healthy [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Starting the kubelet [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap... 
[etcd] Announced new etcd member joining to the existing etcd cluster [etcd] Creating static Pod manifest for "etcd" {"level":"warn","ts":"2024-05-07T08:06:02.165095+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:02.275197+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:02.429338+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:02.67635+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:03.043254+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:03.582796+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:04.387198+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:06:05.531451+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x400043f880/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} [etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s The 'update-status' phase is deprecated and will be removed in a future release. 
Currently it performs no operation [mark-control-plane] Marking the node k8s-ctrl2 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] [mark-control-plane] Marking the node k8s-ctrl2 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule] This node has joined the cluster and a new control plane instance was created: * Certificate signing request was sent to apiserver and approval was received. * The Kubelet was informed of the new secure connection details. * Control plane label and taint were applied to the new node. * The Kubernetes control plane instances scaled up. * A new etcd member was added to the local/stacked etcd cluster. To start administering your cluster from this node, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Run 'kubectl get nodes' to see this node join the cluster.
k8s-ctrl2 is added. At first its STATUS is NotReady.
$ k get node
NAME        STATUS     ROLES           AGE   VERSION
k8s-ctrl1   Ready      control-plane   9h    v1.29.4
k8s-ctrl2   NotReady   <none>          2s    v1.29.4
Its STATUS then becomes Ready and the addition is complete.
$ k get node
NAME        STATUS   ROLES           AGE     VERSION
k8s-ctrl1   Ready    control-plane   9h      v1.29.4
k8s-ctrl2   Ready    control-plane   2m33s   v1.29.4
Running kubeadm join --control-plane (k8s-ctrl3)
Next, run kubeadm join on k8s-ctrl3.
Result of kubeadm join --control-plane (k8s-ctrl3)
wurly@k8s-ctrl3:~$ sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a \ --control-plane --certificate-key 0626e7a06e87e81e158ac7d9d1bb1f4f8adbcd29b0ff34659de706ec533cf105 [sudo] password for wurly: [preflight] Running pre-flight checks [preflight] Reading configuration from the cluster... [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' [preflight] Running pre-flight checks before initializing the new control plane instance [preflight] Pulling images required for setting up a Kubernetes cluster [preflight] This might take a minute or two, depending on the speed of your internet connection [preflight] You can also perform this action in beforehand using 'kubeadm config images pull' W0507 08:08:39.636962 17354 checks.go:835] detected that the sandbox image "registry.k8s.io/pause:3.6" of the container runtime is in consistent with that used by kubeadm. It is recommended that using "registry.k8s.io/pause:3.9" as the CRI sandbox image. [download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace [download-certs] Saving the certificates to the folder: "/etc/kubernetes/pki" [certs] Using certificateDir folder "/etc/kubernetes/pki" [certs] Generating "front-proxy-client" certificate and key [certs] Generating "etcd/server" certificate and key [certs] etcd/server serving cert is signed for DNS names [k8s-ctrl3 localhost] and IPs [192.168.1.203 127.0.0.1 ::1] [certs] Generating "etcd/peer" certificate and key [certs] etcd/peer serving cert is signed for DNS names [k8s-ctrl3 localhost] and IPs [192.168.1.203 127.0.0.1 ::1] [certs] Generating "etcd/healthcheck-client" certificate and key [certs] Generating "apiserver-etcd-client" certificate and key [certs] Generating "apiserver" certificate and key [certs] apiserver serving cert is signed for DNS names [k8s-ctrl3 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.203 192.168.1.100] [certs] Generating "apiserver-kubelet-client" certificate and key [certs] Valid certificates and keys now exist in "/etc/kubernetes/pki" [certs] Using the existing "sa" key [kubeconfig] Generating kubeconfig files [kubeconfig] Using kubeconfig folder "/etc/kubernetes" [kubeconfig] Writing "admin.conf" kubeconfig file [kubeconfig] Writing "controller-manager.conf" kubeconfig file [kubeconfig] Writing "scheduler.conf" kubeconfig file [control-plane] Using manifest folder "/etc/kubernetes/manifests" [control-plane] Creating static Pod manifest for "kube-apiserver" [control-plane] Creating static Pod manifest for "kube-controller-manager" [control-plane] Creating static Pod manifest for "kube-scheduler" [check-etcd] Checking that the etcd cluster is healthy [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Starting the kubelet [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap... 
[etcd] Announced new etcd member joining to the existing etcd cluster [etcd] Creating static Pod manifest for "etcd" {"level":"warn","ts":"2024-05-07T08:09:11.367848+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:11.521986+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:11.684702+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:11.922983+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:12.268964+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:12.797185+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:13.575141+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:14.750511+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} {"level":"warn","ts":"2024-05-07T08:09:16.463716+0900","logger":"etcd-client","caller":"v3@v3.5.10/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0x4000299dc0/192.168.1.201:2379","attempt":0,"error":"rpc error: code = FailedPrecondition desc = etcdserver: can only promote a learner member which is in sync with leader"} [etcd] Waiting for the new etcd member to join the cluster. 
This can take up to 40s The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation [mark-control-plane] Marking the node k8s-ctrl3 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] [mark-control-plane] Marking the node k8s-ctrl3 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule] This node has joined the cluster and a new control plane instance was created: * Certificate signing request was sent to apiserver and approval was received. * The Kubelet was informed of the new secure connection details. * Control plane label and taint were applied to the new node. * The Kubernetes control plane instances scaled up. * A new etcd member was added to the local/stacked etcd cluster. To start administering your cluster from this node, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Run 'kubectl get nodes' to see this node join the cluster.
k8s-ctrl3 is added. At first its STATUS is NotReady.
$ k get node
NAME        STATUS     ROLES           AGE     VERSION
k8s-ctrl1   Ready      control-plane   9h      v1.29.4
k8s-ctrl2   Ready      control-plane   3m22s   v1.29.4
k8s-ctrl3   NotReady   control-plane   13s     v1.29.4
Its STATUS then becomes Ready and the addition is complete.
$ k get node
NAME        STATUS   ROLES           AGE     VERSION
k8s-ctrl1   Ready    control-plane   9h      v1.29.4
k8s-ctrl2   Ready    control-plane   4m55s   v1.29.4
k8s-ctrl3   Ready    control-plane   106s    v1.29.4
With that, the control plane is complete.
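At this point the stacked etcd cluster should have three members. As a sanity check, etcdctl can be run inside one of the etcd static pods; this is a sketch that assumes the pod name and the standard kubeadm certificate paths:

kubectl -n kube-system exec etcd-k8s-ctrl1 -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list -w table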
(Reference) Pods after the control plane was completed
The pods ended up in the following state.
One thing that caught my eye is that there is only one calico-kube-controllers pod (it exists only on k8s-ctrl1).
$ k get pod -A NAMESPACE NAME READY STATUS RESTARTS AGE kube-system calico-kube-controllers-5fc7d6cf67-t24hh 1/1 Running 2 (73m ago) 9h kube-system calico-node-dq6xq 1/1 Running 2 (73m ago) 9h kube-system calico-node-gqkvn 1/1 Running 0 12m kube-system calico-node-tzthw 1/1 Running 0 15m kube-system coredns-76f75df574-mzng7 1/1 Running 2 (73m ago) 9h kube-system coredns-76f75df574-t265x 1/1 Running 2 (73m ago) 9h kube-system etcd-k8s-ctrl1 1/1 Running 2 (73m ago) 9h kube-system etcd-k8s-ctrl2 1/1 Running 0 15m kube-system etcd-k8s-ctrl3 1/1 Running 0 12m kube-system kube-apiserver-k8s-ctrl1 1/1 Running 2 (73m ago) 9h kube-system kube-apiserver-k8s-ctrl2 1/1 Running 0 15m kube-system kube-apiserver-k8s-ctrl3 1/1 Running 0 12m kube-system kube-controller-manager-k8s-ctrl1 1/1 Running 3 (73m ago) 9h kube-system kube-controller-manager-k8s-ctrl2 1/1 Running 0 15m kube-system kube-controller-manager-k8s-ctrl3 1/1 Running 0 12m kube-system kube-proxy-ctspc 1/1 Running 0 15m kube-system kube-proxy-d64kt 1/1 Running 2 (73m ago) 9h kube-system kube-proxy-tmqkh 1/1 Running 0 12m kube-system kube-scheduler-k8s-ctrl1 1/1 Running 4 (13m ago) 9h kube-system kube-scheduler-k8s-ctrl2 1/1 Running 0 15m kube-system kube-scheduler-k8s-ctrl3 1/1 Running 0 12m
(Reference) Behavior when k8s-ctrl1 is disconnected from the network
As an experiment, I disconnected k8s-ctrl1 from the network.
After about five minutes, a new calico-kube-controllers pod was created on k8s-ctrl3 and the one on k8s-ctrl1 went into Terminating.
k get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES calico-kube-controllers-5fc7d6cf67-t24hh 1/1 Terminating 2 (108m ago) 10h 172.16.190.25 k8s-ctrl1 <none> <none> calico-kube-controllers-5fc7d6cf67-zhzvh 1/1 Running 0 29s 172.16.35.2 k8s-ctrl3 <none> <none> calico-node-dq6xq 1/1 Running 2 (108m ago) 9h 192.168.1.201 k8s-ctrl1 <none> <none> calico-node-gqkvn 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none> calico-node-tzthw 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> coredns-76f75df574-dc4fc 1/1 Running 0 29s 172.16.35.1 k8s-ctrl3 <none> <none> coredns-76f75df574-jsbp4 1/1 Running 0 29s 172.16.164.1 k8s-ctrl2 <none> <none> coredns-76f75df574-mzng7 1/1 Terminating 2 (108m ago) 10h 172.16.190.23 k8s-ctrl1 <none> <none> coredns-76f75df574-t265x 1/1 Terminating 2 (108m ago) 10h 172.16.190.24 k8s-ctrl1 <none> <none> etcd-k8s-ctrl1 1/1 Running 2 (108m ago) 10h 192.168.1.201 k8s-ctrl1 <none> <none> etcd-k8s-ctrl2 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> etcd-k8s-ctrl3 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none> kube-apiserver-k8s-ctrl1 1/1 Running 2 (108m ago) 10h 192.168.1.201 k8s-ctrl1 <none> <none> kube-apiserver-k8s-ctrl2 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> kube-apiserver-k8s-ctrl3 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none> kube-controller-manager-k8s-ctrl1 1/1 Running 3 (108m ago) 10h 192.168.1.201 k8s-ctrl1 <none> <none> kube-controller-manager-k8s-ctrl2 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> kube-controller-manager-k8s-ctrl3 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none> kube-proxy-ctspc 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> kube-proxy-d64kt 1/1 Running 2 (108m ago) 10h 192.168.1.201 k8s-ctrl1 <none> <none> kube-proxy-tmqkh 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none> kube-scheduler-k8s-ctrl1 1/1 Running 4 (48m ago) 10h 192.168.1.201 k8s-ctrl1 <none> <none> kube-scheduler-k8s-ctrl2 1/1 Running 0 50m 192.168.1.202 k8s-ctrl2 <none> <none> kube-scheduler-k8s-ctrl3 1/1 Running 1 (7m14s ago) 47m 192.168.1.203 k8s-ctrl3 <none> <none>
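The roughly five-minute delay matches the default taint-based eviction behaviour: pods are given a node.kubernetes.io/unreachable toleration with tolerationSeconds: 300, so controllers wait about 300 seconds before replacing pods on an unreachable node. One way to see this toleration (assuming the pod keeps the k8s-app label from the Calico manifest):

kubectl -n kube-system get pod -l k8s-app=calico-kube-controllers \
  -o jsonpath='{.items[0].spec.tolerations}'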
kubeadm join (adding the worker node)
(Reference) State before running kubeadm join
Before running kubeadm join, check the pods. Below is the state with only the control plane nodes present.
$ k get pod NAME READY STATUS RESTARTS AGE calico-kube-controllers-5fc7d6cf67-zhzvh 1/1 Running 1 (5m34s ago) 4d2h calico-node-dq6xq 1/1 Running 3 (5m26s ago) 4d12h calico-node-gqkvn 1/1 Running 2 (5m34s ago) 4d2h calico-node-tzthw 1/1 Running 1 (5m34s ago) 4d2h coredns-76f75df574-dc4fc 1/1 Running 1 (5m34s ago) 4d2h coredns-76f75df574-jsbp4 1/1 Running 1 (5m34s ago) 4d2h etcd-k8s-ctrl1 1/1 Running 3 (5m26s ago) 4d12h etcd-k8s-ctrl2 1/1 Running 1 (5m34s ago) 4d2h etcd-k8s-ctrl3 1/1 Running 2 (5m34s ago) 4d2h kube-apiserver-k8s-ctrl1 1/1 Running 3 (5m26s ago) 4d12h kube-apiserver-k8s-ctrl2 1/1 Running 1 (5m34s ago) 4d2h kube-apiserver-k8s-ctrl3 1/1 Running 2 (5m35s ago) 4d2h kube-controller-manager-k8s-ctrl1 1/1 Running 4 (5m26s ago) 4d12h kube-controller-manager-k8s-ctrl2 1/1 Running 1 (5m34s ago) 4d2h kube-controller-manager-k8s-ctrl3 1/1 Running 2 (5m34s ago) 4d2h kube-proxy-ctspc 1/1 Running 1 (5m34s ago) 4d2h kube-proxy-d64kt 1/1 Running 3 (5m26s ago) 4d12h kube-proxy-tmqkh 1/1 Running 2 (5m34s ago) 4d2h kube-scheduler-k8s-ctrl1 1/1 Running 5 (5m26s ago) 4d12h kube-scheduler-k8s-ctrl2 1/1 Running 1 (5m34s ago) 4d2h kube-scheduler-k8s-ctrl3 1/1 Running 2 (5m34s ago) 4d2h $ k get nodes NAME STATUS ROLES AGE VERSION k8s-ctrl1 Ready control-plane 4d12h v1.29.4 k8s-ctrl2 Ready control-plane 4d2h v1.29.4 k8s-ctrl3 Ready control-plane 4d2h v1.29.4
Running kubeadm join
Run kubeadm join.
sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \
--discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a --v=5
Result of kubeadm join
When I ran it, it could not get past the following message.
wurly@k8s-worker1:~$ sudo kubeadm join 192.168.1.100:6443 --token hdie35.u9airq6ychkt8amq \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6 cfc18a --v=5 [sudo] password for wurly: I0511 02:25:53.304043 3224 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName I0511 02:25:53.306686 3224 initconfiguration.go:122] detected and using CRI socket: unix:///var/run/containerd/containerd.sock [preflight] Running pre-flight checks I0511 02:25:53.306741 3224 preflight.go:93] [preflight] Running general checks I0511 02:25:53.306785 3224 checks.go:280] validating the existence of file /etc/kubernetes/kubelet.conf I0511 02:25:53.306794 3224 checks.go:280] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf I0511 02:25:53.306802 3224 checks.go:104] validating the container runtime I0511 02:25:53.326536 3224 checks.go:639] validating whether swap is enabled or not I0511 02:25:53.326595 3224 checks.go:370] validating the presence of executable crictl I0511 02:25:53.326622 3224 checks.go:370] validating the presence of executable conntrack I0511 02:25:53.326640 3224 checks.go:370] validating the presence of executable ip I0511 02:25:53.326660 3224 checks.go:370] validating the presence of executable iptables I0511 02:25:53.326682 3224 checks.go:370] validating the presence of executable mount I0511 02:25:53.326703 3224 checks.go:370] validating the presence of executable nsenter I0511 02:25:53.326723 3224 checks.go:370] validating the presence of executable ebtables I0511 02:25:53.326744 3224 checks.go:370] validating the presence of executable ethtool I0511 02:25:53.326761 3224 checks.go:370] validating the presence of executable socat I0511 02:25:53.326781 3224 checks.go:370] validating the presence of executable tc I0511 02:25:53.326795 3224 checks.go:370] validating the presence of executable touch I0511 02:25:53.326811 3224 checks.go:516] running all checks I0511 02:25:53.336882 3224 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost I0511 02:25:53.337071 3224 checks.go:605] validating kubelet version I0511 02:25:53.382237 3224 checks.go:130] validating if the "kubelet" service is enabled and active I0511 02:25:53.390207 3224 checks.go:203] validating availability of port 10250 I0511 02:25:53.390336 3224 checks.go:280] validating the existence of file /etc/kubernetes/pki/ca.crt I0511 02:25:53.390346 3224 checks.go:430] validating if the connectivity type is via proxy or direct I0511 02:25:53.390367 3224 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables I0511 02:25:53.390389 3224 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward I0511 02:25:53.390405 3224 join.go:532] [preflight] Discovering cluster-info I0511 02:25:53.390417 3224 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "192.168.1.100:6443" I0511 02:25:53.416226 3224 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "hdie35", will try again I0511 02:25:58.596077 3224 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "hdie35", will try again (中略) I0511 02:28:58.158711 3224 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "hdie35", will try again I0511 02:29:03.706321 3224 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS 
signature for token ID "hdie35", will try again
Token expiration
The following message appears to point at the problem.
The cluster-info ConfigMap does not yet contain a JWS signature for token ID "hdie35", will try again
According to the above, the bootstrap token has expired. In my case the cluster build was spread over several days, which is how I got into this situation. (If you do everything in one go, this should not happen.) So I recreate the token.
Regenerating the token
wurly@k8s-ctrl1:~$ kubeadm token list
wurly@k8s-ctrl1:~$ sudo kubeadm token list
[sudo] password for wurly:
wurly@k8s-ctrl1:~$ sudo kubeadm token create
d8gufx.xhgtb45qps7h7yhv
wurly@k8s-ctrl1:~$ sudo kubeadm token list
TOKEN                     TTL   EXPIRES                USAGES                   DESCRIPTION   EXTRA GROUPS
d8gufx.xhgtb45qps7h7yhv   23h   2024-05-12T02:29:57Z   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
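Incidentally, kubeadm can also print a ready-to-use worker join command together with a fresh token, which saves reassembling the hash by hand (for a control plane join you would still append --control-plane --certificate-key <key>):

sudo kubeadm token create --print-join-command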
Re-running kubeadm join
Re-run the command with the new token.
sudo kubeadm join 192.168.1.100:6443 --token d8gufx.xhgtb45qps7h7yhv \
--discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a --v=5
Result of re-running kubeadm join
As shown below, the node joined the cluster successfully this time.
wurly@k8s-worker1:~$ sudo kubeadm join 192.168.1.100:6443 --token d8gufx.xhgtb45qps7h7yhv \ --discovery-token-ca-cert-hash sha256:cd58094931470815be7e0b791357ce4ca6907cb861858915e17752baa6cfc18a --v=5 I0511 02:30:54.365840 3626 join.go:413] [preflight] found NodeName empty; using OS hostname as NodeName I0511 02:30:54.365949 3626 initconfiguration.go:122] detected and using CRI socket: unix:///var/run/containerd/containerd.sock [preflight] Running pre-flight checks I0511 02:30:54.365999 3626 preflight.go:93] [preflight] Running general checks I0511 02:30:54.366035 3626 checks.go:280] validating the existence of file /etc/kubernetes/kubelet.conf I0511 02:30:54.366048 3626 checks.go:280] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf I0511 02:30:54.366057 3626 checks.go:104] validating the container runtime I0511 02:30:54.382819 3626 checks.go:639] validating whether swap is enabled or not I0511 02:30:54.382869 3626 checks.go:370] validating the presence of executable crictl I0511 02:30:54.382893 3626 checks.go:370] validating the presence of executable conntrack I0511 02:30:54.382909 3626 checks.go:370] validating the presence of executable ip I0511 02:30:54.382927 3626 checks.go:370] validating the presence of executable iptables I0511 02:30:54.382945 3626 checks.go:370] validating the presence of executable mount I0511 02:30:54.382962 3626 checks.go:370] validating the presence of executable nsenter I0511 02:30:54.382980 3626 checks.go:370] validating the presence of executable ebtables I0511 02:30:54.383006 3626 checks.go:370] validating the presence of executable ethtool I0511 02:30:54.383021 3626 checks.go:370] validating the presence of executable socat I0511 02:30:54.383036 3626 checks.go:370] validating the presence of executable tc I0511 02:30:54.383054 3626 checks.go:370] validating the presence of executable touch I0511 02:30:54.383077 3626 checks.go:516] running all checks I0511 02:30:54.394226 3626 checks.go:401] checking whether the given node name is valid and reachable using net.LookupHost I0511 02:30:54.394386 3626 checks.go:605] validating kubelet version I0511 02:30:54.429461 3626 checks.go:130] validating if the "kubelet" service is enabled and active I0511 02:30:54.436446 3626 checks.go:203] validating availability of port 10250 I0511 02:30:54.436570 3626 checks.go:280] validating the existence of file /etc/kubernetes/pki/ca.crt I0511 02:30:54.436582 3626 checks.go:430] validating if the connectivity type is via proxy or direct I0511 02:30:54.436609 3626 checks.go:329] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables I0511 02:30:54.436642 3626 checks.go:329] validating the contents of file /proc/sys/net/ipv4/ip_forward I0511 02:30:54.436666 3626 join.go:532] [preflight] Discovering cluster-info I0511 02:30:54.436685 3626 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "192.168.1.100:6443" I0511 02:30:54.466412 3626 token.go:118] [discovery] Requesting info from "192.168.1.100:6443" again to validate TLS against the pinned public key I0511 02:30:54.498645 3626 token.go:135] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.1.100:6443" I0511 02:30:54.498675 3626 discovery.go:52] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process I0511 02:30:54.498692 3626 join.go:546] [preflight] Fetching init configuration I0511 02:30:54.498702 3626 join.go:592] 
[preflight] Retrieving KubeConfig objects [preflight] Reading configuration from the cluster... [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml' I0511 02:30:54.545089 3626 kubeproxy.go:55] attempting to download the KubeProxyConfiguration from ConfigMap "kube-proxy" I0511 02:30:54.563297 3626 kubelet.go:74] attempting to download the KubeletConfiguration from ConfigMap "kubelet-config" I0511 02:30:54.579250 3626 initconfiguration.go:114] skip CRI socket detection, fill with the default CRI socket unix:///var/run/containerd/containerd.sock I0511 02:30:54.580713 3626 interface.go:432] Looking for default routes with IPv4 addresses I0511 02:30:54.580750 3626 interface.go:437] Default route transits interface "eno0" I0511 02:30:54.581417 3626 interface.go:209] Interface eno0 is up I0511 02:30:54.581560 3626 interface.go:257] Interface "eno0" has 3 addresses :[192.168.1.211/24 240d:1a:1d8:a600:b6a9:fcff:fe21:8100/64 fe80::b6a9:fcff:fe21:8100/64]. I0511 02:30:54.581610 3626 interface.go:224] Checking addr 192.168.1.211/24. I0511 02:30:54.581636 3626 interface.go:231] IP found 192.168.1.211 I0511 02:30:54.581681 3626 interface.go:263] Found valid IPv4 address 192.168.1.211 for interface "eno0". I0511 02:30:54.581732 3626 interface.go:443] Found active IP 192.168.1.211 I0511 02:30:54.590899 3626 preflight.go:104] [preflight] Running configuration dependant checks I0511 02:30:54.590937 3626 controlplaneprepare.go:225] [download-certs] Skipping certs download I0511 02:30:54.590973 3626 kubelet.go:121] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf I0511 02:30:54.593788 3626 kubelet.go:136] [kubelet-start] writing CA certificate at /etc/kubernetes/pki/ca.crt I0511 02:30:54.595514 3626 kubelet.go:157] [kubelet-start] Checking for an existing Node in the cluster with name "k8s-worker1" and status "Ready" I0511 02:30:54.608006 3626 kubelet.go:172] [kubelet-start] Stopping the kubelet [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" [kubelet-start] Starting the kubelet [kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap... I0511 02:30:55.941393 3626 cert_rotation.go:137] Starting client certificate rotation controller I0511 02:30:55.943223 3626 kubelet.go:220] [kubelet-start] preserving the crisocket information for the node I0511 02:30:55.943289 3626 patchnode.go:31] [patchnode] Uploading the CRI Socket information "unix:///var/run/containerd/containerd.sock" to the Node API object "k8s-worker1" as an annotation This node has joined the cluster: * Certificate signing request was sent to apiserver and a response was received. * The Kubelet was informed of the new secure connection details. Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
(Reference) Pods after the worker joined the cluster
The worker has joined the cluster; at first it is NotReady.
$ k get nodes
NAME          STATUS     ROLES           AGE     VERSION
k8s-ctrl1     Ready      control-plane   4d13h   v1.29.4
k8s-ctrl2     Ready      control-plane   4d3h    v1.29.4
k8s-ctrl3     Ready      control-plane   4d3h    v1.29.4
k8s-worker1   NotReady   <none>          37s     v1.29.4
It then becomes Ready.
$ k get nodes
NAME          STATUS   ROLES           AGE     VERSION
k8s-ctrl1     Ready    control-plane   4d13h   v1.29.4
k8s-ctrl2     Ready    control-plane   4d3h    v1.29.4
k8s-ctrl3     Ready    control-plane   4d3h    v1.29.4
k8s-worker1   Ready    <none>          93s     v1.29.4
Here is the pod status.
You can see that calico and kube-proxy are now also running on the worker.
$ k get pod -o wide NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES calico-kube-controllers-5fc7d6cf67-zhzvh 1/1 Running 1 (34m ago) 4d2h 172.16.35.3 k8s-ctrl3 <none> <none> calico-node-dq6xq 1/1 Running 3 (33m ago) 4d12h 192.168.1.201 k8s-ctrl1 <none> <none> calico-node-gqkvn 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none> calico-node-qv8g9 1/1 Running 0 114s 192.168.1.211 k8s-worker1 <none> <none> calico-node-tzthw 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> coredns-76f75df574-dc4fc 1/1 Running 1 (34m ago) 4d2h 172.16.35.4 k8s-ctrl3 <none> <none> coredns-76f75df574-jsbp4 1/1 Running 1 (34m ago) 4d2h 172.16.164.2 k8s-ctrl2 <none> <none> etcd-k8s-ctrl1 1/1 Running 3 (33m ago) 4d13h 192.168.1.201 k8s-ctrl1 <none> <none> etcd-k8s-ctrl2 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> etcd-k8s-ctrl3 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none> kube-apiserver-k8s-ctrl1 1/1 Running 3 (33m ago) 4d13h 192.168.1.201 k8s-ctrl1 <none> <none> kube-apiserver-k8s-ctrl2 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> kube-apiserver-k8s-ctrl3 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none> kube-controller-manager-k8s-ctrl1 1/1 Running 4 (33m ago) 4d13h 192.168.1.201 k8s-ctrl1 <none> <none> kube-controller-manager-k8s-ctrl2 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> kube-controller-manager-k8s-ctrl3 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none> kube-proxy-ctspc 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> kube-proxy-d64kt 1/1 Running 3 (33m ago) 4d13h 192.168.1.201 k8s-ctrl1 <none> <none> kube-proxy-h4zjz 1/1 Running 0 114s 192.168.1.211 k8s-worker1 <none> <none> kube-proxy-tmqkh 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none> kube-scheduler-k8s-ctrl1 1/1 Running 5 (33m ago) 4d13h 192.168.1.201 k8s-ctrl1 <none> <none> kube-scheduler-k8s-ctrl2 1/1 Running 1 (34m ago) 4d3h 192.168.1.202 k8s-ctrl2 <none> <none> kube-scheduler-k8s-ctrl3 1/1 Running 2 (34m ago) 4d3h 192.168.1.203 k8s-ctrl3 <none> <none>
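As a quick scheduling test, a plain Deployment should land on k8s-worker1, since the control plane nodes carry the NoSchedule taint. This is just a sketch; nginx-test is an arbitrary name:

kubectl create deployment nginx-test --image=nginx
kubectl get pod -l app=nginx-test -o wide      # NODE should be k8s-worker1
kubectl delete deployment nginx-test           # clean up

# optional: give the worker a role label so ROLES no longer shows <none>
kubectl label node k8s-worker1 node-role.kubernetes.io/worker=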
Closing
With three Raspberry Pis, one PC, and one small router running OpenWrt, I was able to build an HA (High Availability) home Kubernetes cluster.
Even within "HA" there are many possible configurations, and it took a fair amount of time to understand things such as load balancers and to prepare the hardware, but working through all of it has deepened my understanding of Kubernetes.
I also feel this documents a reasonable catch-up path for home Kubernetes: first a cluster built from three Raspberry Pis, then this cluster built from three Raspberry Pis plus one PC and a small router.
References
- Creating Highly Available Clusters with kubeadm | Kubernetes
- Creating Highly Available Clusters with kubeadm – Google Search
- Setting up Kubernetes High Availability Cluster – Building and testing a multiple masters Part II – Unitas Global
- Setting up Kubernetes High Availability Cluster – Building and testing a multiple masters Part I – Unitas Global
- 多機能プロクシサーバー「HAProxy」のさまざまな設定例 | さくらのナレッジ
- Managed Kubernetesサービス開発者の自宅k8sクラスタ全容