You see a whole bunch of pods stuck in a creating loop:
★ kubectl get pods -n my-groovy-namespace
NAME                  READY     STATUS              RESTARTS   AGE
my-groovy-pod-tp2h9   0/1       ContainerCreating   0          9m
my-groovy-pod-xjhcc   1/1       Running             0          9m
my-groovy-pod-8d5b9   0/1       ContainerCreating   0          32s
my-groovy-pod-vfksp   1/1       Running             0          9m
my-groovy-pod-55szr   0/1       ContainerCreating   0          9m
my-groovy-pod-fmjnx   1/1       Running             0          9m
my-groovy-pod-5wkrq   0/1       ContainerCreating   0          52s
my-groovy-pod-rsghq   1/1       Running             0          9m
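With lots of pods in the namespace it helps to filter the listing down to just the stuck ones. A minimal sketch, using sample rows that mirror the output above (on a live cluster you would pipe the `kubectl get pods` command itself instead of the sample variable):

```shell
# Sample rows standing in for live `kubectl get pods` output
sample='my-groovy-pod-tp2h9   0/1   ContainerCreating   0   9m
my-groovy-pod-xjhcc   1/1   Running             0   9m
my-groovy-pod-8d5b9   0/1   ContainerCreating   0   32s
my-groovy-pod-5wkrq   0/1   ContainerCreating   0   52s'

# Keep only the pods whose STATUS column is ContainerCreating, then count them
stuck=$(echo "$sample" | awk '$3 == "ContainerCreating" {print $1}')
echo "$stuck"
echo "$stuck" | wc -l   # -> 3
```

The same `awk` filter works unchanged when fed real `kubectl get pods -n my-groovy-namespace` output, since the STATUS value is always the third whitespace-separated column.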
You describe one of them and see a series of messages like the following:
★ kubectl describe pod -n my-groovy-namespace my-groovy-pod-tp2h9
.
.
.
Events:
  Type     Reason                  Age                From                            Message
  ----     ------                  ----               ----                            -------
  Normal   Scheduled               38s                default-scheduler               Successfully assigned my-groovy-pod-tp2h9 to k8s-qapool-27652675-4
  Normal   SuccessfulMountVolume   37s                kubelet, k8s-qapool-27652675-4  MountVolume.SetUp succeeded for volume "default-token-zbpr5"
  Warning  FailedCreatePodSandBox  12s (x8 over 36s)  kubelet, k8s-qapool-27652675-4  Failed create pod sandbox.
  Warning  FailedSync              12s (x8 over 36s)  kubelet, k8s-qapool-27652675-4  Error syncing pod
  Normal   SandboxChanged          10s (x8 over 36s)  kubelet, k8s-qapool-27652675-4  Pod sandbox changed, it will be killed and re-created.
In the dashboard you see lots of red, with messages saying:
Failed create pod sandbox. Error syncing pod
You notice that all the failing pods are scheduled on the same node:
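One way to spot this is the NODE column that `kubectl get pods -o wide` adds. A sketch, again with sample rows standing in for live output (note the extra IP and NODE columns):

```shell
# Sample `kubectl get pods -o wide` rows; on a cluster, pipe the real command
wide='my-groovy-pod-tp2h9   0/1   ContainerCreating   0   9m    <none>       k8s-qapool-27652675-4
my-groovy-pod-xjhcc   1/1   Running             0   9m    10.244.1.7   k8s-qapool-27652675-2
my-groovy-pod-8d5b9   0/1   ContainerCreating   0   32s   <none>       k8s-qapool-27652675-4'

# Unique nodes hosting the stuck pods -- a single line of output means one bad node
nodes=$(echo "$wide" | awk '$3 == "ContainerCreating" {print $NF}' | sort -u)
echo "$nodes"   # -> k8s-qapool-27652675-4
```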
★ ssh k8s-qapool-27652675-4
★ sudo su -
★ ps -ef | grep [d]ocker
root      7042      1  2 Mar19 ?        2-04:02:06 dockerd -H fd:// --storage-driver=overlay2 --bip=172.17.0.1/16
root      7746      1  0 Mar19 ?        00:28:17 docker-containerd-shim 9f07a121fb66d61285fb246b5157cb640f7df902965421603c0ad66800a00637 /var/run/docker/libcontainerd/9f07a121fb66d61285fb246b5157cb640f7df902965421603c0ad66800a00637 docker-runc
root      8157      1  0 Mar19 ?        00:00:05 docker-containerd-shim 9258d496ab97ce1e0736bc0521cf4146045fdfb30b835675433622e5c7ed7c67 /var/run/docker/libcontainerd/9258d496ab97ce1e0736bc0521cf4146045fdfb30b835675433622e5c7ed7c67 docker-runc
root      8208      1  0 Mar19 ?        00:00:16 docker-containerd-shim 0ae4b84fdf301965f047a9aa6275be1dfa1daa03f492bf5bd29c4020d9ce165a /var/run/docker/libcontainerd/0ae4b84fdf301965f047a9aa6275be1dfa1daa03f492bf5bd29c4020d9ce165a docker-runc
root     12741      1  0 Jun28 ?        00:00:00 docker-containerd-shim 858aeacbe1ee80724423aa24e14bbd0f984e3f98e6c1ba13958a48bd72f73628 /var/run/docker/libcontainerd/858aeacbe1ee80724423aa24e14bbd0f984e3f98e6c1ba13958a48bd72f73628 docker-runc
root     34870      1  0 Jun28 ?        00:00:00 docker-containerd-shim b005c6f1d189e77c48ac9573631689515389eb634666f63ca18835e0cda9be09 /var/run/docker/libcontainerd/b005c6f1d189e77c48ac9573631689515389eb634666f63ca18835e0cda9be09 docker-runc
root     41490      1  0 Jun30 ?        00:00:00 docker-containerd-shim bff891b51748ab7c963b5585669e5d79943e731960f13e4943f8fb1f7dc04822 /var/run/docker/libcontainerd/bff891b51748ab7c963b5585669e5d79943e731960f13e4943f8fb1f7dc04822 docker-runc
root     41562      1  0 May09 ?        00:00:02 docker-containerd-shim 13f6fad9ed621b5de2675db686be4a7b693c2cfe2446620860e601b94d131465 /var/run/docker/libcontainerd/13f6fad9ed621b5de2675db686be4a7b693c2cfe2446620860e601b94d131465 docker-runc
root     41613      1  0 May09 ?        00:00:02 docker-containerd-shim ae0f4e408bfee941c155832ed2cba934884dd770fac1cd0f0791eaca75416a1a /var/run/docker/libcontainerd/ae0f4e408bfee941c155832ed2cba934884dd770fac1cd0f0791eaca75416a1a docker-runc
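A wedged Docker daemon often still shows up in `ps` and accepts connections, but never answers API calls. One common check, sketched here, is to wrap `docker ps` in a `timeout` so a hang turns into a clear failure instead of a frozen terminal:

```shell
# If the daemon is healthy, `docker ps` returns quickly; if it is wedged,
# `timeout` kills the call after 10 seconds and the || branch fires.
status=$(timeout 10 docker ps > /dev/null 2>&1 && echo responsive || echo wedged-or-down)
echo "docker daemon: $status"
```

On the box in this story, `docker ps` hanging (or erroring) is what distinguishes a stuck daemon from one that is merely busy.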
You restart the Docker daemon running on that node:
★ service docker restart
You then delete all the faulty pods and watch them being reprovisioned successfully against the now-working Docker daemon:
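The delete step itself isn't shown above; one way to script it is to feed the filtered pod names into `kubectl delete pod` via `xargs`. A sketch using sample rows (the `echo` in front of `kubectl` makes this a dry run; remove it to actually delete):

```shell
# Sample rows standing in for live `kubectl get pods` output
sample='my-groovy-pod-tp2h9 0/1 ContainerCreating 0 9m
my-groovy-pod-xjhcc 1/1 Running 0 9m
my-groovy-pod-8d5b9 0/1 ContainerCreating 0 32s'

# Build one delete command per stuck pod (dry run via the leading echo)
cmds=$(echo "$sample" | awk '$3 == "ContainerCreating" {print $1}' \
  | xargs -I{} echo kubectl delete pod -n my-groovy-namespace {})
echo "$cmds"
```

Because the controller behind these pods (a Deployment/ReplicaSet, judging by the generated name suffixes) owns them, deleting a pod is safe: replacements are scheduled automatically, which is exactly what the next listing shows.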
★ kubectl get pods -n my-groovy-namespace
NAME                  READY     STATUS    RESTARTS   AGE
my-groovy-pod-75lh7   1/1       Running   0          1m
my-groovy-pod-xjhcc   1/1       Running   0          30m
my-groovy-pod-xxxp5   1/1       Running   0          1m
my-groovy-pod-vfksp   1/1       Running   0          30m
my-groovy-pod-pzjdh   1/1       Running   0          1m
my-groovy-pod-fmjnx   1/1       Running   0          30m
my-groovy-pod-sp6sg   1/1       Running   0          1m
my-groovy-pod-rsghq   1/1       Running   0          30m
Et voilà.