How to Safely Drain a Node in Kubernetes

In this Kubernetes tutorial, you will learn how to properly drain a node with the kubectl drain command to prepare it for maintenance.

It is as simple as entering this command:

kubectl drain node_name

You can get the node details with the kubectl get nodes command.
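
For example, here is a minimal sketch of the whole flow, assuming a worker node named worker-1 (a placeholder name):

kubectl get nodes                  # find the name of the node you want to drain
kubectl drain worker-1             # evict its pods and mark it unschedulable
# ...perform the maintenance on the node...
kubectl uncordon worker-1          # allow scheduling on the node again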

However, there is more to draining nodes in Kubernetes, so let's take a detailed look at it.

Why do you need to drain nodes?

Kubernetes is designed to be fault tolerant of worker node failures.

A worker node can become unusable for various reasons: a hardware fault, a cloud provider problem, or network issues between the worker and the master node. In those cases, the Kubernetes master handles the failure and reschedules the workloads on healthy nodes.

However, you cannot always rely on that automatic handling, especially for planned maintenance. This is when you need to drain the node and remove all of its pods yourself.

Draining is the process of safely evicting all the pods from a node, so that the containers running in those pods terminate gracefully.
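
During eviction, the shutdown time each pod gets comes from its own spec. As a quick, optional check (with <pod_name> as a placeholder), you can read that value:

# Each pod defines how long its containers get to shut down cleanly;
# kubectl drain respects this value when evicting the pod.
kubectl get pod <pod_name> -o jsonpath='{.spec.terminationGracePeriodSeconds}'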

How to properly drain nodes in Kubernetes

Let’s start with the practical demonstration.

Step 1: Mark the node as unschedulable (cordon)

To perform maintenance on a node, you should first mark it as unschedulable (cordon) and then drain it.

First have a look at the currently running nodes:

root@kmaster-rj:~# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
kmaster-rj    Ready    master   44d   v1.18.8
kworker-rj1   Ready    <none>   44d   v1.18.8
kworker-rj2   Ready    <none>   44d   v1.18.8
root@kmaster-rj:~#

Look at the pods running on different nodes:

root@kmaster-rj:~# kubectl get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP              NODE          NOMINATED NODE   READINESS GATES
my-dep-557548758d-gprnr   1/1     Running   1          4d23h   172.16.213.48   kworker-rj1   <none>           <none>
my-dep-557548758d-d2pmd   1/1     Running   1          4d15h   172.16.213.57   kworker-rj2   <none>           <none>
pod-delete-demo           1/1     Running   1          2d      172.16.213.56   kworker-rj1   <none>           <none>
root@kmaster-rj:~#

Now mark the node as unschedulable by running the following command:

root@kmaster-rj:~# kubectl cordon kworker-rj2
node/kworker-rj2 cordoned
root@kmaster-rj:~# 

List the nodes again:

root@kmaster-rj:~# kubectl get nodes
NAME          STATUS                     ROLES    AGE   VERSION
kmaster-rj    Ready                      master   44d   v1.18.8
kworker-rj1   Ready                      <none>   44d   v1.18.8
kworker-rj2   Ready,SchedulingDisabled   <none>   44d   v1.18.8
root@kmaster-rj:~#

Notice that the node kworker-rj2 is now marked as SchedulingDisabled.

Cordoning alone does not evict the pods already running on that node. Verify the pod status:

root@kmaster-rj:~# kubectl get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP              NODE          NOMINATED NODE   READINESS GATES
my-dep-557548758d-gprnr   1/1     Running   1          4d23h   172.16.213.48   kworker-rj1   <none>           <none>
my-dep-557548758d-d2pmd   1/1     Running   1          4d15h   172.16.213.57   kworker-rj2   <none>           <none>
pod-delete-demo           1/1     Running   1          2d      172.16.213.56   kworker-rj1   <none>           <none>
root@kmaster-rj:~#

You can see that the pod “my-dep-557548758d-d2pmd” is still running on the kworker-rj2 node.

Step 2: Drain the node to prepare for maintenance

Now drain the node to remove the running pods and prepare it for maintenance:

root@kmaster-rj:~# kubectl drain kworker-rj2 --grace-period=300 --ignore-daemonsets=true
node/kworker-rj2 already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-fl8dl, kube-system/kube-proxy-95vdf
evicting pod default/my-dep-557548758d-d2pmd
pod/my-dep-557548758d-d2pmd evicted
node/kworker-rj2 evicted
root@kmaster-rj:~#

NOTE: kubectl drain cannot delete pods that are not managed by a ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet. You need to use --force to override that, and by doing so those standalone pods are deleted permanently.
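
For example, a forced drain might look like this (just a sketch; note that --delete-local-data has been renamed to --delete-emptydir-data in newer kubectl versions):

# Forcibly drain the node: pods not managed by a controller are deleted permanently,
# and pods using emptyDir volumes lose that data.
kubectl drain kworker-rj2 --ignore-daemonsets --force --delete-local-data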

Now look at the pods:

root@kmaster-rj:~# kubectl get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE     IP              NODE          NOMINATED NODE   READINESS GATES
my-dep-557548758d-gprnr   1/1     Running   1          4d23h   172.16.213.48   kworker-rj1   <none>           <none>
my-dep-557548758d-dsanh   1/1     Running   0          27s     172.16.213.38   kworker-rj1   <none>           <none>
pod-delete-demo           1/1     Running   1          2d      172.16.213.56   kworker-rj1   <none>           <none>
root@kmaster-rj:~#

The pod that was running on the kworker-rj2 node was evicted from there and recreated as a new pod on the kworker-rj1 node.

The node status remains the same:

root@kmaster-rj:~# kubectl get nodes
NAME          STATUS                     ROLES    AGE   VERSION
kmaster-rj    Ready                      master   44d   v1.18.8
kworker-rj1   Ready                      <none>   44d   v1.18.8
kworker-rj2   Ready,SchedulingDisabled   <none>   44d   v1.18.8
root@kmaster-rj:~#
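
If you want to double-check that only DaemonSet-managed pods are left on the drained node, a field selector on spec.nodeName works as an optional sanity check:

# List every pod still scheduled on kworker-rj2 across all namespaces;
# after a successful drain, only DaemonSet pods (e.g. calico, kube-proxy) should remain.
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=kworker-rj2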

Step 3: Uncordon the node after maintenance completes

After the maintenance is complete, run the following command to tell Kubernetes that it can resume scheduling new pods onto the node:

root@kmaster-rj:~# kubectl uncordon kworker-rj2
node/kworker-rj2 uncordoned

Verify the node status:

root@kmaster-rj:~# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
kmaster-rj    Ready    master   44d   v1.18.8
kworker-rj1   Ready    <none>   44d   v1.18.8
kworker-rj2   Ready    <none>   44d   v1.18.8

Node kworker-rj2 is now ready to handle new workloads again.
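
Keep in mind that uncordoning does not move the evicted pods back; the node simply becomes eligible for new pods again. If you want an extra confirmation that scheduling is re-enabled, this optional check works:

# Prints nothing (or "false") once the node is schedulable again;
# while cordoned, it prints "true".
kubectl get node kworker-rj2 -o jsonpath='{.spec.unschedulable}'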

I hope you liked this quick tip about draining nodes in Kubernetes.
