Kubernetes Pod Disruption Budget: A Practical Guide


Keeping a Kubernetes cluster consistently available and resilient to faults is a real challenge. Running multiple replicas helps ensure several instances of an application exist, but it doesn't guarantee uninterrupted application runtime.

This is where the Pod Disruption Budget (PDB) becomes crucial. PDB, a feature within Kubernetes, contributes to sustaining application stability by establishing regulations on the acceptable number of disruptions an application can handle.

This article delves into the specifics of PDB, exploring its definition, how to create one, optimal use cases, and the underlying reasons for its importance.

What is a “Pod Disruption”?

Pod disruption refers to the occurrence when a pod is deliberately removed or evicted from a node. This can occur for various reasons, including:

  1. Node maintenance (such as OS upgrades or hardware upgrades).
  2. Kubernetes cluster upgrades.
  3. Autoscaling down.
  4. Pod rescheduling due to node resource constraints.

In Kubernetes, there are two categories of disruptions:

  • Voluntary disruptions: These are disruptions that can be controlled and scheduled. They are expected to adhere to the Pod Disruption Budget (PDB) that you have defined.
  • Involuntary disruptions: These are unforeseen disruptions that cannot be predicted or controlled, such as hardware failures on a node or a kernel panic. It’s important to note that these types of disruptions do not adhere to the constraints set by PDB.

What is a “Pod Disruption Budget”?

Now that we’ve covered what Pod disruption is, let’s delve into the tool designed to help us manage it. In simple terms, a Pod Disruption Budget, or PDB, enables you to control the number of replicas that should be accessible at any given moment. When configuring a PDB for an application, you specify either:

  1. The minimum number of replicas that must remain available at all times (referred to as min available).
  2. The maximum number of replicas that can be unavailable at any given time (referred to as max unavailable).

In practical terms, if your application has 5 replicas and you set a PDB with a minimum available requirement of 2 replicas, voluntary disruptions are permitted as long as at least 2 replicas remain operational.

Any operation that would drop the count below 2 is blocked. For example, a cluster scale-down is paused if evicting the affected Pods would leave fewer than 2 replicas running.
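The arithmetic behind this can be sketched as a simplified model (this is an illustration of the bookkeeping, not Kubernetes source code):

```python
def disruptions_allowed(healthy_pods: int, min_available: int) -> int:
    """Simplified model of a PDB with minAvailable: the number of
    voluntary evictions permitted is however many healthy Pods exist
    beyond the minAvailable threshold, never less than zero."""
    return max(0, healthy_pods - min_available)

# 5 replicas with minAvailable=2: up to 3 Pods may be evicted voluntarily.
print(disruptions_allowed(5, 2))  # 3
# Once only 2 healthy replicas remain, no further eviction is allowed.
print(disruptions_allowed(2, 2))  # 0
```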

Requirements to Use a "Pod Disruption Budget"

In order to use Pod Disruption Budget (PDB), the requirements are straightforward:

  1. Kubernetes Version: The stable policy/v1 API for PDBs requires Kubernetes 1.21 or later (older clusters expose the deprecated policy/v1beta1 API, which was removed in 1.25).
  2. Pod Labeling: A PDB selects the Pods it protects through a label selector, so label your Pods in a way that lets the PDB's selector match exactly the workload you want to safeguard.
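For example, a Deployment whose Pod template carries a label that a PDB can later select might look like this (the names and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-super-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-super-app
  template:
    metadata:
      labels:
        app: my-super-app   # a PDB selector can match this label
    spec:
      containers:
        - name: app
          image: nginx:1.25   # placeholder image
```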

How to Create a Pod Disruption Budget?

We’ll discuss various methods available to create a Pod Disruption Budget (PDB) object.

Kubectl Create

To promptly apply a Pod Disruption Budget (PDB) to a specific workload, execute the following kubectl command:

kubectl create poddisruptionbudget my-app-pdb --min-available=1 \
--selector=app=my-super-app

Let’s break it down:

  • poddisruptionbudget: The Kubernetes API resource type we aim to create, representing the "Pod Disruption Budget" resource. Alternatively, you can use the short name "pdb."
  • my-app-pdb: The name assigned to the PDB resource.
  • --min-available=1: This flag ensures that a minimum of 1 replica of our application is always available, setting the threshold for disruption.
  • --selector=app=my-super-app: This flag targets the Pods on which the PDB should be applied. In this case, it matches Pods carrying the label "app=my-super-app."

YAML definition

Another method to create Pod Disruption Budget (PDB) objects is by using YAML files to define their configuration.

Let’s take a look at an example of the same PDB discussed in the previous section, specifically utilizing the minAvailable parameter.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-super-app

Executing kubectl apply -f <YAML_FILE> will create the same Pod Disruption Budget (PDB). Now, let's look at an equivalent example using the maxUnavailable parameter; note that for a single-replica workload, maxUnavailable: 0 is equivalent to minAvailable: 1, since both block every voluntary eviction.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: my-super-app
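Both minAvailable and maxUnavailable also accept percentages, which is convenient when the replica count changes (for example under horizontal autoscaling). A sketch:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: "50%"   # at least half of the matched Pods must stay available
  selector:
    matchLabels:
      app: my-super-app
```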

Helm Charts

Here’s an example of how you might define a PDB in a Helm Chart:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: "{{ .Release.Name }}-pdb"
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: "{{ .Release.Name }}"


In this illustration, the Pod Disruption Budget (PDB) is configured to maintain a minimum of two Pods labeled app: {{ .Release.Name }} accessible during voluntary disruptions.

Kubernetes endeavors to adhere to the PDB guidelines when executing operations that could render the application unavailable. For instance, node drains and cluster autoscaler scale-downs will refuse to evict Pods when doing so would violate the budget.

It’s crucial to recognize that PDBs don’t ensure a constant number or percentage of available Pods. In instances of unforeseen disruptions or insufficient cluster resources to schedule a new Pod after a node failure, the count of available Pods may drop below the specified threshold.

When integrating PDBs into your Helm Chart, it’s essential to confirm that the labels in the selector field align with the labels of the Pods you intend to safeguard.
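One common pattern is to drive the budget from values.yaml so each release can tune it. Here is a sketch; the .Values.pdb keys are an assumption for illustration, not part of any standard chart:

```yaml
# values.yaml
pdb:
  enabled: true
  minAvailable: 2

# templates/pdb.yaml
{{- if .Values.pdb.enabled }}
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: "{{ .Release.Name }}-pdb"
spec:
  minAvailable: {{ .Values.pdb.minAvailable }}
  selector:
    matchLabels:
      app: "{{ .Release.Name }}"
{{- end }}
```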

Verify PDB is Created and Applied

Let's begin by listing our Pod Disruption Budget (PDB) objects. We expect to see one, specifically the one crafted in the preceding section, named "my-app-pdb."

$ kubectl get pdb

NAME        MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
my-app-pdb         1              0                      0     

As configured, the PDB requires a minimum of 1 available replica. Since the Deployment runs with a single replica, the number of allowed disruptions is 0: evicting the only Pod would drop availability below the threshold.
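For more detail, the PDB's status subresource shows the controller's bookkeeping. With one replica and minAvailable: 1, running kubectl get pdb my-app-pdb -o yaml would show a status section roughly like this (the exact values below are illustrative):

```yaml
status:
  currentHealthy: 1      # healthy Pods currently matched by the selector
  desiredHealthy: 1      # minimum healthy Pods required by the budget
  disruptionsAllowed: 0  # voluntary evictions currently permitted
  expectedPods: 1        # total Pods the controller expects to count
  observedGeneration: 1
```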

How to Test a Kubernetes PDB?

To truly grasp the effectiveness of Pod Disruption Budgets (PDB), the most insightful approach is to actively test it across various scenarios where its role is to safeguard the application from falling below a specified number of replicas.

Kubernetes Node Drain

We’ll kick off by performing a node drain, and not just any node—specifically the node where our application replicas are currently running. Node draining involves relocating all Pods from a designated node after marking it as “cordoned,” indicating that no new Pods can be scheduled on that node.

Assuming we’ve executed the command

kubectl get po -o wide | grep -i my-super-app 

we can see that our Pod is running on the node "teckbootcamps-node". Let's proceed with draining that node.

$ kubectl drain teckbootcamps-node --ignore-daemonsets

node/teckbootcamps-node cordoned

That’s a positive beginning. Initially, we observe that our node is cordoned, signifying that no new workloads will be assigned to it. Let’s proceed to examine the subsequent output for further insights.

evicting pod default/last-app
evicting pod default/my-super-app
evicting pod default/funny-app

Fascinating. It’s noteworthy that all Pods on the node are slated for eviction. However, it’s crucial not to misinterpret this as PDB failing to function.

This is simply a notification of its intended actions. Let’s proceed to uncover what unfolds next.

evicting pod default/my-super-app
error when evicting pods/"my-super-app" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
evicting pod default/jkog-cc8457d4-pkhzs
error when evicting pods/"my-super-app" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.


Here comes the PDB to the rescue! Kubernetes faces a roadblock in evicting the targeted Pod (the one specified in our recently created PDB).

It assesses the configured PDB and deduces that evacuating this Pod would result in the available number of Pods dropping from 1 to 0, falling below the defined threshold of minAvailable=1.

Kubernetes Node Pool Upgrade

Let’s put PDB to the test within a different workflow—specifically, upgrading a node pool in a GKE cluster on Google Cloud Platform (GCP) with a single node and a minAvailable setting of 1. Typically, in such a process, nodes are marked as cordoned to prevent new workloads from being scheduled on them.

Subsequently, a drain operation is applied to shift workloads to the new node featuring the updated Kubernetes version. In theory, PDB should intervene, as this scenario implies the reduction of replicas to 0 when evicting Pods from one node to another. Let’s examine its performance in this context.

Using the gcloud CLI:

gcloud container clusters upgrade CLUSTER_NAME --node-pool=NODE_POOL_NAME --cluster-version VERSION

The outcome? No upgrade! Well, not exactly. At first, your workload won't move to the new node, effectively remaining the sole occupant of the old node (assuming there are no PDBs on other workloads). GCP reports that the node cannot be drained because evicting the remaining Pod would violate the PDB.

Is Pod Disruption Budget (PDB) the best solution to ensure continuous operation of your application?

In short, no. Pod Disruption Budget (PDB) isn’t a one-size-fits-all solution to ensure your application runs without interruption. It’s not foolproof technically, and sometimes, additional methods are necessary to guarantee continuous and proper operation of your application. Let’s explore some examples.

Consider a straightforward scenario: you have a Pod named “my-cool-app” with one replica, and a PDB is applied with minAvailable=1, indicating there should always be one running replica, disallowing disruptions to the pod.

Now, if you run kubectl delete po my-cool-app, what do you think will happen? If your answer is "it will be deleted," you're correct. A PDB won't prevent the deletion, because kubectl delete removes the Pod directly instead of going through the Eviction API, and PDBs only constrain evictions. Direct deletion by an administrator therefore bypasses the budget entirely.
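The underlying distinction is between deleting a Pod directly and requesting an eviction through the Eviction API, which is what kubectl drain uses and the only path a PDB guards. On a recent cluster you can exercise the Eviction API yourself by POSTing an Eviction object, for example with kubectl create -f eviction.json --raw /api/v1/namespaces/default/pods/my-cool-app/eviction (reusing the example Pod name from above):

```json
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {
    "name": "my-cool-app",
    "namespace": "default"
  }
}
```

Unlike the direct delete, this request is rejected for as long as granting it would violate the budget.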

Pitfalls of Kubernetes PDB

PDB, designed to ensure continuous application uptime, may hinder certain operations. For instance, if you attempt to drain a node, PDB might block the operation, leaving applications on the node that can’t be evicted due to PDB restrictions. While PDB’s purpose is to maintain uninterrupted application operation, if not planned properly, it could disrupt existing processes.

Although PDB doesn’t entirely block some operations, it can introduce delays. Consider the example of upgrading Kubernetes versions in GCP’s GKE node pool. Initially, PDB may delay node draining, but eventually, the operation proceeds, albeit with an hour’s delay. So, while PDB won’t prevent application downtime, it does delay the node pool upgrade process.

PDB’s impact extends to your cluster’s ability to scale down. If different applications run on two nodes and Kubernetes could consolidate them on a single node for scaling down, PDB prevents this to avoid disruptions. This protection, however, comes at the cost of higher cluster expenses for maintaining application reliability.


Conclusion

In conclusion, we’ve explored the concept of “Pod Disruption” and its counterpart, the “Pod Disruption Budget” (PDB). We discussed the prerequisites for implementing PDB and delved into various methods for creating it, including using kubectl create, YAML definitions, and Helm Charts. The verification process to ensure the successful creation and application of PDB was also covered.

We then moved on to testing PDB in real-world scenarios, such as Kubernetes node draining and node pool upgrades. While PDB is a valuable tool for maintaining application availability, it’s important to acknowledge that it may not be the ultimate solution for continuous operation, and we highlighted some pitfalls associated with its usage in Kubernetes environments.

Author

  • Mohamed BEN HASSINE

Mohamed BEN HASSINE is a hands-on Cloud Solution Architect based in France. He has been working with Java, Web, API, and Cloud technologies for over 12 years and is still eager to learn new things. He currently works as a Cloud/Application Architect in Paris, designing cloud-native solutions and APIs (REST, gRPC) using technologies such as GCP, Kubernetes, Apigee, Java, and Python.
