How to enable cgroup2 support in K8s?


What is cgroup

Control groups, often called cgroups, are a feature of the Linux kernel. They allow processes to be organized into hierarchical groups whose use of various resources can then be limited and monitored. The kernel's cgroup interface is provided through a pseudo file system called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and throttling are implemented in a set of subsystems, one per resource type (memory, CPU, etc.).

cgroups are a foundational technology for containers and cloud-native systems. Both the kubelet and the container runtime (CRI) need to interface with cgroups to enforce resource management for Pods and containers, namely CPU and memory requests and limits.
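To make the link between a Pod's limits and the cgroup interface concrete, here is a minimal sketch of how a CPU limit and a memory limit map to the values held in the cgroup v2 files `cpu.max` and `memory.max`. The function names are hypothetical, and the 100 ms CFS period is an assumption (it is the usual default); real runtimes apply extra rules that this sketch leaves out.

```python
from typing import Optional

# Assumption: the default CFS period of 100 ms, expressed in microseconds.
CFS_PERIOD_US = 100_000


def cpu_max_value(limit_millicores: Optional[int], period_us: int = CFS_PERIOD_US) -> str:
    """Return the "<quota> <period>" string used by the cgroup v2 cpu.max file."""
    if limit_millicores is None:
        # No CPU limit: cgroup v2 uses the literal word "max" as the quota.
        return f"max {period_us}"
    # 1000 millicores correspond to one full period of CPU time.
    quota_us = limit_millicores * period_us // 1000
    return f"{quota_us} {period_us}"


def memory_max_value(limit_bytes: Optional[int]) -> str:
    """Return the string used by the cgroup v2 memory.max file."""
    return "max" if limit_bytes is None else str(limit_bytes)


if __name__ == "__main__":
    print(cpu_max_value(500))                   # limit of 500m
    print(memory_max_value(128 * 1024 * 1024))  # limit of 128Mi
```

A limit of `500m` therefore becomes half of each period, and a container without limits gets `max` in both files.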

There are two cgroup versions in Linux: cgroup v1 and cgroup v2. cgroup v2 is the newer generation of the cgroup API.

cgroup v2 support has been stable in Kubernetes since v1.25.

What are the advantages of cgroup v2?

cgroup v2 provides a unified control system with enhanced resource management capabilities.

cgroup v2 has many improvements over cgroup v1, such as:

  • Single unified hierarchical design across APIs
  • Safer subtree delegation to containers
  • Updated features such as Pressure Stall Information (PSI)
  • Enhanced resource allocation management and isolation across multiple resources
    • Unified accounting for different types of memory allocation (network memory, kernel memory, etc.)
    • Accounting for non-immediate resource changes, such as page cache writebacks
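As an illustration of the PSI feature listed above, the following sketch parses the text format that the kernel exposes in pressure files such as `memory.pressure` inside a cgroup v2 directory. The function name `parse_psi` is hypothetical and the sample text is made up for the example.

```python
def parse_psi(text: str) -> dict:
    """Parse PSI text into {"some": {...}, "full": {...}}.

    Each line looks like:
        some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        kind, *fields = line.split()
        metrics = {}
        for field in fields:
            key, value = field.split("=", 1)
            # "total" is a counter in microseconds; the averages are percentages.
            metrics[key] = int(value) if key == "total" else float(value)
        result[kind] = metrics
    return result


if __name__ == "__main__":
    sample = (
        "some avg10=1.50 avg60=0.75 avg300=0.10 total=12345\n"
        "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
    )
    print(parse_psi(sample)["some"]["avg10"])
```

On a real node the input would come from reading a pressure file; that path depends on the cgroup layout and is left out here.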

Some Kubernetes features specifically use cgroups v2 to enhance resource management and isolation. For example, the MemoryQoS feature improves memory QoS and relies on cgroup v2 primitives.

Prerequisites for using cgroup v2

cgroup v2 has the following requirements:

  • Operating system distribution enables cgroup v2
    • Ubuntu (starting from 21.10, 22.04+ recommended)
    • Debian GNU/Linux (starting from Debian 11 Bullseye)
    • Fedora (starting from 31)
    • RHEL and RHEL-like distributions (starting from 9)
  • Linux kernel is 5.8 or higher
  • The container runtime supports cgroup v2. For example:
    • containerd v1.4 and later
    • cri-o v1.20 and later
  • The kubelet and container runtime are configured to use the systemd cgroup driver
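The kernel requirement above can be checked with a few lines of code. This is a minimal sketch: the function name `meets_kernel_requirement` is hypothetical, and it only compares the major and minor numbers of a release string such as the one printed by `uname -r`.

```python
import platform
import re

# Minimum kernel version stated in the requirements above.
MIN_KERNEL = (5, 8)


def meets_kernel_requirement(release: str, minimum=MIN_KERNEL) -> bool:
    """Return True if a kernel release string is at least `minimum`."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    # Tuples compare element by element, so (5, 10) >= (5, 8) is True.
    return version >= minimum


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    print(meets_kernel_requirement(platform.release()))
```

Distribution kernels may backport features, so treat the result as a quick sanity check rather than a guarantee.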

Using cgroup v2

📝 Notes:

Here we take Debian 11 Bullseye + containerd v1.4 as an example.

Enable and check cgroup v2 for Linux nodes

Debian 11 Bullseye has cgroup v2 enabled by default.

This can be verified by the following command:

stat -fc %T /sys/fs/cgroup/
  • For cgroup v2, the output is cgroup2fs.
  • For cgroup v1, the output is tmpfs.
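The same check can be scripted, for example when auditing many nodes. The sketch below works on the text of `/proc/mounts` rather than calling `stat`; note that the mount table reports the type as `cgroup2`, while `stat` prints `cgroup2fs`. The function name is hypothetical.

```python
def cgroup_version_from_mounts(mounts_text: str, mount_point: str = "/sys/fs/cgroup") -> str:
    """Return "v2", "v1" or "unknown" based on the mount at `mount_point`."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[1] == mount_point:
            if fields[2] == "cgroup2":
                return "v2"
            if fields[2] == "tmpfs":
                return "v1"
            return "unknown"
    return "unknown"


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        print(cgroup_version_from_mounts(handle.read()))
```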

If it is not enabled, add systemd.unified_cgroup_hierarchy=1 to GRUB_CMDLINE_LINUX in /etc/default/grub, then run sudo update-grub and reboot the node.
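If this change has to be applied to several nodes, the edit to /etc/default/grub can be scripted. The following is a minimal sketch with a hypothetical function name; it assumes that GRUB_CMDLINE_LINUX is defined on a single line with a double-quoted value, and it only transforms text, so writing the file back and running update-grub remain separate steps.

```python
import re

PARAM = "systemd.unified_cgroup_hierarchy=1"

# Matches GRUB_CMDLINE_LINUX="..." but not GRUB_CMDLINE_LINUX_DEFAULT="...".
_LINE = re.compile(r'^(GRUB_CMDLINE_LINUX=")([^"]*)(")[ \t]*$', re.MULTILINE)


def add_kernel_param(grub_text: str, param: str = PARAM) -> str:
    """Return grub_text with `param` present in GRUB_CMDLINE_LINUX."""

    def _append(match):
        args = match.group(2).split()
        if param not in args:  # keep the edit idempotent
            args.append(param)
        return match.group(1) + " ".join(args) + match.group(3)

    new_text, count = _LINE.subn(_append, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX not found")
    return new_text


if __name__ == "__main__":
    example = 'GRUB_CMDLINE_LINUX_DEFAULT="quiet"\nGRUB_CMDLINE_LINUX=""\n'
    print(add_kernel_param(example))
```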

📝 Notes: On a Raspberry Pi, the standard Raspberry Pi OS installation does not start with cgroups enabled, and cgroups are required to start the systemd service. They can be enabled by appending cgroup_memory=1 cgroup_enable=memory systemd.unified_cgroup_hierarchy=1 to /boot/cmdline.txt. Reboot for the change to take effect.

kubelet uses systemd cgroup driver

kubeadm supports passing a KubeletConfiguration structure when executing kubeadm init. KubeletConfiguration contains a cgroupDriver field that can be used to control the cgroup driver of the kubelet.

Note: Starting with v1.22, if the user does not set the cgroupDriver field in KubeletConfiguration, kubeadm init sets it to the default value systemd.

Here is a minimal example where this field is explicitly configured:

# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd

Such a configuration file can be passed to the kubeadm command:

kubeadm init --config kubeadm-config.yaml

Note:

Kubeadm uses the same KubeletConfiguration for all nodes in the cluster. The KubeletConfiguration is stored in a ConfigMap object under the kube-system namespace.

Executing subcommands such as init, join, and upgrade causes kubeadm to write the KubeletConfiguration to the file /var/lib/kubelet/config.yaml, which is then passed to the kubelet on the local node.
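To confirm which driver a node ended up with, the cgroupDriver field can be read back from the text of /var/lib/kubelet/config.yaml. The sketch below is deliberately naive: it uses a hypothetical function name and a line-based match instead of a YAML parser, which is enough for a top-level scalar field but not for arbitrary YAML.

```python
import re
from typing import Optional

# Top-level "cgroupDriver: <value>", optionally quoted and followed by a comment.
_DRIVER = re.compile(r"^cgroupDriver:\s*['\"]?([A-Za-z]+)['\"]?\s*(#.*)?$")


def kubelet_cgroup_driver(config_text: str) -> Optional[str]:
    """Return the cgroupDriver value from kubelet config text, or None if unset."""
    for line in config_text.splitlines():
        match = _DRIVER.match(line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    example = (
        "kind: KubeletConfiguration\n"
        "apiVersion: kubelet.config.k8s.io/v1beta1\n"
        "cgroupDriver: systemd\n"
    )
    print(kubelet_cgroup_driver(example))
```

A result of None means the field is not set explicitly in the file.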

containerd uses the systemd cgroup driver

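Edit /etc/containerd/config.toml:

[plugins.cri.containerd.runtimes.runc.options]
    SystemdCgroup = true

The table name above uses the short plugin name of the version 1 configuration format. If the file declares version = 2, the same table is written as [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]. Restart containerd after the change, for example with sudo systemctl restart containerd.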
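A quick way to verify the setting on a node is to scan the configuration text for SystemdCgroup inside the runc options table. This sketch uses a hypothetical function name and a simple line scanner rather than a full TOML parser; it accepts any table whose name ends in containerd.runtimes.runc.options, whichever plugin name prefix the file uses.

```python
import re


def runc_uses_systemd_cgroup(config_text: str) -> bool:
    """Return True if SystemdCgroup = true is set in the runc options table."""
    in_runc_options = False
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and indentation
        if line.startswith("["):
            # A new table header: remember whether it is the runc options table.
            name = line.rstrip("]")
            in_runc_options = name.endswith("containerd.runtimes.runc.options")
        elif in_runc_options:
            match = re.fullmatch(r"SystemdCgroup\s*=\s*(true|false)", line)
            if match:
                return match.group(1) == "true"
    return False


if __name__ == "__main__":
    example = "[plugins.cri.containerd.runtimes.runc.options]\n    SystemdCgroup = true\n"
    print(runc_uses_systemd_cgroup(example))
```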

Upgrade monitoring components to support cgroup v2 monitoring

cgroup v2 uses a different API than cgroup v1, so if any applications directly access the cgroup file system, these applications need to be updated to support cgroup v2. For example:

  • Some third-party monitoring and security agents may rely on the cgroup file system. You will need to update these agents to versions that support cgroup v2.
  • If you run cAdvisor as a standalone DaemonSet to monitor Pods and containers, you need to update it to v0.43.0 or higher.
  • If you run Java applications, use JDK 11.0.16 or later, or JDK 15 or later, which fully support cgroup v2.
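The reason such tools need updating is that file names and layout differ between the two versions. As a sketch of what version-aware code looks like, the function below reads a memory limit from either layout: memory.max for cgroup v2, or memory/memory.limit_in_bytes for cgroup v1. The function name is hypothetical, and it assumes the process sees its own cgroup at the given root, as is typical inside a container.

```python
from pathlib import Path
from typing import Optional


def read_memory_limit(cgroup_root: Path) -> Optional[int]:
    """Return the memory limit in bytes, or None if cgroup v2 reports no limit."""
    v2_file = cgroup_root / "memory.max"
    v1_file = cgroup_root / "memory" / "memory.limit_in_bytes"
    if v2_file.exists():
        raw = v2_file.read_text().strip()
        # cgroup v2 writes the literal word "max" when no limit is set.
        return None if raw == "max" else int(raw)
    if v1_file.exists():
        # cgroup v1 reports "unlimited" as a very large number; returned as-is.
        return int(v1_file.read_text().strip())
    raise FileNotFoundError(f"no memory limit file found under {cgroup_root}")


if __name__ == "__main__":
    try:
        print(read_memory_limit(Path("/sys/fs/cgroup")))
    except FileNotFoundError as error:
        print(error)
```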

Complete 🎉🎉🎉

Conclusion

cgroup v2 support in Kubernetes has been stable since v1.25. Compared with cgroup v1, cgroup v2 has the following advantages:

  • Single unified hierarchical design across APIs
  • Safer subtree delegation to containers
  • Updated features such as Pressure Stall Information (PSI)
  • Enhanced resource allocation management and isolation across multiple resources
    • Unified accounting for different types of memory allocation (network memory, kernel memory, etc.)
    • Accounting for non-immediate resource changes, such as page cache writebacks

When running Kubernetes v1.25 or later, it is recommended to use a Linux distribution and a container runtime that support cgroup v2, and to enable cgroup v2 on the nodes.

Author

  • Mohamed BEN HASSINE

    Mohamed BEN HASSINE is a hands-on Cloud Solution Architect based in France. He has been working on Java, web, API, and cloud technologies for over 12 years and is still keen to learn new things. He currently works as a Cloud / Application Architect in Paris, designing cloud-native solutions and APIs (REST, gRPC) using cutting-edge technologies (GCP, Kubernetes, APIGEE, Java, Python).
