What is cgroup
Control groups, usually called cgroups, are a Linux kernel feature. They let you organize processes into hierarchical groups and then limit and monitor those groups' use of various resources. The kernel exposes its cgroup interface through a pseudo-filesystem called cgroupfs. Grouping is implemented in the core cgroup kernel code, while resource tracking and limits are implemented in a set of per-resource-type subsystems (memory, CPU, and so on).
cgroups are the underlying technology for containers and the cloud native stack. Both the kubelet and the CRI container runtime must interface with cgroups to enforce resource management for pods and containers, such as CPU and memory requests and limits.
Linux has two versions of cgroup: cgroup v1 and cgroup v2. cgroup v2 is the new generation of the cgroup API.
Kubernetes support for cgroup v2 is officially stable as of Kubernetes v1.25.
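To make the cgroupfs interface a little more concrete, here is a minimal sketch that extracts a process's cgroup v2 path from the contents of /proc/<pid>/cgroup, where a cgroup v2 entry has the form 0::/path. The helper name cgroup_v2_path is hypothetical and exists only for illustration.

```shell
# cgroup_v2_path: read the contents of /proc/<pid>/cgroup on stdin and print
# the cgroup v2 path. A cgroup v2 entry looks like "0::/some/path".
# (cgroup_v2_path is a hypothetical helper name, not a standard tool.)
cgroup_v2_path() {
  sed -n 's/^0:://p'
}

# Usage on a Linux node (prints the cgroup of the current shell):
#   cgroup_v2_path < /proc/self/cgroup
```

On a node that only uses cgroup v1, no line starts with 0:: and the function prints nothing.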
What are the advantages of cgroup v2
cgroup v2 provides a unified control system with enhanced resource management capabilities.
cgroup v2 improves on cgroup v1 in several ways, such as:
- A single, unified hierarchy design in the API
- Safer sub-tree delegation to containers
- Newer features, e.g. Pressure Stall Information (PSI)
- Enhanced resource allocation management and isolation across multiple resources
- Unified accounting for different types of memory allocations (network memory, kernel memory, etc.)
- Accounting for non-immediate resource changes, such as page cache writeback
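As an illustration of the PSI feature mentioned in the list, the following sketch pulls the 10-second average out of PSI data. It assumes the usual PSI line format (some avg10=... avg60=... avg300=... total=...), as found in /proc/pressure/memory or in a cgroup's memory.pressure file. The helper name psi_avg10 is hypothetical.

```shell
# psi_avg10 KIND: read PSI lines on stdin and print the avg10 value for the
# line whose first field is KIND ("some" or "full").
# (psi_avg10 is a hypothetical helper name, not a standard tool.)
psi_avg10() {
  awk -v kind="$1" '$1 == kind { sub(/^avg10=/, "", $2); print $2 }'
}

# Usage on a node with PSI enabled:
#   psi_avg10 some < /proc/pressure/memory
```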
Some Kubernetes features rely specifically on cgroup v2 for better resource management and isolation. For example, the MemoryQoS feature improves memory QoS and relies on cgroup v2 primitives.
Prerequisites for using cgroup v2
cgroup v2 has the following requirements:
- The operating system distribution enables cgroup v2, for example:
  - Ubuntu (since 21.10; 22.04+ recommended)
  - Debian GNU/Linux (since Debian 11 Bullseye)
  - Fedora (since 31)
  - RHEL and RHEL-like distributions (since 9)
  - …
- The Linux kernel is version 5.8 or later
- The container runtime supports cgroup v2, for example:
  - containerd v1.4 and later
  - cri-o v1.20 and later
- The kubelet and the container runtime are configured to use the systemd cgroup driver
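The kernel requirement above can be checked with a short script. This is a minimal sketch that compares only the major and minor numbers of a release string such as the one printed by uname -r. The helper name kernel_at_least is hypothetical.

```shell
# kernel_at_least RELEASE MAJOR MINOR: succeed if RELEASE (for example the
# output of `uname -r`) is at least MAJOR.MINOR.
# (kernel_at_least is a hypothetical helper name, not a standard tool.)
kernel_at_least() {
  release=$1 want_major=$2 want_minor=$3
  major=${release%%.*}          # "5.10.0-21-amd64" -> "5"
  rest=${release#*.}            # "5.10.0-21-amd64" -> "10.0-21-amd64"
  minor=${rest%%[!0-9]*}        # "10.0-21-amd64"   -> "10"
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Usage on a node:
#   kernel_at_least "$(uname -r)" 5 8 && echo "kernel is new enough for cgroup v2"
```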
Use cgroup v2
Note:
This walkthrough uses Debian 11 Bullseye with containerd v1.4 as an example.
Enable and check cgroup v2 for Linux nodes
Debian 11 Bullseye has cgroup v2 enabled by default.
This can be verified by the following command:
stat -fc %T /sys/fs/cgroup/
- For cgroup v2, the output is cgroup2fs.
- For cgroup v1, the output is tmpfs.
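The check above can be wrapped in a small helper for use in node setup scripts. This is only a sketch: the function name cgroup_version_from_fstype is hypothetical, and it only maps the two outputs listed above.

```shell
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup version.
# (cgroup_version_from_fstype is a hypothetical helper name.)
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Usage on a node:
#   cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"
```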
If it is not enabled, add systemd.unified_cgroup_hierarchy=1 to GRUB_CMDLINE_LINUX in /etc/default/grub, then run sudo update-grub.
Note:
On a Raspberry Pi, the standard Raspberry Pi OS does not enable cgroups when installed, and cgroups are needed to start the systemd service. You can enable cgroups by appending cgroup_memory=1 cgroup_enable=memory systemd.unified_cgroup_hierarchy=1 to /boot/cmdline.txt.
Reboot for the change to take effect.
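The GRUB edit described above can be scripted. The sketch below assumes that GRUB_CMDLINE_LINUX is defined on a single double-quoted line, as in the stock Debian /etc/default/grub. The helper name add_grub_param is hypothetical. Try it on a copy of the file first.

```shell
# add_grub_param FILE PARAM: append PARAM to the GRUB_CMDLINE_LINUX="..." line
# of FILE unless it is already there. Assumes a single double-quoted line.
# (add_grub_param is a hypothetical helper name, not a standard tool.)
add_grub_param() {
  file=$1 param=$2
  # Leave the file alone if the parameter is already on that line.
  if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$param"; then
    return 0
  fi
  tmp="$file.tmp"
  sed "s/^GRUB_CMDLINE_LINUX=\"\(.*\)\"/GRUB_CMDLINE_LINUX=\"\1 $param\"/" "$file" > "$tmp" &&
    cat "$tmp" > "$file" &&
    rm -f "$tmp"
}

# Usage (as root), followed by update-grub and a reboot:
#   add_grub_param /etc/default/grub systemd.unified_cgroup_hierarchy=1
```

The GRUB_CMDLINE_LINUX_DEFAULT line is left untouched because the pattern matches only GRUB_CMDLINE_LINUX= exactly.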
The kubelet uses the systemd cgroup driver
kubeadm lets you pass a KubeletConfiguration structure when running kubeadm init. KubeletConfiguration contains the cgroupDriver field, which controls the cgroup driver of the kubelet.
Note: As of v1.22, if the user does not set the cgroupDriver field in KubeletConfiguration, kubeadm init defaults it to systemd.
Here is a minimal example that sets this field explicitly:
# kubeadm-config.yaml
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.21.0
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
Such a configuration file can be passed to the kubeadm command:
kubeadm init --config kubeadm-config.yaml
Note:
kubeadm uses the same KubeletConfiguration for all nodes in the cluster. The KubeletConfiguration is stored in a ConfigMap object in the kube-system namespace.
Running the init, join, and upgrade subcommands causes kubeadm to write the KubeletConfiguration to the file /var/lib/kubelet/config.yaml and pass it to the local node's kubelet.
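To confirm which driver a node's kubelet was actually given, you can read the field back from the generated file. This is a minimal sketch that assumes cgroupDriver appears as an unquoted top-level key, as in the example above. The helper name kubelet_cgroup_driver is hypothetical.

```shell
# kubelet_cgroup_driver FILE: print the value of the top-level cgroupDriver
# field of a kubelet configuration file, or nothing if the field is absent.
# (kubelet_cgroup_driver is a hypothetical helper name, not a standard tool.)
kubelet_cgroup_driver() {
  sed -n 's/^cgroupDriver:[[:space:]]*//p' "$1"
}

# Usage on a node:
#   kubelet_cgroup_driver /var/lib/kubelet/config.yaml
```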
Containerd uses the systemd cgroup driver
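Edit /etc/containerd/config.toml:
[plugins.cri.containerd.runtimes.runc.options]
SystemdCgroup = true
If the file declares version = 2, the same table is named [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options] instead.
Then restart containerd so the change takes effect:
sudo systemctl restart containerd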
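A quick way to check the setting on a node is to look for it in the configuration file. The sketch below only tests that a SystemdCgroup = true line exists somewhere in the file. It does not parse TOML or verify which table the line belongs to. The helper name containerd_uses_systemd_cgroup is hypothetical.

```shell
# containerd_uses_systemd_cgroup FILE: succeed if FILE contains a line that
# sets SystemdCgroup to true. This is a plain text match, not a TOML parser.
# (containerd_uses_systemd_cgroup is a hypothetical helper name.)
containerd_uses_systemd_cgroup() {
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}

# Usage on a node:
#   containerd_uses_systemd_cgroup /etc/containerd/config.toml && echo "systemd cgroup driver enabled"
```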
Upgrade monitoring components to support cgroup v2 monitoring
cgroup v2 uses a different API than cgroup v1, so any application that accesses the cgroup file system directly must be updated to a version that supports cgroup v2. For example:
- Some third-party monitoring and security agents may rely on the cgroup file system. Update these agents to versions that support cgroup v2.
- If you run cAdvisor as a stand-alone DaemonSet to monitor pods and containers, update it to v0.43.0 or later.
- If you run Java applications, JDK 11.0.16 and later or JDK 15 and later is recommended, because these versions fully support cgroup v2.
In summary, Kubernetes support for cgroup v2 is officially stable, and cgroup v2 improves on cgroup v1 with a unified hierarchy, safer sub-tree delegation, newer features such as PSI, and better cross-resource allocation and memory accounting.
When running Kubernetes v1.25 or later, use a Linux distribution and a CRI container runtime that support cgroup v2, and enable cgroup v2 on the nodes so that Kubernetes can use it.