Linux CPU Power Management: P‑states and C‑states in Kernel 2.6 and 4.18

This article explains how Linux kernels 2.6 and 4.18 manage CPU power‑saving P‑states and C‑states, compares energy‑saving and performance modes, and details the underlying idle drivers, BIOS settings, and performance impacts on Intel Xeon E5‑2630 v4 CPUs.

58 Tech

In the previous section we examined how CPU performance states and the MONITOR/MWAIT instructions can affect Linux performance; this article now dives into how the 2.6 and 4.18 Linux kernels handle those instructions and power‑saving states.

Linux controls CPU efficiency mainly through P‑states (frequency and voltage operating points, governed by policies such as performance and powersave) and C‑states (idle states of increasing depth). The discussion here focuses on x86.

P‑states are governed by policies such as performance and powersave, which limit the maximum CPU frequency. The BIOS may allow the OS to manage these states, and tools like cpupower (installed via cpupowerutils on 2.6 kernels) read and set the values under /sys/devices/system/cpu/cpu*/cpufreq/. Linux computes the next frequency with the formula 1.25 × max_frequency × CPU_utilization, maps the result to a P‑state level, and instructs the CPU to adjust its frequency and voltage accordingly.
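The frequency formula above can be sketched in a few lines. This is only an illustration of the arithmetic, not the kernel's code: the function names and the table of available frequencies are hypothetical, and the mapping step (pick the lowest available frequency that covers the target, capped at the highest) is an assumption about how the result is matched to a P‑state level.

```python
from bisect import bisect_left
from typing import Sequence


def next_frequency_khz(max_khz: int, utilization: float) -> float:
    """Target frequency from the article's formula: 1.25 * max * utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return 1.25 * max_khz * utilization


def pick_pstate_khz(available_khz: Sequence[int], target_khz: float) -> int:
    """Lowest available frequency that covers the target, capped at the highest.

    The mapping rule is an assumption made for this sketch.
    """
    freqs = sorted(available_khz)
    idx = bisect_left(freqs, target_khz)
    return freqs[min(idx, len(freqs) - 1)]


if __name__ == "__main__":
    # Hypothetical frequency steps in kHz, for illustration only.
    steps = [1_200_000, 1_500_000, 1_800_000, 2_200_000]
    for util in (0.0, 0.5, 1.0):
        target = next_frequency_khz(max(steps), util)
        print(util, target, pick_pstate_khz(steps, target))
```

The 1.25 factor means the target reaches the maximum frequency at 80 % utilization, which leaves headroom before the CPU is saturated.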

Recommended BIOS configurations are as follows. For latency‑sensitive workloads, disable P‑states. Otherwise enable them, set the policy to performance, turn on Turbo Boost, and let the BIOS manage power.

In the BIOS, C‑states can only be enabled or disabled; the deeper levels include C1E, C3, and C6. Even when the BIOS disables C‑states, the OS may still be able to enter deep sleep states, especially when MONITOR/MWAIT is enabled.

In kernel 2.6 the idle subsystem offers four built‑in idle methods (poll_idle, mwait_idle, c1e_idle, default_idle) and a driver‑based method (cpuidle_idle_call). The idle method can be forced via the kernel command line, e.g., idle=poll. Two drivers are available: intel_idle (requires an Intel CPU with MWAIT support) and acpi_idle (requires deeper sleep support). When a driver is active, the governor (usually menu) selects the next C‑state based on each state's target residency time and wake‑up latency, comparing the latency against the system latency request. That request defaults to 2000 seconds (2,000,000,000 µs), which is effectively no constraint, and can be lowered through /dev/cpu_dma_latency or the tuned service.
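The selection rule described above can be sketched as follows. This is a simplified model, not the menu governor's actual code: the real governor also predicts the idle duration from timers and recent history, which is reduced here to a single predicted_idle_us input. The IdleState type, the function name, and the latency and residency numbers are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class IdleState:
    name: str
    exit_latency_us: int      # time needed to wake up from this state
    target_residency_us: int  # minimum idle time for the state to pay off


def select_state(states: Sequence[IdleState],
                 predicted_idle_us: int,
                 latency_limit_us: int) -> IdleState:
    """Pick the deepest state that fits both the predicted idle time
    and the system latency request. states[0] is the shallowest state
    and serves as the fallback."""
    chosen = states[0]
    for state in states[1:]:
        if state.target_residency_us > predicted_idle_us:
            continue  # would not stay idle long enough to be worthwhile
        if state.exit_latency_us > latency_limit_us:
            continue  # waking up would exceed the latency request
        chosen = state
    return chosen


if __name__ == "__main__":
    # Illustrative values only, ordered from shallow to deep.
    table = [
        IdleState("C1", 2, 2),
        IdleState("C1E", 10, 20),
        IdleState("C3", 40, 100),
        IdleState("C6", 133, 400),
    ]
    print(select_state(table, predicted_idle_us=500, latency_limit_us=10**9).name)
    print(select_state(table, predicted_idle_us=500, latency_limit_us=20).name)
```

With a long predicted idle period and no meaningful latency limit the sketch picks the deepest state; tightening the limit pushes the choice back toward shallow states even when the CPU is expected to idle for a long time.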

The cpuidle_idle_call routine then asks the driver to enter the chosen state using mwait, halt, or inb (an I/O port read). Disabling MONITOR/MWAIT forces the CPU to stay in C1.
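To check which idle driver is active and which C‑states it exposes on a given machine, the cpuidle sysfs tree can be read directly. The sketch below assumes the usual layout under /sys/devices/system/cpu (a cpuidle/current_driver file and per‑CPU cpuidle/stateN directories containing name, latency, and usage). The function name is hypothetical, and the root path is a parameter so the code can be pointed at a copy of the tree.

```python
from pathlib import Path


def _read(path: Path) -> str:
    return path.read_text().strip()


def cpuidle_summary(sysfs_cpu_root: str = "/sys/devices/system/cpu",
                    cpu: int = 0) -> dict:
    """Return the active cpuidle driver and the idle states of one CPU."""
    root = Path(sysfs_cpu_root)

    driver_file = root / "cpuidle" / "current_driver"
    driver = _read(driver_file) if driver_file.exists() else "none"

    # state0, state1, ... sorted numerically (state10 after state9).
    state_dirs = sorted(
        (root / f"cpu{cpu}" / "cpuidle").glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = [
        {
            "name": _read(d / "name"),
            "latency_us": int(_read(d / "latency")),  # exit latency
            "usage": int(_read(d / "usage")),         # times entered
        }
        for d in state_dirs
    ]
    return {"driver": driver, "states": states}


if __name__ == "__main__":
    summary = cpuidle_summary()
    print("driver:", summary["driver"])
    for s in summary["states"]:
        print(s["name"], s["latency_us"], s["usage"])
```

If only shallow states appear in the list, or the usage counters of the deeper states never increase, the CPU is effectively staying in C1.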

Performance tests on an Intel Xeon E5‑2630 v4 show that the energy‑saving mode (with higher turbo frequency and deep C6 sleep) can be 20 % faster than the performance mode, which often stays in C1. However, under high‑concurrency workloads (e.g., multiple threads running perf stat bc ), the performance mode demonstrates better latency and stability.

In kernel 4.18 the idle handling is largely the same, but cpuidle_idle_call is built‑in and primarily uses the intel_idle driver, which again depends on MONITOR/MWAIT being enabled. The menu governor still selects states. The latency request is still set through /dev/cpu_dma_latency, and the active requests can additionally be inspected under /sys/kernel/debug/pm_qos/cpu_dma_latency. Apart from the driver change, the behavior mirrors that of 2.6.
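A process can place its own latency request by writing a 32‑bit microsecond value to the PM QoS device file and keeping the file descriptor open; the request is dropped when the descriptor is closed. The sketch below assumes the /dev/cpu_dma_latency device named earlier in the article is present and writable (normally root only). The helper name hold_cpu_latency is hypothetical, and the path is a parameter.

```python
import os
import struct
from contextlib import contextmanager


@contextmanager
def hold_cpu_latency(max_latency_us: int, path: str = "/dev/cpu_dma_latency"):
    """Hold a wake-up latency request for the duration of the with-block."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # The value is written as a native 32-bit signed integer.
        os.write(fd, struct.pack("i", max_latency_us))
        yield
    finally:
        # Closing the descriptor withdraws the request.
        os.close(fd)


# Usage (requires root on a real system):
#
#     with hold_cpu_latency(0):
#         run_latency_sensitive_work()   # hypothetical workload function
```

A request of 0 µs keeps the governor from choosing any state with a non‑zero exit latency, which is useful for comparing latency with and without deep C‑states without rebooting into different BIOS settings.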

The article concludes that the main difference in 4.18 is the driver swap, making MONITOR/MWAIT the key factor for entering deep sleep, and promises a follow‑up section to reproduce the 4.18 scenario.
