Tencent Cloud PMU Improvements and AMD PerfMon V2: Enhancing KVM Virtualization Performance
Tencent Cloud's PMU enhancements for KVM, including guest PEBS support, the Intel SPR PDIR++ and PDist features, and AMD Zen4 PerfMon V2 with its global-control registers, lower virtualization overhead and improve instruction-level sampling accuracy. The work has been contributed upstream so that it can support performance optimization across the wider cloud ecosystem.
At the recently concluded KVM Forum, the 2023 Global Enterprise KVM Open-Source Contribution Ranking was released. Tencent Cloud ranked among the top contributors for the seventh consecutive year, and its self-developed PMU improvements were highlighted as the KVM Annual Core Breakthrough.
The work builds a foundational framework for performance‑optimisation analysis in the cloud, offering developers reliable quantitative tools to reduce costs and improve efficiency across diverse workloads. Increasingly, cloud providers are enabling the use of PMU tools inside virtual machines to optimise target workloads.
The PMU improvements comprise a series of contributions, including:
Support for Guest PEBS on Intel SPR and newer CPUs (features PDIR++ and PDist).
Fixes for Guest PEBS counter cross-mapping tracking.
Prevention of mixing ordinary counters with PEBS counters in virtualization.
Added checks for host perf subsystem dependencies when KVM enables vPMU.
Removed speculative limits on the maximum number of generic counters.
Applied performance‑event filter mechanisms to emulated instructions.
Deferred interrupt injection for emulated‑instruction overflow.
Introduced a more generic event‑filtering mechanism and user‑space interface.
Extended mask‑checking range for the performance‑event filter.
Fixed desynchronisation when user‑space repeatedly updates vPMU capabilities.
Prohibited Legacy LBR support on Arch LBR machines.
Re‑implemented key functions and execution order for better performance.
Re‑implemented vPMU handling on AMD for greater scalability.
Avoided erroneous release of zero‑sample‑period events for special uses.
Re‑implemented register tracking during vCPU context save/restore.
Temporarily disabled vPMU on hybrid (mixed-core) CPU platforms to avoid random access errors.
Various code refactorings and documentation updates.
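The event-filtering items in the list above rest on a mask-and-match idea: a filter entry names an event select and decides, from the unit mask, whether a guest-programmed event falls inside the filter. The following is a minimal sketch of that idea in Python, not the KVM interface itself. The names `MaskedEntry`, `in_filter` and `event_allowed`, and the simplified allow/deny semantics, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaskedEntry:
    """Hypothetical filter entry; matches when (unit_mask & mask) == match."""
    event_select: int
    mask: int
    match: int
    exclude: bool = False


def in_filter(entries, event_select, unit_mask):
    """True if some include entry matches and no exclude entry matches."""
    included = False
    for entry in entries:
        if entry.event_select != event_select:
            continue
        if (unit_mask & entry.mask) != entry.match:
            continue
        if entry.exclude:
            return False  # an exclude entry overrides any include entry
        included = True
    return included


def event_allowed(entries, event_select, unit_mask, action="allow"):
    """Apply the filter action: 'allow' admits only filtered events,
    'deny' admits everything except filtered events (toy semantics)."""
    hit = in_filter(entries, event_select, unit_mask)
    return hit if action == "allow" else not hit


# Illustrative entries only; the event codes are arbitrary placeholders.
entries = [
    MaskedEntry(0xC0, mask=0xFF, match=0x00),                # exactly umask 0
    MaskedEntry(0xD0, mask=0x00, match=0x00),                # any umask
    MaskedEntry(0xD0, mask=0xFF, match=0x81, exclude=True),  # except 0x81
]
```

One design point the sketch tries to show: a wide include entry combined with a narrow exclude entry describes "every unit mask of this event except one" in two entries, instead of enumerating each permitted unit mask.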
Building on last year's Guest PEBS foundation, Tencent Cloud this year added support for two new Intel SPR capabilities: PDIR++, an enhancement of Precise Distribution of Instructions Retired (PDIR), and the new Precise Distribution (PDist) facility. With both, the PEBS record is generated immediately after the instruction that caused the counter overflow, which eliminates skid (the sliding of a sample onto a later instruction) and improves instruction-level accuracy.
Meanwhile, the rapid growth of AMD‑based VM instances has driven demand for virtualised PMU support. Tencent Cloud contributed AMD Zen4 performance‑monitoring support, including the new AMD PerfMon V2 feature. PerfMon V2 introduces a global‑control register that can enable or disable multiple counters simultaneously, reducing the overhead and interference of performance‑monitoring tools in multi‑counter scenarios.
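To make the saving concrete, here is a toy model in Python that counts intercepted register writes when a profiler repeatedly starts and stops a group of counters. It is a sketch under stated assumptions, not KVM code: every guest write to a PMU register is assumed to cost exactly one VM exit, and the class `ToyVPMU`, the `EN` bit position and the event code are illustrative placeholders.

```python
EN = 1 << 22  # per-counter enable bit in this toy encoding (assumption)


class ToyVPMU:
    """Toy vPMU: each register write is assumed to cause one VM exit."""

    def __init__(self, num_counters):
        self.num_counters = num_counters
        self.event_select = [0] * num_counters
        self.global_ctl = (1 << num_counters) - 1  # nothing gated by default
        self.exits = 0

    def write_event_select(self, idx, value):
        self.exits += 1
        self.event_select[idx] = value

    def write_global_ctl(self, mask):
        self.exits += 1
        self.global_ctl = mask

    def counting(self, idx):
        # A counter runs only if its own enable bit and its global bit are set.
        return bool(self.event_select[idx] & EN) and bool((self.global_ctl >> idx) & 1)


def run_per_counter(num_counters, toggles, event=0xC0):
    """Start/stop by rewriting every event-select register each time."""
    pmu = ToyVPMU(num_counters)
    for i in range(num_counters):
        pmu.write_event_select(i, event)  # configure, still disabled
    enabled = False
    for _ in range(toggles):
        enabled = not enabled
        for i in range(num_counters):
            pmu.write_event_select(i, event | (EN if enabled else 0))
    return pmu.exits


def run_global(num_counters, toggles, event=0xC0):
    """Configure once, then start/stop with a single global-control write."""
    pmu = ToyVPMU(num_counters)
    pmu.write_global_ctl(0)  # gate all counters off first
    for i in range(num_counters):
        pmu.write_event_select(i, event | EN)
    all_mask = (1 << num_counters) - 1
    enabled = False
    for _ in range(toggles):
        enabled = not enabled
        pmu.write_global_ctl(all_mask if enabled else 0)
    return pmu.exits


print(run_per_counter(6, 10))  # 66 modelled exits
print(run_global(6, 10))       # 17 modelled exits
```

In this model the per-counter scheme costs one exit per counter per toggle, while the global scheme costs one exit per toggle regardless of how many counters are in the group. Real exit counts depend on which registers the hypervisor intercepts.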
With AMD PerfMon V2, PMU software can configure several event-selection registers first, then start or stop all of the counters with a single global-control write, collapsing multiple VM exits into one. Additional global STATUS and OVF_CTRL registers report and clear overflow state per counter, so that measurements in the guest stay close to bare-metal accuracy at a much lower virtualization cost. The implementation is slated for inclusion in Linux kernel 6.5, and Tencent Cloud engineers have refactored the previous AMD vPMU code to share virtualization logic with Intel, easing upstream maintenance.
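The overflow path can be sketched in the same toy style: read one global status value, handle each counter whose bit is set, then acknowledge all of them with one write to a clear register. The class and function names below are hypothetical, and the write-to-clear behaviour is an assumption about how the register the text calls OVF_CTRL is used.

```python
class ToyGlobalStatus:
    """Toy global status register with a write-1-to-clear companion."""

    def __init__(self, status=0):
        self.status = status

    def read_status(self):
        return self.status

    def write_status_clear(self, mask):
        # Bits written as 1 clear the matching overflow bits (assumption).
        self.status &= ~mask


def handle_overflows(regs, handler):
    """Call handler(idx) for each overflowed counter, then clear them all
    with a single write. Returns the list of handled counter indices."""
    pending = regs.read_status()
    clear_mask = 0
    handled = []
    idx = 0
    while pending:
        if pending & 1:
            handler(idx)
            handled.append(idx)
            clear_mask |= 1 << idx
        pending >>= 1
        idx += 1
    if clear_mask:
        regs.write_status_clear(clear_mask)
    return handled


regs = ToyGlobalStatus(status=0b100101)  # counters 0, 2 and 5 overflowed
seen = []
handled = handle_overflows(regs, seen.append)
```

Reading one status value and issuing one clear write keeps the number of register accesses independent of how many counters overflowed, which is the property that matters when each access may be intercepted.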
Through these layered innovations, Tencent Cloud aims to lower virtualization overhead, boost performance, and make advanced low‑level capabilities transparent to customers, while continuing to contribute to the open‑source virtualization ecosystem.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.