How TencentOS “Ruyi” Solves Page‑Cache Overuse in Container Environments
This article explains the challenges of uncontrolled page‑cache growth in containerized workloads, reviews community attempts to limit it, and details TencentOS “Ruyi” memory‑QoS solutions—including cgroup‑level page‑cache limits, implementation details, and observed performance effects.
Introduction
TencentOS “Ruyi” is an OS‑side resource isolation solution targeting large‑scale container clusters. It provides QoS for CPU, I/O, memory, and network to improve resource utilization and reduce server costs in mixed online/offline workloads.
Background of Memory Isolation
In container environments each container has a memory quota, but the Linux page cache is not bounded by that quota: it can grow until it consumes free memory, delaying memory allocation for business workloads. Limiting page‑cache usage is therefore critical.
Community Solutions
Various community patches have attempted to limit the page cache, such as capping the proportion of memory the page cache may use (see the LWN article) and limiting negative‑dentry memory (see the LKML discussion). However, many of these proposals were not merged because of concerns about added kernel complexity.
"Ruyi" Memory QoS Design
TencentOS “Ruyi” explores several approaches to control container page‑cache usage.
Approach 1
Implement cgroup‑level dirty_background_ratio/dirty_ratio. This approach was not adopted because it interferes with I/O QoS and does not cover clean (non‑dirty) page cache.
Approach 2
Extend the existing global page‑cache limit to support cgroup‑level limits. A per‑cgroup page counter tracks page‑cache usage; when a new allocation would exceed the limit, the system attempts configurable reclamation before allowing the allocation.
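The accounting described above can be sketched in user-space Python. This is not the TencentOS kernel code; the class name, the reclaim callback, and the exact reclaim policy are all illustrative stand-ins for the per-cgroup page counter the article describes.

```python
# Illustrative model of a per-cgroup page-cache counter: charges are tracked
# against a limit, and a charge that would exceed the limit first attempts
# configurable reclamation. All names here are hypothetical.

class PageCacheCounter:
    def __init__(self, limit_pages, reclaim_cb):
        self.limit = limit_pages      # page-cache limit for this cgroup, in pages
        self.usage = 0                # pages currently charged
        self.reclaim_cb = reclaim_cb  # frees up to N pages, returns pages freed

    def try_charge(self, nr_pages):
        """Charge nr_pages; reclaim first if the limit would be exceeded."""
        if self.usage + nr_pages > self.limit:
            overflow = self.usage + nr_pages - self.limit
            self.usage -= min(self.usage, self.reclaim_cb(overflow))
        if self.usage + nr_pages > self.limit:
            return False              # caller must retry or escalate
        self.usage += nr_pages
        return True

    def uncharge(self, nr_pages):
        self.usage = max(0, self.usage - nr_pages)
```

For example, with a 100-page limit and a reclaim callback that always succeeds, charging 80 then 40 pages forces 20 pages to be reclaimed before the second charge is admitted.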
Non‑direct I/O reads first check the page cache via pagecache_get_page. If the cgroup’s page‑cache quota is exceeded, reclamation is triggered; if reclamation still fails after the configured number of retries, the process is OOM‑killed.
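The retry-then-OOM policy on that path can be sketched as follows. This is a user-space model, not kernel code: charge_with_retries, the reclaim callback, and the retry budget are hypothetical names standing in for the behavior described above.

```python
# Model of the read-path policy: retry reclamation a configured number of
# times before giving up and signaling an OOM condition.

class PageCacheOOM(Exception):
    """Raised when reclamation cannot make room within the retry budget."""

def charge_with_retries(usage, limit, nr_pages, reclaim, max_retries=3):
    """Return the new usage after charging nr_pages, or raise PageCacheOOM."""
    for _ in range(max_retries + 1):
        if usage + nr_pages <= limit:
            return usage + nr_pages   # charge fits; allocation proceeds
        # Over the limit: try to reclaim the overflow before retrying.
        usage -= min(usage, reclaim(usage + nr_pages - limit))
    raise PageCacheOOM("page-cache limit still exceeded after retries")
```

When reclaim frees enough pages the charge eventually succeeds; when reclaim makes no progress, the retry budget is exhausted and the OOM path is taken, mirroring the behavior described above.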
Users can view current page‑cache usage and statistics via memory.events, and control behavior with sysctl parameters such as vm.pagecache_limit_global, vm.pagecache_limit_ignore_dirty, and vm.pagecache_limit_ignore_slab, which work under both cgroup v1 and v2.
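One plausible reading of the ignore_dirty/ignore_slab knobs is that they exclude the corresponding pages from the total compared against the limit; the sketch below models that semantics. The exact accounting in TencentOS may differ, and accounted_pages is an illustrative name, not a real interface.

```python
# Model of how ignore-dirty / ignore-slab flags could affect which pages
# count toward the page-cache limit (assumed semantics, not kernel code).

def accounted_pages(clean, dirty, slab, ignore_dirty=False, ignore_slab=False):
    """Pages counted against the page-cache limit under the given flags."""
    total = clean
    if not ignore_dirty:
        total += dirty    # dirty pages are costly to reclaim (need writeback)
    if not ignore_slab:
        total += slab     # reclaimable slab (e.g. dentries, inodes)
    return total
```

Under this reading, setting ignore_dirty avoids counting pages that would require writeback before they could be reclaimed.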
Implementation Details
(Figures illustrating the architecture and metrics are omitted here.)
Results
Without limits, page‑cache memory continuously grows until it exhausts RAM. With the cgroup‑level limit enabled, page‑cache usage stabilizes at a configurable threshold, preventing OOM situations while still allowing reasonable file I/O performance.
Open Issues
Enabling page‑cache limits trades off some I/O throughput for more predictable memory availability. Users must balance these factors based on workload characteristics.
Tencent Architect
We share technical insights on storage, computing, and access, and explore industry-leading product technologies together.