Master Ceph Cache Tiering: Principles, Modes, and Deployment Guide
This article explains the fundamentals of Ceph cache tiering, covering cache and buffer concepts, the two cache‑pool modes (write‑back and read‑forward), step‑by‑step deployment, configuration parameters, and proper procedures for creating, tuning, and safely removing cache pools.
Cache Pool Principle
Cache (read‑optimised) and Buffer (write‑optimised) are techniques used to bridge speed gaps between devices. Cache stores recently read data to improve hit rate, while Buffer aggregates writes to reduce fragmentation and seek overhead.
In modern systems, caches appear at many levels: CPU L1/L2/L3, page cache, and even volatile disk caches (typically 16‑64 MB).
Cache targets hot (frequently read) data; its performance metric is the hit rate. Buffer collects scattered writes and flushes them in batches, improving write performance.
Generally, Cache optimises reads, Buffer optimises writes.
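As a toy illustration of the hit-rate metric (nothing Ceph-specific; the access stream, cache size, and FIFO eviction here are made up for the example), a fixed-size read cache can be simulated in a few lines of shell:

```shell
# Toy sketch: count cache hits for a stream of object reads using a
# fixed-size "cache" kept in an awk associative array (FIFO eviction,
# capacity 3). The object names a..d are arbitrary.
printf '%s\n' a b a c a b d a | awk '
{
  if ($1 in cache) { hits++ }                  # read served from cache
  else {
    order[++n] = $1; cache[$1] = 1             # miss: fetch and insert
    if (n > 3) { delete cache[order[n-3]] }    # evict oldest beyond capacity
  }
  total++
}
END { printf "hit rate: %d/%d\n", hits, total }'
```

With this stream the repeated reads of `a` and `b` are served from the cache until eviction pushes them out, which is exactly the behaviour a higher hit rate measures.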
Ceph Cache Pool Implementation
Ceph stores data as objects in pools, each backed by OSDs. Performance varies with the underlying storage media, so Ceph introduces cache pools to combine fast SSDs (front‑end) with slower HDDs (back‑end), balancing cost and performance.
Data is classified as hot or cold; hot data is promoted to the SSD‑backed cache pool, while cold data remains in the HDD pool.
Cache Pool Modes
1) Write‑back mode: Writes are acknowledged after reaching the SSD cache pool, then asynchronously flushed to the HDD pool. Reads first hit the SSD cache; if data is not present, it is fetched from the HDD pool and cached. Suitable for frequently updated data that is also frequently read.
2) Read‑forward mode: Writes still land in the SSD cache pool, but reads that miss the cache are forwarded to the HDD pool instead of promoting the object. This suits workloads whose read working set no longer fits in the SSD cache and the cache cannot be expanded. (The separate read‑only mode instead sends writes directly to the backing pool and caches reads only.)
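Both modes are selected with the same `ceph osd tier cache-mode` command. A hedged sketch, using the pool name from the deployment section of this article (note that recent Ceph releases gate the read modes behind an extra confirmation flag):

```shell
# Select the cache mode for the ssd-pool cache tier.
# Write-back: the SSD tier absorbs both reads and writes.
ceph osd tier cache-mode ssd-pool writeback

# Read-forward: keep absorbing writes, but forward read misses to the
# backing pool without promotion. Recent releases require the
# --yes-i-really-mean-it confirmation for the read modes.
ceph osd tier cache-mode ssd-pool readforward --yes-i-really-mean-it
```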
Deploying a Cache Pool
Cache pools have been supported since Ceph 0.80 (Firefly). To use an SSD pool as a cache for an HDD pool, define separate CRUSH rules for the SSD and SATA OSDs, then create one pool on each rule.
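A minimal sketch of that preparation step, assuming a Luminous-or-later cluster with CRUSH device classes (the rule names, pool names, and PG counts here are illustrative, chosen to match the commands that follow; older releases required editing the CRUSH map by hand instead):

```shell
# Create one CRUSH rule per device class, then one pool per rule.
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd crush rule create-replicated sata-rule default host hdd

# Arguments: pool name, pg_num, pgp_num, pool type, CRUSH rule.
ceph osd pool create ssd-pool 64 64 replicated ssd-rule
ceph osd pool create sata-pool 128 128 replicated sata-rule
```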
Add the SSD pool as a cache tier in front of the SATA pool:
<code>ceph osd tier add sata-pool ssd-pool</code>
Set the cache mode to write‑back:
<code>ceph osd tier cache-mode ssd-pool writeback</code>
Redirect client traffic to the SSD pool:
<code>ceph osd tier set-overlay sata-pool ssd-pool</code>
Cache Pool Parameter Configuration
Typical parameters include:
Enable Bloom filter for fast look‑ups:
<code>ceph osd pool set ssd-pool hit_set_type bloom</code>
Set the hit set count and period, and the maximum cache size:
<code>ceph osd pool set ssd-pool hit_set_count 1
ceph osd pool set ssd-pool hit_set_period 3600
ceph osd pool set ssd-pool target_max_bytes 1073741824</code>
Control flushing and eviction thresholds:
<code>ceph osd pool set ssd-pool cache_target_dirty_ratio 0.4
ceph osd pool set ssd-pool cache_target_dirty_high_ratio 0.6
ceph osd pool set ssd-pool cache_target_full_ratio 0.8</code>
Define minimum flush and evict ages:
<code>ceph osd pool set ssd-pool cache_min_flush_age 600
ceph osd pool set ssd-pool cache_min_evict_age 1800</code>
Removing a Cache Pool
For a read‑only cache pool, simply disable it and remove the overlay:
<code>ceph osd tier cache-mode ssd-pool none
ceph osd tier remove-overlay sata-pool</code>
For a read‑write (write‑back) cache pool, first switch to forward mode so pending data flushes, wait for completion, optionally force a flush/evict, then remove the overlay and finally detach the cache tier:
<code>ceph osd tier cache-mode ssd-pool forward
# wait for data to flush
rados -p ssd-pool ls
rados -p ssd-pool cache-flush-evict-all
ceph osd tier remove-overlay sata-pool
ceph osd tier remove sata-pool ssd-pool</code>
Note: Ceph recommends avoiding cache pools for performance‑critical RBD workloads; they are better suited for object‑storage use cases such as RGW.
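The "wait for data to flush" step can be automated with a small polling loop; a sketch assuming the pool names used above and a live cluster:

```shell
# Keep forcing flush/evict passes until the cache pool holds no more
# objects; only then is it safe to remove the overlay and detach the tier.
while [ "$(rados -p ssd-pool ls | wc -l)" -gt 0 ]; do
    rados -p ssd-pool cache-flush-evict-all
    sleep 10
done
echo "ssd-pool drained"
```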
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.