
Performance Tuning of Pika KV Store: How max‑write‑buffer‑size Affects I/O and QPS

This article analyzes a real‑world migration from MongoDB to the open‑source Pika KV store, demonstrating how adjusting the max‑write‑buffer‑size parameter dramatically improves disk I/O characteristics and raises query‑per‑second throughput from 3 K to 40 K.

Aikesheng Open Source Community

The author, a former Oracle and MySQL DBA who now maintains MySQL, MongoDB, and Redis, presents a case study of moving a write-heavy, low-read workload from a 4-shard MongoDB cluster to Pika (version 3.3.6), a disk-based, Redis-protocol-compatible KV store built on RocksDB, to reduce costs.

Initial load tests on a modest 8-core, 8 GB RAM, 200 GB disk instance (Machine A) showed frequent timeout errors, low QPS (~3K), and disk %util pinned near 100%, which appeared to indicate saturation.

In contrast, the same Pika version on a larger 24-core, 48 GB, 1.5 TB instance (Machine B) achieved 40K QPS without errors, which at first pointed to a hardware difference.

FIO benchmarks revealed only ~20 % higher throughput on the larger machine, insufficient to explain the ten‑fold QPS gap. Copying Machine B’s configuration to Machine A restored the 40 K QPS, pinpointing a configuration issue.
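A back-of-the-envelope check makes the mismatch concrete. The QPS figures and the ~20% FIO delta come from the tests above; the arithmetic simply shows that hardware alone cannot account for the gap:

```python
# Machine A vs. Machine B, figures from the load tests above.
qps_a = 3_000        # Machine A under the default configuration
qps_b = 40_000       # Machine B, same Pika version
fio_gain = 1.20      # FIO showed only ~20% more disk throughput on B

hw_explained = qps_a * fio_gain   # what faster disks alone could justify: 3600 QPS
observed_ratio = qps_b / qps_a    # the gap actually observed: ~13x

print(hw_explained, observed_ratio)
```

A ~1.2x disk advantage cannot produce a ~13x QPS difference, which is why attention turned to configuration.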

Diagnosis

After iterative testing, the root cause was identified as an unreasonable max-write-buffer-size setting. The default value (13 MB) limited write performance; increasing it to 4 GB raised QPS from 3 K to 40 K.

iostat screenshots illustrate the difference:

-- max-write-buffer-size 14045392 (13 MB)

-- max-write-buffer-size 4294967296 (4 GB)

The key metrics that changed were w/s (writes per second) and avgrq-sz (average request size, in 512-byte sectors). With the larger buffer, the disk received fewer but larger writes, improving throughput.

The %util metric measures the fraction of time the device has requests in flight, not how much data it moves; a value of 100% therefore does not necessarily mean the disk's bandwidth is saturated.

Explanation using a traffic analogy: many small cars (small avgrq-sz) can fill the road even at high w/s, while fewer large buses (large avgrq-sz) move more passengers at the same or lower w/s.
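The analogy can be put in numbers. In iostat, avgrq-sz is reported in 512-byte sectors, so implied write throughput is roughly w/s × avgrq-sz × 512. The figures below are illustrative, not taken from the article's iostat captures:

```python
SECTOR = 512  # avgrq-sz is reported in 512-byte sectors

def throughput_mb_s(w_per_s, avgrq_sz):
    """Approximate write throughput implied by iostat's w/s and avgrq-sz."""
    return w_per_s * avgrq_sz * SECTOR / 1024**2

# Many small writes vs. fewer large ones (illustrative numbers):
small_io = throughput_mb_s(2000, 16)    # 2000 w/s of 8 KB requests  -> 15.625 MB/s
large_io = throughput_mb_s(400, 1024)   # 400 w/s of 512 KB requests -> 200.0 MB/s

print(f"{small_io} MB/s vs {large_io} MB/s")
```

Five times fewer requests can still move over ten times more data, which is exactly the pattern the larger write buffer produced.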

Official Pika documentation on max-write-buffer-size (excerpt, translated from Chinese):

# Size of a single memtable of a single RocksDB instance underlying Pika; larger values improve write performance but cause heavier I/O load when the buffer is flushed to disk. Configure it sensibly for your use case.
[RocksDB-Tuning-Guide](https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide)
write-buffer-size : 268435456

# Upper limit on the total memtable memory used by all RocksDB instances owned by a Pika instance; if RocksDB's actual usage exceeds this value, the next write triggers a flush.
[Rocksdb-Basic-Tuning](https://github.com/facebook/rocksdb/wiki/Setup-Options-and-Basic-Tuning)
max-write-buffer-size : 10737418240

RocksDB uses a WAL + LSM-tree architecture; larger memtables let more writes be batched in memory before each flush, so the disk sees fewer, larger sequential writes and its bandwidth is used more efficiently.

The original configuration lacked max-write-buffer-size; only write-buffer-size was set. Increasing write-buffer-size alone did not improve performance, confirming the decisive role of max-write-buffer-size.

Final tuned parameters used in production:

thread-num : 8 # matches CPU cores
thread-pool-size : 8
write-buffer-size : 268435456
max-write-buffer-size : 4294967296
compression : snappy
max-background-flushes : 2
max-background-compactions : 2
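The two buffer values are plain power-of-two byte counts; a quick sanity check of the arithmetic:

```python
MB, GB = 1024**2, 1024**3

write_buffer_size = 256 * MB       # 268435456: one 256 MB memtable per RocksDB instance
max_write_buffer_size = 4 * GB     # 4294967296: 4 GB cap across all memtables

assert write_buffer_size == 268435456
assert max_write_buffer_size == 4294967296
```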

Conclusion

The case deepens understanding of iostat metrics: a 100% %util does not automatically mean disk I/O saturation; it may result from many small I/O requests, a situation distinguishable by reading w/s and avgrq-sz together.

When using Pika, explicitly set max-write-buffer-size; it critically impacts performance and does not scale automatically with write-buffer-size.

After migration, the application saved considerable resources compared to the previous MongoDB cluster, illustrating that there is no universally best database, only the most suitable one for a given workload.

Tags: Redis, Performance Tuning, RocksDB, Pika, iostat, max-write-buffer-size
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
