
Performance Evaluation and Optimization of HBase 2.x Write Operations

This article presents a detailed performance test of HBase 2.x write throughput on a five‑node SSD cluster, identifies latency spikes caused by MemStore flush and ConcurrentSkipListMap size() overhead, and demonstrates how fixing the bug and applying in‑memory compaction dramatically reduces P999 and P9999 latency while preserving throughput.

Big Data Technology Architecture

HBase 2.x write performance was evaluated on a five‑node cluster (each node equipped with twelve 800 GB SSDs, 24‑core CPU, 128 GB RAM) using HBase 2.1.2, HDFS 2.6.0, and OpenJDK 1.8.0_202.

The test environment mirrors a production setup where RegionServer and DataNode processes share the same machines, allowing local replica writes. Each RegionServer was configured with 50 GB on‑heap (MemStore) and 50 GB off‑heap (BucketCache) memory.

Before testing, 100 billion rows (100 bytes each) were pre‑loaded using YCSB's BufferedMutator‑based loader, which achieved up to 200 k QPS on a single node.
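A preload like this maps naturally onto a YCSB load phase. The article does not give the workload file used, so every property value below is an assumption chosen to match the description (100‑byte rows, BufferedMutator writes via the HBase binding's clientbuffering option):

```
# Hypothetical YCSB load-phase workload; values assumed, not from the article
workload=com.yahoo.ycsb.workloads.CoreWorkload
recordcount=100000000000   # 100 billion rows
fieldcount=1
fieldlength=100            # ~100-byte values
insertorder=hashed         # spread inserts across regions
# HBase binding option: route writes through BufferedMutator
clientbuffering=true
```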

Normal write performance results

During continuous single‑row Put (autoflush=true) tests, the cluster sustained roughly 100 k QPS total (≈20 k QPS per node) with average latency < 4 ms and P99 latency < 20 ms. However, periodic throughput and latency spikes were observed every ~15 minutes, with occasional P999 latency reaching 150 ms and P9999 spikes exceeding 1 s.

Root‑cause analysis

Log inspection revealed that both throughput valleys and P999 spikes coincided with MemStore flush events. Two main issues were identified:

All Regions on a node could flush simultaneously due to perfectly balanced data distribution, creating a sudden disk write surge.

The MemStore snapshot step holds a write lock while invoking ConcurrentSkipListMap#size(), an O(N) operation that becomes costly for large MemStores (e.g., 256 MB).
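The cost is easy to see with the JDK class itself: ConcurrentSkipListMap keeps no element counter, so size() traverses every node, while isEmpty() only inspects the first one. A minimal, self‑contained illustration (the map here merely stands in for a MemStore's cell set; timings are indicative only):

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListSizeCost {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Long, byte[]> cells = new ConcurrentSkipListMap<>();
        for (long i = 0; i < 2_000_000; i++) {
            cells.put(i, new byte[0]); // stand-in for cells in a MemStore segment
        }

        long t0 = System.nanoTime();
        int n = cells.size();              // O(N): walks the whole list
        long sizeNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        boolean empty = cells.isEmpty();   // O(1): looks only at the first node
        long emptyNs = System.nanoTime() - t0;

        System.out.printf("size()=%d took %d ns; isEmpty()=%b took %d ns%n",
                n, sizeNs, empty, emptyNs);
    }
}
```

On a map this size, the size() call is typically orders of magnitude slower than isEmpty(); under a write lock, that traversal stalls every incoming Put.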

This size() call was a long‑standing bug (HBASE‑21738), present from HBase 0.98 through 2.0, and was the direct cause of the observed latency spikes.

The fix is straightforward: remove the expensive size() call from the snapshot path, or replace it with a mechanism that answers in O(1). It has been merged into recent HBase branches.
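The actual patch lives in the HBase code base, but the shape of the fix can be sketched in plain Java: keep an explicit counter next to the skip list so the snapshot path reads a number in O(1) instead of traversing the map. Everything below, class and method names included, is illustrative rather than HBase's real implementation:

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a MemStore cell set with an O(1) count.
class CountedCellSet {
    private final ConcurrentSkipListMap<String, String> cells = new ConcurrentSkipListMap<>();
    private final AtomicInteger cellCount = new AtomicInteger();

    void add(String rowKey, String value) {
        // put() returns null only for a brand-new key, so the counter
        // tracks distinct cells rather than overwrites.
        if (cells.put(rowKey, value) == null) {
            cellCount.incrementAndGet();
        }
    }

    // O(1): cheap even while holding the flush/snapshot write lock.
    int getCellsCount() {
        return cellCount.get();
    }
}
```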

After applying the fix, performance tests showed P999 latency consistently below 100 ms and P9999 spikes reduced to 200‑500 ms.

Further optimization with In‑Memory Compaction

To address remaining P999 spikes (often caused by G1 GC STW pauses around 100 ms), the community‑introduced In‑Memory Compaction feature was evaluated. This feature partitions a 256 MB MemStore into multiple 2 MB immutable segments, allowing the immutable parts to be stored as ordered arrays instead of a ConcurrentSkipListMap, drastically reducing heap usage and GC pressure.
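The savings come from dropping the skip list's per‑node index objects once a segment can no longer change. A toy sketch of that flattening step, in plain Java rather than HBase's own classes: copy the already‑sorted entries into flat arrays and serve reads with binary search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Toy illustration: flatten a frozen skip-list segment into sorted
// arrays. No per-node index objects remain, which is where the heap
// and GC savings of in-memory compaction come from.
class FlatSegment {
    private final String[] keys;
    private final String[] vals;

    FlatSegment(ConcurrentSkipListMap<String, String> frozen) {
        List<String> k = new ArrayList<>();
        List<String> v = new ArrayList<>();
        for (Map.Entry<String, String> e : frozen.entrySet()) {
            k.add(e.getKey());   // entrySet() iterates in key order
            v.add(e.getValue());
        }
        keys = k.toArray(new String[0]);
        vals = v.toArray(new String[0]);
    }

    String get(String key) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? vals[i] : null;
    }
}
```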

The test used the same hardware but enabled CompactingMemstore. The core RegionServer configuration was:

hbase.hregion.memstore.block.multiplier=5
hbase.hregion.memstore.flush.size=268435456
hbase.regionserver.global.memstore.size=0.4
hbase.regionserver.global.memstore.size.lower.limit=0.625
hbase.hregion.compacting.memstore.type=BASIC
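In file form, these settings go into hbase-site.xml on each RegionServer; the fragment below is simply the same properties rendered as XML:

```xml
<!-- hbase-site.xml: same values as listed above -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>268435456</value> <!-- 256 MB -->
</property>
<property>
  <name>hbase.hregion.compacting.memstore.type</name>
  <value>BASIC</value> <!-- enables CompactingMemstore -->
</property>
```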

Results showed P999 latency under 50 ms and P9999 around 100 ms, with virtually no impact on average latency or throughput. Using off‑heap memory for the memstore pool could tighten the tail further, albeit with a slight increase in average latency.

Summary

HBase 2.1.2 delivers excellent write throughput and low average latency, but occasional flush‑induced spikes can inflate P999 latency. The HBASE‑21738 fix reduces P999 to ~100 ms (most points < 40 ms). Applying In‑Memory Compaction further lowers P999 to < 50 ms and P9999 to ~100 ms without sacrificing throughput, making HBase 2.1.3/2.2.0 strong candidates for high‑performance write workloads.

Tags: Big Data, Performance Tuning, HBase, write performance, Memstore, In-Memory Compaction
Written by

Big Data Technology Architecture

Exploring Open Source Big Data and AI Technologies
