Understanding HBase Compaction: Principles, Process, Throttling Strategies and Real‑World Optimizations
This article explains HBase’s LSM‑Tree compaction fundamentals—minor and major compaction triggers, file‑selection policies, dynamic throughput throttling—and walks through practical tuning cases showing how adjusting size limits, thread pools, and off‑peak settings can dramatically improve read latency and cluster stability.
1. Compaction Overview

HBase stores data using an LSM‑Tree architecture. Writes go to a WAL and an in‑memory MemStore; when thresholds are met, a Flush creates an HFile. Over time many HFiles accumulate, increasing read I/O. Compaction merges small HFiles to reduce file count and improve read latency. Two types exist:
Minor Compaction – merges a subset of adjacent small HFiles.
Major Compaction – merges all HFiles in a Store, removing TTL‑expired, deleted, and over‑versioned cells.
Major Compactions are resource‑intensive; production clusters often disable automatic triggering and run them manually during off‑peak windows.
2. Compaction Triggering

Compaction can be initiated by three mechanisms:
Periodic background thread (CompactionChecker) that evaluates thresholds such as file count and age.
MemStore Flush – after a Flush, HBase checks whether the resulting HFile count exceeds configured limits and may start a Minor or Major Compaction.
Manual invocation via the HBase API, shell commands (e.g., compact, major_compact), or the UI.
Key configuration parameters include hbase.server.thread.wakefrequency, hbase.server.compactchecker.interval.multiplier, and hbase.hregion.majorcompaction.
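As an illustration, these triggers map to hbase-site.xml settings like the following. The values shown are the common defaults, except hbase.hregion.majorcompaction, which is set to 0 here to disable automatic major compaction as recommended above; verify defaults against your HBase version.

```xml
<!-- hbase-site.xml: compaction-trigger knobs (illustrative values) -->
<property>
  <name>hbase.server.thread.wakefrequency</name>
  <value>10000</value> <!-- checker wake interval in ms (default 10 s) -->
</property>
<property>
  <name>hbase.server.compactchecker.interval.multiplier</name>
  <value>1000</value> <!-- each store is checked every wakefrequency x multiplier -->
</property>
<property>
  <name>hbase.hregion.majorcompaction</name>
  <value>0</value> <!-- 0 disables periodic automatic major compaction -->
</property>
```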
3. Compaction Process

The workflow consists of:
The RegionServer runs a CompactionChecker chore (by default waking every 10 s).
When triggered, a dedicated thread selects candidate HFiles based on size, count, and custom policies.
Selected files are read sequentially, their KeyValues are merged, and the result is written to a temporary file.
The temporary file is moved into the Region’s data directory, a WAL entry is created, and the original files are archived.
Each step is designed to be idempotent and fault‑tolerant.
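The merge in step 3 can be condensed into a toy example. The sketch below is a deliberate simplification (keys are plain strings, inputs are already sorted, and delete markers, version pruning, and WAL interaction are ignored); it shows only the k‑way merge a compaction performs while rewriting files:

```java
import java.util.*;

class MergeSketch {
    // Merge several sorted "HFile" key runs into one sorted run, mimicking
    // the sequential read-and-merge step of a compaction. Each heap entry is
    // {file index, position within that file}; the comparator orders entries
    // by the key they currently point at.
    static List<String> merge(List<List<String>> sortedFiles) {
        PriorityQueue<int[]> pq = new PriorityQueue<>(
            Comparator.comparing((int[] e) -> sortedFiles.get(e[0]).get(e[1])));
        for (int f = 0; f < sortedFiles.size(); f++) {
            if (!sortedFiles.get(f).isEmpty()) pq.add(new int[]{f, 0});
        }
        List<String> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            int[] e = pq.poll();
            List<String> file = sortedFiles.get(e[0]);
            out.add(file.get(e[1]));
            // Advance within the same file, if anything remains
            if (e[1] + 1 < file.size()) pq.add(new int[]{e[0], e[1] + 1});
        }
        return out;
    }
}
```

Because every input run is sorted, the heap always yields the globally smallest remaining key, so the output file is written strictly in order, exactly the property the real compaction relies on to produce a valid HFile.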
4. File‑Selection Policies

HBase provides several policies to decide which files to compact:
RatioBasedCompactionPolicy – scans from the oldest file, skipping each file that is larger than ratio × the total size of the newer files that would be compacted with it, and stops once the remaining file count would fall below the configured minimum.
ExploringCompactionPolicy – enumerates all feasible sub‑ranges, preferring the combination with the most files (or smallest total size) that satisfies size and ratio constraints.
StripeCompactionPolicy – groups files into logical stripes (similar to LevelDB levels) to limit the scope of Major Compactions.
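For intuition, here is a standalone, simplified search in the spirit of ExploringCompactionPolicy. This is illustrative only: the class and method names are this sketch’s own, not HBase’s, and the real policy applies additional constraints.

```java
class ExploringSketch {
    // Among all contiguous windows of candidate files whose count lies in
    // [minFiles, maxFiles] and in which no single file is larger than
    // ratio * (sum of the other files in the window), prefer the window
    // with the most files, breaking ties by smallest total size.
    static int[] bestWindow(long[] sizes, int minFiles, int maxFiles, double ratio) {
        int bestStart = -1, bestEnd = -1;
        long bestSum = Long.MAX_VALUE;
        for (int i = 0; i < sizes.length; i++) {
            for (int j = i + minFiles - 1; j < sizes.length && j - i + 1 <= maxFiles; j++) {
                long sum = 0;
                for (int k = i; k <= j; k++) sum += sizes[k];
                boolean valid = true;
                for (int k = i; k <= j; k++) {
                    // A file much larger than its peers makes the window invalid
                    if (sizes[k] > (sum - sizes[k]) * ratio) { valid = false; break; }
                }
                if (!valid) continue;
                int files = j - i + 1;
                int bestFiles = bestEnd - bestStart + 1;
                if (bestStart < 0 || files > bestFiles
                        || (files == bestFiles && sum < bestSum)) {
                    bestStart = i; bestEnd = j; bestSum = sum;
                }
            }
        }
        return new int[]{bestStart, bestEnd}; // {-1, -1} if no valid window
    }
}
```

For example, with file sizes {100, 12, 12, 10} and ratio 1.2, every window containing the 100‑unit file is invalid, so the search settles on the three small files – exactly the "avoid rewriting one huge file to absorb tiny ones" behavior the policy is designed for.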
Representative code snippets:
// Compaction thread initialization (HRegionServer)
this.compactSplitThread = new CompactSplitThread(this);
this.compactionChecker = new CompactionChecker(this, this.threadWakeFrequency, this);
if (this.compactionChecker != null) choreService.scheduleChore(compactionChecker);

// CompactionChecker.chore(): runs on every wake-up; each store is actually
// evaluated only once per `multiplier` iterations
@Override
protected void chore() {
  for (HRegion r : this.instance.onlineRegions.values()) {
    if (r == null) continue;
    for (Store s : r.getStores().values()) {
      try {
        long multiplier = s.getCompactionCheckMultiplier();
        assert multiplier > 0;
        if (iteration % multiplier != 0) continue;
        if (s.needsCompaction()) {
          // too many store files: queue a system (minor) compaction
          this.instance.compactSplitThread.requestSystemCompaction(r, s,
              getName() + " requests compaction");
        } else if (s.isMajorCompaction()) {
          // request major compaction with appropriate priority
        }
      } catch (IOException e) {
        LOG.warn("Failed major compaction check on " + r, e);
      }
    }
  }
  iteration = (iteration == Long.MAX_VALUE) ? 0 : (iteration + 1);
}

/** RatioBasedCompactionPolicy */
ArrayList<StoreFile> applyCompactionPolicy(ArrayList<StoreFile> candidates,
    boolean mayUseOffPeak, boolean mayBeStuck) throws IOException {
  if (candidates.isEmpty()) return candidates;
  int start = 0;
  // off-peak hours permit a more aggressive ratio
  double ratio = mayUseOffPeak ? comConf.getCompactionRatioOffPeak()
                               : comConf.getCompactionRatio();
  int countOfFiles = candidates.size();
  long[] fileSizes = new long[countOfFiles];
  long[] sumSize = new long[countOfFiles];
  // sumSize[i] = total size of the window of up to maxFilesToCompact files starting at i
  for (int i = countOfFiles - 1; i >= 0; --i) {
    StoreFile file = candidates.get(i);
    fileSizes[i] = file.getReader().length();
    int tooFar = i + comConf.getMaxFilesToCompact() - 1;
    sumSize[i] = fileSizes[i]
        + ((i + 1 < countOfFiles) ? sumSize[i + 1] : 0)
        - ((tooFar < countOfFiles) ? fileSizes[tooFar] : 0);
  }
  // skip oldest files that are too large relative to the newer files
  while (countOfFiles - start >= comConf.getMinFilesToCompact()
      && fileSizes[start] > Math.max(comConf.getMinCompactSize(),
             (long) (sumSize[start + 1] * ratio))) {
    ++start;
  }
  if (start < countOfFiles) {
    LOG.info("Default compaction selected " + (countOfFiles - start) + " files");
  }
  candidates.subList(0, start).clear();
  return candidates;
}

5. Throttling (Rate Limiting)

To avoid overwhelming the cluster, HBase dynamically adjusts compaction throughput based on pressure:
Two bounds: hbase.hstore.compaction.throughput.lower.bound (default 10 MB/s) and hbase.hstore.compaction.throughput.higher.bound (default 20 MB/s).
Effective throughput = lower + (higher – lower) × pressureRatio, where pressureRatio ∈ [0,1] is derived from the number of pending HFiles.
If the pending file count exceeds hbase.hstore.blockingStoreFiles, writes are blocked and throttling is disabled, letting compaction run at full speed to clear the backlog.
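Plugging the defaults into the formula: at pressureRatio = 0.5 the effective limit is 10 + (20 − 10) × 0.5 = 15 MB/s. A minimal standalone sketch of the computation (not HBase’s actual controller class):

```java
class ThroughputSketch {
    // Effective compaction throughput as a function of pressure, per the
    // formula above: lower + (higher - lower) * pressure, with throttling
    // switched off entirely once the store hits the blocking file count
    // (pressure > 1).
    static double effectiveThroughput(double lower, double higher, double pressure) {
        if (pressure > 1.0) return Double.MAX_VALUE; // blocking: no throttle
        return lower + (higher - lower) * pressure;
    }
}
```

HBase’s PressureAwareCompactionThroughputController implements the same shape of logic, with an extra off‑peak branch.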
// PressureAwareCompactionThroughputController.tune()
private void tune(double compactionPressure) {
  double maxThroughputToSet;
  if (compactionPressure > 1.0) {
    // store is at/over the blocking file count: stop throttling entirely
    maxThroughputToSet = Double.MAX_VALUE;
  } else if (offPeakHours.isOffPeakHour()) {
    maxThroughputToSet = maxThroughputOffpeak;
  } else {
    // scale linearly between the lower and higher bounds
    maxThroughputToSet = maxThroughputLowerBound
        + (maxThroughputHigherBound - maxThroughputLowerBound) * compactionPressure;
  }
  this.maxThroughput = maxThroughputToSet;
}

6. Real‑World Cases and Tuning

Two production incidents are analyzed:
Even after automatic Major Compaction was disabled, the long‑compaction queue kept growing, causing latency spikes. Investigation revealed that large HFiles (>2.5 GB) were still being selected for Minor Compaction and routed to the long‑compaction thread pool. Reducing hbase.hstore.compaction.max.size to 2 GB excluded those files from Minor Compaction, deferring them to off‑peak Major Compactions and restoring read latency to ~50 ms.
A table with 578 TB of data experienced prolonged Major Compaction times. By increasing the Major Compaction thread pool from 1 to 10 and adjusting off‑peak hours, the compaction throughput rose, shrinking the table to 349 TB (‑40 %) and normalizing read/write latency.
These cases illustrate the importance of balancing compaction aggressiveness, thread‑pool sizing, and size thresholds.
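For reference, the knobs touched in these cases correspond to hbase-site.xml settings like the following. The values mirror the two cases above and are illustrative, not general recommendations; the off‑peak window in particular must match your own traffic pattern.

```xml
<!-- hbase-site.xml: settings changed in the cases above (illustrative) -->
<property>
  <name>hbase.hstore.compaction.max.size</name>
  <value>2147483648</value> <!-- 2 GB: larger files are skipped by minor compaction -->
</property>
<property>
  <name>hbase.regionserver.thread.compaction.large</name>
  <value>10</value> <!-- long-compaction (large) thread pool size -->
</property>
<property>
  <name>hbase.offpeak.start.hour</name>
  <value>0</value> <!-- off-peak window start, 0-23 -->
</property>
<property>
  <name>hbase.offpeak.end.hour</name>
  <value>6</value> <!-- off-peak window end, 0-23 -->
</property>
```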
7. Parameter Reference

The article concludes with a table of key compaction parameters (e.g., hbase.hstore.compaction.ratio, hbase.hstore.compaction.max.size, hbase.regionserver.thread.compaction.throttle) for operators to fine‑tune in production.
Overall, the piece serves as a detailed guide for engineers working with HBase, offering both theoretical background and actionable tuning advice.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.