Key HBase Configuration Parameters and Production Recommendations (HBase 1.1.2)
This article categorizes and explains the most important HBase 1.1.2 configuration parameters and gives practical recommendations for production deployments. It covers Region sizing, BlockCache strategy, Memstore thresholds, Compaction behavior, HLog handling, Call Queue tuning, and a few miscellaneous settings.
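Region: hbase.hregion.max.filesize defaults to 10 GB. For production the article recommends 50 to 80 GB. Regions that are too small split frequently, while regions that are too large make each compaction more expensive, so the value is a trade-off between split frequency and compaction overhead.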
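As a rough sketch of how this recommendation might look in hbase-site.xml, the fragment below sets the split threshold to 50 GB, the low end of the recommended range. The value is given in bytes, and the exact figure is an illustrative choice, not one the article prescribes.

```xml
<configuration>
  <!-- Illustrative value: 50 GB = 50 * 1024^3 bytes -->
  <property>
    <name>hbase.hregion.max.filesize</name>
    <value>53687091200</value>
  </property>
</configuration>
```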
BlockCache: The article advises LRUBlockCache when less than 20 GB of memory is available and BucketCache in off-heap mode for larger caches. hfile.block.cache.size (default 0.4) sets the LRUBlockCache size as a fraction of the JVM heap, while hbase.bucketcache.size sets the off-heap capacity. When BucketCache is enabled, the on-heap LRU cache mainly holds index and Bloom filter blocks, so hfile.block.cache.size can be lowered to 0.05 to 0.1 and the off-heap size chosen according to the machine's physical memory.
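A minimal sketch of an off-heap BucketCache setup, assuming a RegionServer with enough physical memory to spare 16 GB outside the heap. The hbase.bucketcache.ioengine property and the 16 GB figure are assumptions added for illustration and do not come from the article. The LRU fraction uses the key hfile.block.cache.size. In HBase 1.x, a hbase.bucketcache.size value of 1.0 or more is read as megabytes.

```xml
<configuration>
  <!-- Small on-heap LRU cache, upper end of the 0.05 to 0.1 range -->
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.1</value>
  </property>
  <!-- Assumption: store the BucketCache off-heap -->
  <property>
    <name>hbase.bucketcache.ioengine</name>
    <value>offheap</value>
  </property>
  <!-- Illustrative capacity in MB (16 GB); size this from physical memory -->
  <property>
    <name>hbase.bucketcache.size</name>
    <value>16384</value>
  </property>
</configuration>
```

Off-heap mode also needs enough direct memory granted to the RegionServer JVM (typically configured in hbase-env.sh), which this fragment does not show.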
Memstore: hbase.hregion.memstore.flush.size defaults to 128 MB; increase it to 256 MB if flushes are frequent and memory is ample. hbase.hregion.memstore.block.multiplier defaults to 4 and is often set to 5; writes to a region are blocked once its memstore reaches flush.size times this multiplier. hbase.regionserver.global.memstore.size defaults to 0.4 of the heap and can be set to 0.6 to 0.65 in off-heap mode. hbase.regionserver.global.memstore.lowerLimit (named hbase.regionserver.global.memstore.size.lower.limit in 1.x) stays at its default of 0.95. hbase.regionserver.optionalcacheflushinterval defaults to 1 hour; production clusters often use 10 hours to reduce the number of small files generated.
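The Memstore recommendations above could be written roughly as follows. This is a sketch for an off-heap deployment: the 0.6 global fraction is only valid together with a small on-heap block cache, because HBase refuses to start when the block cache and memstore fractions together exceed 0.8 of the heap.

```xml
<configuration>
  <!-- 256 MB = 256 * 1024^2 bytes -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>268435456</value>
  </property>
  <!-- Block writes to a region at 5 * flush.size -->
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>5</value>
  </property>
  <!-- Off-heap mode only; assumes hfile.block.cache.size is 0.2 or less -->
  <property>
    <name>hbase.regionserver.global.memstore.size</name>
    <value>0.6</value>
  </property>
  <!-- 10 hours in milliseconds -->
  <property>
    <name>hbase.regionserver.optionalcacheflushinterval</name>
    <value>36000000</value>
  </property>
</configuration>
```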
Compaction: Key parameters include hbase.hstore.compactionThreshold (default 3; raise to 5 to 10 for high write QPS), hbase.hstore.compaction.max (default 10; usually 2 to 3 times the threshold), the compaction thread settings (hbase.regionserver.thread.compaction.throttle and the large and small compaction thread pool sizes), and hbase.hstore.blockingStoreFiles (default 10; raise to 100 to avoid blocking writes). The major compaction interval hbase.hregion.majorcompaction defaults to one week; for large tables automatic major compaction is often disabled by setting it to 0 and triggered manually instead.
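One way the compaction settings might be combined for a write-heavy cluster is sketched below. The specific numbers are illustrative picks from the ranges given above (threshold 5, max at 3 times the threshold), not values the article mandates.

```xml
<configuration>
  <!-- Start a minor compaction once a store has 5 files -->
  <property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>5</value>
  </property>
  <!-- At most 15 files per minor compaction (3 * threshold) -->
  <property>
    <name>hbase.hstore.compaction.max</name>
    <value>15</value>
  </property>
  <!-- Raise the store file limit that blocks flushes and writes -->
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>100</value>
  </property>
  <!-- 0 disables time-based major compaction; run it manually off-peak -->
  <property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
  </property>
</configuration>
```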
HLog: hbase.regionserver.maxlogs defaults to 32 and is often increased; HBASE-14951 discusses how to size it. hbase.regionserver.hlog.splitlog.writer.threads defaults to 3; production clusters typically raise it to 10 to speed up log splitting during recovery.
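A small sketch of the HLog recommendation. Only the writer thread count is set here; hbase.regionserver.maxlogs is deliberately left out because the article gives no concrete target value for it.

```xml
<configuration>
  <!-- More writer threads for log splitting during recovery -->
  <property>
    <name>hbase.regionserver.hlog.splitlog.writer.threads</name>
    <value>10</value>
  </property>
</configuration>
```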
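Call Queue: hbase.regionserver.handler.count defaults to 30; production values range from 100 to 200. hbase.ipc.server.callqueue.handler.factor sets the number of call queues as a fraction of the handler count, hbase.ipc.server.callqueue.read.ratio splits those queues between reads and writes, and hbase.ipc.server.callqueue.scan.ratio splits the read queues between gets and scans. The article lists a default of 0 for all three (the hbase-default.xml shipped with some 1.x releases sets the handler factor to 0.1, so check your release), and they should be tuned to the read/write distribution of the workload.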
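The fragment below sketches a call queue layout for a mixed workload. The handler count falls in the range given above, but the three ratios are hypothetical values chosen purely for illustration; the article gives no concrete figures for them. The scan ratio key is spelled hbase.ipc.server.callqueue.scan.ratio in the HBase 1.x codebase.

```xml
<configuration>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>150</value>
  </property>
  <!-- Hypothetical: roughly one queue per 10 handlers -->
  <property>
    <name>hbase.ipc.server.callqueue.handler.factor</name>
    <value>0.1</value>
  </property>
  <!-- Hypothetical: half of the queues serve reads -->
  <property>
    <name>hbase.ipc.server.callqueue.read.ratio</name>
    <value>0.5</value>
  </property>
  <!-- Hypothetical: half of the read queues serve scans -->
  <property>
    <name>hbase.ipc.server.callqueue.scan.ratio</name>
    <value>0.5</value>
  </property>
</configuration>
```

Separating scans from gets is intended to keep long-running scans from delaying short point reads; whether it helps depends on the actual request mix.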
Other Important Settings: Enable online schema updates by setting hbase.online.schema.update.enable to true. Turn on quotas (hbase.quota.enabled) and snapshots (hbase.snapshot.enabled) in production. Adjust zookeeper.session.timeout (90 s by default in HBase 1.x; older releases used 180 s) in line with the ZooKeeper server's own session limits, since the server caps the value a client can negotiate. Enable ZooKeeper multi-update with hbase.zookeeper.useMulti, and configure the coprocessor classes needed for security and token handling.
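These switches might be written as follows. The timeout shown is simply the 1.x default expressed in milliseconds, used as a placeholder, and the coprocessor classes are omitted because they depend on the cluster's security setup.

```xml
<configuration>
  <property>
    <name>hbase.online.schema.update.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.quota.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.snapshot.enabled</name>
    <value>true</value>
  </property>
  <!-- Milliseconds; the ZooKeeper server may cap this at its maxSessionTimeout -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>90000</value>
  </property>
  <property>
    <name>hbase.zookeeper.useMulti</name>
    <value>true</value>
  </property>
</configuration>
```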
The article emphasizes that most HBase parameters can stay at their defaults. Only a focused subset needs adjusting for the hardware, network, and workload at hand, and understanding the rationale behind each of those settings helps newcomers tune a cluster with confidence.
Big Data Technology Architecture
Exploring Open Source Big Data and AI Technologies