
Practical Experience with HBase at NetEase: Architecture, Core Use Cases, HBCK & RIT Troubleshooting, and Diagnosis Strategies

This article summarizes NetEase Hangzhou Research Institute expert Fan Xinxin's presentation on HBase, covering its role in the big‑data ecosystem, core production scenarios, RIT and HBCK troubleshooting techniques, and systematic monitoring and log‑analysis methods for diagnosing HBase issues.

Big Data Technology Architecture

This article is based on NetEase Hangzhou Research Institute technical expert Fan Xinxin's talk at the 3rd China HBase Meetup, edited into a comprehensive guide.

The presentation is divided into four parts: the position of HBase in the big‑data field, NetEase's core HBase application scenarios, RIT & HBCK concepts, and HBase problem‑diagnosis approaches.

HBase is a versatile key-value (K-V) database. It supports point lookups by row key as well as row-key range scans, from small ranges to very large ones, which makes it suitable for a wide range of data access patterns.
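The two access patterns named above can be illustrated with a toy model. The sketch below is not the HBase client API; it is a minimal in-memory stand-in that only assumes rows are kept sorted by row key. The class name `SortedKV` and the `user#timestamp` key layout are hypothetical, chosen for illustration.

```python
import bisect


class SortedKV:
    """Toy in-memory stand-in for a row-key-ordered table (not the HBase API)."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        # Insert the key into the sorted index only the first time it is seen.
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, key):
        """Point lookup by exact row key; returns None if the key is absent."""
        return self._rows.get(key)

    def scan(self, start, stop, limit=None):
        """Range scan over [start, stop) in row-key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        keys = self._keys[lo:hi]
        if limit is not None:
            keys = keys[:limit]
        return [(k, self._rows[k]) for k in keys]


table = SortedKV()
# Hypothetical row-key layout: "<user>#<timestamp>"
for user, ts, event in [("u2", "003", "play"), ("u1", "001", "click"), ("u1", "002", "buy")]:
    table.put(f"{user}#{ts}", event)

print(table.get("u1#002"))        # point lookup
print(table.scan("u1#", "u1$"))   # prefix scan: "$" sorts immediately after "#"
```

Because keys are ordered, a small scan and a large scan are the same operation with different bounds, which is why row-key design largely determines which queries are cheap.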

NetEase's big-data architecture includes data sources such as MySQL, log systems, app SDKs, and sensors; ingestion tools like Sqoop, DataStream, Kafka, Spark Streaming, and Flink; and storage layers comprising offline storage (HDFS with file formats such as ORC, Parquet, and CarbonData, alongside Kudu and GP), online storage (HBase and Phoenix), and time-series databases (OpenTSDB, Druid, InfluxDB). HBase plays a crucial role in the online storage tier.

NetEase operates a large HBase cluster (300+ machines, 3 PB data) serving services such as Kaola, Cloud Music, News client, and various cloud and big‑data services. User behavior data is collected, processed with MapReduce and Spark, bulk‑loaded into HBase, and then used for real‑time recommendation services.

Additional use cases include an internal sentinel monitoring system built on HBase (instead of OpenTSDB), e‑commerce order storage, historical messages, push notifications, low‑latency dashboards, product inventory, search history, log archiving, CDN traffic, and security user tracking.

The HBCK tool provides two main functions: consistency and completeness checks, and table repairs. Common commands include ./bin/hbase hbck, ./bin/hbase hbck -details, and ./bin/hbase hbck TableFoo TableBar (which limits the check to the named tables). Repairs can be low-risk (e.g., -fixAssignments, -fixMeta) or high-risk (region-overlap fixes that may require manual HDFS edits).

Low-risk fixes address missing or incorrect region assignments and metadata mismatches, following the principle that HDFS is the source of truth. High-risk fixes deal with chains of overlapping regions and may need manual intervention. It is recommended to run hbck -details first to understand the problem, and to avoid risky -repair or other -fix options in production.
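The "inspect first, repair cautiously" advice can be encoded as a small guard around the command line. The sketch below only builds the argument list and does not run anything. The helper name `build_hbck_command` and the `allow_high_risk` switch are hypothetical; the low-risk set mirrors the two options named above, and treating every other option as high-risk is a deliberately conservative assumption, not a classification taken from the talk.

```python
READ_ONLY = {"-details"}                     # reporting only, changes nothing
LOW_RISK = {"-fixAssignments", "-fixMeta"}   # the low-risk fixes named in the text


def build_hbck_command(options=(), tables=(), allow_high_risk=False,
                       hbase_bin="./bin/hbase"):
    """Build an hbck argument list, refusing unlisted options by default."""
    # Assumption: anything outside the two sets above is treated as high-risk.
    risky = [opt for opt in options if opt not in READ_ONLY | LOW_RISK]
    if risky and not allow_high_risk:
        raise ValueError(f"refusing high-risk hbck options: {risky}")
    return [hbase_bin, "hbck", *options, *tables]


# Step 1: a read-only report, limited to one table.
print(build_hbck_command(["-details"], ["TableFoo"]))

# Step 2: a low-risk repair once the report has been reviewed.
print(build_hbck_command(["-fixAssignments"], ["TableFoo"]))

# A broad repair is rejected unless the caller opts in explicitly.
try:
    build_hbck_command(["-repair"])
except ValueError as err:
    print(err)
```

The resulting list can be passed to `subprocess.run` on a host where the `hbase` script is available. Keeping command construction separate from execution makes the guard easy to review and test.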

Troubleshooting typically starts with monitoring: check key metrics (CPU, I/O, network, GC, compaction, queue lengths) at the environment, machine, RegionServer, and table levels. The next step is log analysis: Master logs cover DDL operations, balancing, and snapshots, while RegionServer logs cover region operations and read/write activity. If the issue persists, look for outside help through search engines, forums, or the community mailing lists.
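The broad-to-narrow order of the metric check can be sketched as a simple threshold pass. Everything concrete below is an assumption made for illustration: the metric names, the sample values, and the limits are hypothetical and do not come from the talk or from HBase's actual metric names.

```python
# Levels are checked from the widest scope to the narrowest.
LEVELS = ["environment", "machine", "regionserver", "table"]


def triage(metrics, thresholds):
    """Return (level, metric, value, limit) for every breached threshold,
    ordered from the broadest level to the narrowest."""
    findings = []
    for level in LEVELS:
        for name, value in sorted(metrics.get(level, {}).items()):
            limit = thresholds.get(level, {}).get(name)
            if limit is not None and value > limit:
                findings.append((level, name, value, limit))
    return findings


# Hypothetical snapshot of collected metrics.
metrics = {
    "machine": {"cpu_pct": 95, "io_wait_pct": 12},
    "regionserver": {"gc_pause_ms": 150, "compaction_queue": 40},
    "table": {"read_latency_ms": 8},
}
# Hypothetical alert limits.
thresholds = {
    "machine": {"cpu_pct": 90, "io_wait_pct": 30},
    "regionserver": {"gc_pause_ms": 500, "compaction_queue": 20},
    "table": {"read_latency_ms": 50},
}

for level, name, value, limit in triage(metrics, thresholds):
    print(f"[{level}] {name}={value} exceeds {limit}")
```

Each finding narrows which logs to read next: a machine-level breach points at the host, while a RegionServer-level breach points at that server's log.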

In summary, effective HBase problem resolution combines proactive monitoring, detailed log inspection, careful use of HBCK commands, and, when necessary, source‑code analysis and community assistance.

Author: Fan Xinxin, technical expert at NetEase Hangzhou Research Institute, focusing on HBase development and operations.
