Accelerating Multi‑Dimensional OLAP Queries in ClickHouse with Grouping Sets, RBM, and Dense Dictionary Encoding
To achieve sub‑second, multi‑dimensional analytics on Bilibili’s hundred‑million‑row datasets, the team built a ClickHouse‑based acceleration layer that combines grouping‑set pre‑aggregation, bitmap (RBM) distinct handling, and a dense dictionary encoding service, dramatically cutting CPU, memory and query latency versus traditional OLAP pipelines.
The article describes the challenges of achieving second‑level response times for multi‑dimensional analysis on Bilibili's data platform when data volumes reach hundreds of millions of rows and queries involve many dimensions.
Traditional OLAP solutions such as Kylin and ETL‑based grouping‑sets pre‑aggregation consume a disproportionate amount of CPU (10.8% of daily CPU) and memory (33.2% of daily memory) despite representing less than 1% of total tasks, leading to high operational costs.
ClickHouse is adopted as the primary OLAP engine for its MPP architecture and columnar storage, although its join performance is weaker than that of Doris or StarRocks. To meet both fast-query and join requirements, the team aggregates large tables into smaller, pre-aggregated tables that support both multi-dimensional analysis and external join scenarios.
Key innovations include:
Integration of the aggregation acceleration into the OneService (OS) platform, which handles configuration, API registration, and SQL parsing.
Two aggregation modes: a simple direct aggregation for non‑distinct metrics, and a distinct‑aggregation mode that converts distinct entity values into ClickHouse bitmap aggregates (groupBitmap / RoaringBitmap).
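The two modes can be sketched in a few lines of Python, using a plain int as a stand-in for a ClickHouse RoaringBitmap state (all names and data here are illustrative, not Bilibili's implementation):

```python
# Minimal sketch of the two aggregation modes. A Python int serves as a
# stand-in for a RoaringBitmap: bit i set <=> entity id i is present.
rows = [
    # (uid, area_id, play_duration) -- toy data
    (1, 10, 30), (2, 10, 45), (1, 10, 15), (3, 20, 60),
]

direct = {}   # non-distinct metric: plain SUM per dimension value
bitmaps = {}  # distinct metric: bitmap of uids per dimension value
for uid, area_id, dur in rows:
    direct[area_id] = direct.get(area_id, 0) + dur
    bitmaps[area_id] = bitmaps.get(area_id, 0) | (1 << uid)

assert direct[10] == 90                  # 30 + 45 + 15
assert bin(bitmaps[10]).count("1") == 2  # distinct uids {1, 2}
```

The key property of the distinct mode is that bitmaps merge losslessly under OR, so pre-aggregated shards can still be combined at query time without double-counting.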
3.1 Model Design
For a playback table with billions of rows, the schema is defined as:
Play_dwb_tbl(
uid bigint comment 'user ID',
deviceID string comment 'device ID',
area_id bigint comment 'zone ID',
area_name string comment 'zone name',
uid_city string comment 'user-reported city',
uid_gender string comment 'user-reported gender',
play_duration bigint comment 'playback duration',
dt string comment 'stat date'
)

Two aggregation tables are generated:
Play_dwb_tbl_aggr_directly(
area_id bigint comment 'zone ID',
area_name string comment 'zone name',
uid_city string comment 'user-reported city',
uid_gender string comment 'user-reported gender',
play_duration bigint comment 'playback duration',
dt string comment 'stat date'
)

Distinct-aggregation tables store bitmap representations of distinct entities (e.g., unique users or devices) and are partitioned by dimension combinations to avoid combinatorial explosion.
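To see why naive pre-aggregation explodes, a short Python sketch (dimension names taken from the playback schema; the materialized list is purely illustrative) counts the grouping sets over n dimensions:

```python
from itertools import combinations

dims = ["area_id", "area_name", "uid_city", "uid_gender"]

# Every possible grouping set over n dimensions: 2^n combinations.
all_sets = [c for r in range(len(dims) + 1) for c in combinations(dims, r)]
assert len(all_sets) == 2 ** len(dims)  # 16 for 4 dims; 1024 for 10

# Partitioning the table by dim_name means only the combinations that
# queries actually need are materialized, e.g. (hypothetical choice):
materialized = [
    ("uid_city",),
    ("area_id",),
    ("area_id", "area_name", "uid_city", "uid_gender"),
]
assert len(materialized) < len(all_sets)
```

With 10+ dimensions the full power set is untenable, which is why the dimension combination itself becomes a partition key rather than a schema-level column set.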
3.2 Schema Design
Columns dim_name and dim_value are stored as LowCardinality(String) to compress repeated strings. A projection projection_logdate pre‑computes daily distinct counts:
PROJECTION projection_logdate (
SELECT dt, ind_name, groupBitmapOrState(ind_rbm), count(1)
GROUP BY dt, ind_name
)

Bitmap columns use RoaringBitmap; however, when dictionary values are generated by Snowflake IDs, the bitmap becomes sparse, leading to many Array containers and poor performance.
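The sparsity problem follows from Roaring's container layout: 32-bit values are bucketed by their high 16 bits, and a bucket holding only a few values is stored as a sorted Array container. A small sketch (illustrative numbers, not a benchmark) shows how widely spaced IDs scatter across far more containers than dense dictionary-encoded IDs:

```python
# Roaring bitmaps bucket 32-bit values by their high 16 bits; each
# bucket becomes one container. Sparse ids => many tiny Array containers.
def container_keys(ids):
    return {i >> 16 for i in ids}

n = 100_000
dense_ids = range(n)                   # dense dictionary: 0 .. n-1
sparse_ids = range(0, n * 4096, 4096)  # Snowflake-like: large gaps

assert len(container_keys(dense_ids)) == 2      # ceil(n / 65536)
assert len(container_keys(sparse_ids)) == 6250  # ~16 ids per container
```

Dense encoding packs the same 100 000 members into two full containers, while the sparse layout pays per-container overhead thousands of times over on every AND/OR.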
3.3 Data Synchronization
The synchronization pipeline consists of three steps:
Produce a temporary unified model (Temp_1) that contains dt, ind_name, ind_value, and a map of dimension name/value pairs.
Obtain a dense dictionary mapping for ind_value via the Dict Service, falling back to a global auto‑increment if the key is missing.
Write the mapped values into ClickHouse shard tables, performing RBM aggregation per (indicator, dimension, shard) tuple.
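Step 2 of the pipeline can be sketched as follows; the class and method names are hypothetical stand-ins for the real Dict Service, but the contract is the one described above: return the existing dense id for a known key, otherwise assign the next value of a global auto-increment:

```python
# Sketch of the encode-with-fallback contract (illustrative API).
class DictService:
    def __init__(self):
        self.forward = {}  # raw ind_value -> dense id
        self.next_id = 0   # global auto-increment fallback

    def encode(self, value):
        # Known key: stable id. Unknown key: allocate the next dense id.
        if value not in self.forward:
            self.forward[value] = self.next_id
            self.next_id += 1
        return self.forward[value]

svc = DictService()
ids = [svc.encode(v) for v in ["dev_a", "dev_b", "dev_a", "dev_c"]]
assert ids == [0, 1, 0, 2]  # dense, and repeat lookups are stable
```

Stability matters: the same entity must map to the same bit position across days and shards, or bitmap intersections between partitions would be meaningless.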
3.4 Query Execution
Example queries illustrate how to compute distinct user counts and device counts using bitmap functions:
-- Query distinct user count
SELECT sum(`播放用户数`)
FROM cluster('XXX', view(
SELECT bitmapAndCardinality(a.ind_rbm, b.ind_rbm) AS `播放用户数`
FROM (
SELECT ind_rbm FROM Play_ck_tbl_local
WHERE dt='XX' AND ind_name='播放用户数' AND dim_name='uid_city' AND dim_value[1]='上海'
) a
JOIN (
SELECT ind_rbm FROM Play_ck_tbl_local
WHERE dt='XX' AND ind_name='播放用户数' AND dim_name='area_id' AND dim_value[1]='1'
) b ON 1=1
) tmp)
-- Query distinct device count
SELECT bitmapCardinality(groupBitmapOrState(ind_rbm)) `播放设备数`
FROM Play_ck_tbl
WHERE dt='XX' AND ind_name='播放设备数' AND dim_name='area_id,area_name,uid_city,uid_gender'
AND dim_value[1]='1' AND dim_value[3]='上海'

4. Dictionary Service Evolution
The original Snowflake‑based dictionary generates sparse IDs, which degrade RoaringBitmap performance. To address this, a dense dictionary encoding service is built on top of ClickHouse, consisting of:
BidirectionalDictionary module inside ClickHouse.
Dict Service (stateless, distributed, backed by RocksDB KV store).
Dict Client for forward/reverse lookups with caching.
Dict UDF for Spark integration.
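The Dict Client's contract can be sketched in Python; the names below are illustrative, not the real API, and a plain dict stands in for the RocksDB-backed service:

```python
# Sketch of the Dict Client: forward (value -> dense id) and reverse
# (id -> value) lookups with a local cache to avoid repeated round trips.
class DictClient:
    def __init__(self, service):
        self.service = service  # stand-in for the RocksDB-backed service
        self.fwd_cache, self.rev_cache = {}, {}

    def lookup(self, value):
        # Forward: encode raw values when writing bitmap rows.
        if value not in self.fwd_cache:
            self.fwd_cache[value] = self.service[value]
        return self.fwd_cache[value]

    def reverse(self, dense_id):
        # Reverse: decode ids back to values when rendering results.
        if dense_id not in self.rev_cache:
            self.rev_cache[dense_id] = next(
                v for v, i in self.service.items() if i == dense_id)
        return self.rev_cache[dense_id]

service = {"device_a": 0, "device_b": 1}
client = DictClient(service)
assert client.lookup("device_b") == 1
assert client.reverse(0) == "device_a"
```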
The service stores dictionary values in big‑endian format to align with RocksDB’s LSM ordering, improving range scans during bitmap calculations.
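The effect of big-endian packing can be checked directly: RocksDB compares keys byte-by-byte, and only big-endian encoding makes that byte order coincide with numeric order, so a range scan over an id interval visits keys in sequence. A minimal sketch:

```python
import struct

# RocksDB orders keys lexicographically by raw bytes. Big-endian packing
# makes byte order match numeric order; little-endian does not.
ids = [1, 255, 256, 65536]
big = [struct.pack(">Q", i) for i in ids]
little = [struct.pack("<Q", i) for i in ids]

assert sorted(big) == big        # byte order == numeric order
assert sorted(little) != little  # e.g. 65536 sorts before 1
```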
4.4 Performance Evaluation
Benchmarks on real Bilibili user‑behavior data (≈10 M distinct device IDs) show that the dense dictionary reduces bitmap memory usage and query latency dramatically compared with the Snowflake‑based approach.
5. Engineering Integration
Before acceleration, logical models define metrics (distinct vs. non‑distinct) and dimensions. Primary‑key dimensions are required for correct RBM aggregation. The acceleration pipeline materializes two physical models: a standard aggregation model for non‑distinct metrics and an RBM model for distinct metrics. Queries combine both models using the “star‑schema” approach.
6. Future Outlook
Planned improvements include automated dimension‑combination optimization to further reduce RBM cardinality and a real‑time pipeline that generates bitmap data directly from ClickHouse dictionary services.
References
Roaring Bitmaps: Implementation of an Optimized Software Library – https://arxiv.org/pdf/1709.07821
WeOLAP 亚秒级实时数仓 —— BitBooster 10倍查询优化实践 (WeOLAP sub-second real-time data warehouse: BitBooster 10x query optimization in practice) – https://mp.weixin.qq.com/s/tJQoNRZ5UDJ_IASZLlhB4Q
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.