
ClickHouse Projection: Design, Implementation, and Production Performance

This article presents an in‑depth overview of ClickHouse Projection, covering its background, definition, practical use cases, underlying architecture, query analysis, consistency guarantees, performance comparisons, and real‑world production results, highlighting how it enhances OLAP workloads while maintaining strong data consistency.

Kuaishou Tech

This article, based on Zheng Tianqi's talk at the Kuaishou Big Data Architecture Exchange, introduces Projection, a recent contribution to ClickHouse, and details its motivation, design, and impact on large‑scale OLAP workloads.

ClickHouse Background: ClickHouse originated in a Russian web‑analytics system. It offers columnar storage, vectorized execution, and a loosely coupled P2P distributed architecture, which have made it a popular open‑source OLAP engine.

Projection Concept: Inspired by Vertica, a Projection is a set of columns stored in a specific order or in pre‑aggregated form, defined via ALTER statements. It can be normal (the selected columns re‑sorted by a different key) or aggregate (pre‑aggregated data), and it supports lazy materialization.
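The two kinds of Projection can be illustrated with a small in-memory model. This is only a sketch of the idea, not ClickHouse code: the rows, the column names other than device_id, and the helper functions `normal_projection` and `aggregate_projection` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rows of a base table, stored in ts order.
ROWS = [
    {"ts": 1, "device_id": "c", "domain": "a.com", "bytes": 10},
    {"ts": 2, "device_id": "a", "domain": "b.com", "bytes": 20},
    {"ts": 3, "device_id": "b", "domain": "a.com", "bytes": 30},
    {"ts": 4, "device_id": "a", "domain": "a.com", "bytes": 40},
]


def normal_projection(rows, order_by):
    """Same rows, re-sorted by a different key (a 'normal' Projection)."""
    return sorted(rows, key=lambda r: tuple(r[c] for c in order_by))


def aggregate_projection(rows, group_by, sum_col):
    """One pre-aggregated value per group (an 'aggregate' Projection)."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[c] for c in group_by)] += r[sum_col]
    return dict(totals)


by_device = normal_projection(ROWS, ["device_id"])
by_domain = aggregate_projection(ROWS, ["domain"], "bytes")

print([r["device_id"] for r in by_device])
print(by_domain)
```

The normal form keeps every row and only changes the sort order, which helps filters on the new key. The aggregate form keeps one entry per group, which is why it tends to be much smaller than the base data.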

Use Cases: Examples include accelerating queries on a video_log table by reordering data on device_id, defining aggregate Projections for hourly domain‑level metrics, and simplifying ETL pipelines by eliminating separate materialized view tables.
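The statements behind these use cases might look roughly like the ones built below. This is a hedged sketch: the table name video_log and the column device_id come from the text, while the Projection names and the columns event_time, domain, and bytes are assumptions made for illustration. The Python helpers only assemble SQL strings; they do not connect to a server.

```python
def add_projection(table, name, select_body):
    """Build an ALTER TABLE ... ADD PROJECTION statement as a string."""
    return f"ALTER TABLE {table} ADD PROJECTION {name} ({select_body})"


def materialize_projection(table, name):
    """Build the statement that backfills a Projection for existing parts."""
    return f"ALTER TABLE {table} MATERIALIZE PROJECTION {name}"


# Normal Projection: the same columns, re-sorted by device_id.
normal_ddl = add_projection(
    "video_log", "p_by_device", "SELECT * ORDER BY device_id"
)

# Aggregate Projection: hourly, per-domain metrics.
# event_time, domain and bytes are assumed column names.
agg_ddl = add_projection(
    "video_log",
    "p_hourly_domain",
    "SELECT toStartOfHour(event_time), domain, count(), sum(bytes) "
    "GROUP BY toStartOfHour(event_time), domain",
)

statements = [
    normal_ddl,
    materialize_projection("video_log", "p_by_device"),
    agg_ddl,
    materialize_projection("video_log", "p_hourly_domain"),
]

for stmt in statements:
    print(stmt + ";")
```

Because the Projection lives in the same table, queries keep targeting video_log. No separate materialized view table has to be created, populated, or referenced.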

Implementation Details: A Projection consists of three components: definition, storage, and query analysis. It is stored as a sub‑Part alongside the original Part, which preserves partition pruning and keeps the two consistent during merges and mutations. Query analysis automatically selects the optimal Projection, so queries do not need to be rewritten.
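A toy model of the query analysis step is sketched below. It assumes a simple rule: among the layouts that can answer the query, pick the one expected to read the fewest rows. The `Layout` and `Query` classes, the row counts, and the skip factor are all hypothetical, and the cost model ClickHouse actually uses is not described in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layout:
    name: str
    columns: frozenset             # columns this layout can serve
    sort_key: str                  # leading sort column
    rows: int                      # rows stored in this layout
    group_keys: frozenset = frozenset()   # non-empty only for aggregate layouts


@dataclass(frozen=True)
class Query:
    columns: frozenset
    aggregated: bool = False
    group_keys: frozenset = frozenset()
    filter_col: Optional[str] = None


def can_answer(layout, q):
    if not q.columns <= layout.columns:
        return False
    if layout.group_keys:
        # Pre-aggregated data only serves aggregations over a subset of its keys.
        return q.aggregated and q.group_keys <= layout.group_keys
    return True


def estimated_rows(layout, q):
    # Assumption: a filter on the leading sort column skips most of the data.
    if q.filter_col is not None and q.filter_col == layout.sort_key:
        return max(1, layout.rows // 1000)
    return layout.rows


def choose(layouts, q):
    usable = [l for l in layouts if can_answer(l, q)]
    return min(usable, key=lambda l: estimated_rows(l, q))


BASE_COLS = frozenset({"ts", "hour", "device_id", "domain", "bytes"})
LAYOUTS = [
    Layout("video_log", BASE_COLS, "ts", 1_000_000),
    Layout("p_by_device", BASE_COLS, "device_id", 1_000_000),
    Layout("p_hourly_domain", frozenset({"hour", "domain", "bytes"}), "hour",
           5_000, group_keys=frozenset({"hour", "domain"})),
]

lookup = Query(frozenset({"device_id", "ts", "bytes"}), filter_col="device_id")
dashboard = Query(frozenset({"hour", "domain", "bytes"}), aggregated=True,
                  group_keys=frozenset({"hour", "domain"}))

print(choose(LAYOUTS, lookup).name)
print(choose(LAYOUTS, dashboard).name)
```

The base table is always a valid candidate, so a query that no Projection can serve simply falls back to it. That fallback is what makes the selection transparent to whoever wrote the query.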

Consistency Guarantees: An INSERT propagates data to every Projection defined on the table, a SELECT can read from both original and Projection parts because merges are applied atomically, and a MUTATION triggers re‑materialization of the affected Projections.
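The guarantees above can be pictured with a minimal model in which each part carries its Projection as a sub-part. This is a sketch under simplifying assumptions, not the storage engine's code: the `Part` class, the (domain, bytes) row shape, and all function names are hypothetical.

```python
def aggregate(rows):
    """Pre-aggregate bytes per domain; stands in for an aggregate Projection."""
    totals = {}
    for domain, nbytes in rows:
        totals[domain] = totals.get(domain, 0) + nbytes
    return totals


def merge_projections(p1, p2):
    """Combine two pre-aggregated sub-parts without touching base rows."""
    out = dict(p1)
    for domain, nbytes in p2.items():
        out[domain] = out.get(domain, 0) + nbytes
    return out


class Part:
    """A data part that carries its Projection as a sub-part."""

    def __init__(self, rows, projection=None):
        self.rows = tuple(rows)
        # Base rows and Projection are created in one step, so they never diverge.
        self.projection = aggregate(self.rows) if projection is None else projection


def insert(parts, rows):
    """INSERT: the new part is written together with its Projection."""
    return parts + [Part(rows)]


def merge(a, b):
    """MERGE: base rows and Projection sub-parts are merged into one new part."""
    return Part(a.rows + b.rows, merge_projections(a.projection, b.projection))


def mutate(part, fn):
    """MUTATION: rewrite the base rows, then rebuild the Projection from them."""
    return Part([fn(row) for row in part.rows])


def total_from_projection(parts, domain):
    return sum(p.projection.get(domain, 0) for p in parts)


def total_from_base(parts, domain):
    return sum(n for p in parts for d, n in p.rows if d == domain)


parts = insert([], [("a.com", 10), ("b.com", 5)])
parts = insert(parts, [("a.com", 7)])
merged = merge(parts[0], parts[1])
doubled = mutate(merged, lambda row: (row[0], row[1] * 2))

print(total_from_projection(parts, "a.com"), total_from_base(parts, "a.com"))
print(total_from_projection([doubled], "a.com"))
```

In this model a reader sees either the old parts or the merged part, never a mixture in which base rows and Projection disagree. That is the property the paragraph above describes.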

Feature Comparison and Production Impact: Projections improve query latency (up to 150× faster in some cases), add only modest storage overhead (≈40% for normal Projections, negligible for aggregate ones), and enable high‑concurrency dashboard rendering. Limitations include part‑level granularity, inability to span parts, and lack of join support.

Conclusion: Projections act as a production‑ready materialized view mechanism for ClickHouse, offering strong consistency, automatic optimization, and significant performance gains for OLAP workloads.
