Databases 16 min read

Practical Use Cases of Materialized Views and Indexes in Doris

This article shares practical experiences with Doris, covering materialized view concepts, typical use cases, index principles, performance optimizations, and real‑world scenarios such as order analysis, PV/UV aggregation, and detailed queries, while also providing operational tips and Q&A insights.

DataFunSummit
DataFunSummit
DataFunSummit
Practical Use Cases of Materialized Views and Indexes in Doris

The article introduces materialized views in Doris, explaining that they are special physical tables storing pre‑computed results, replacing the older rollup concept. Two key ideas are pre‑computation of aggregates and the fact that a materialized view occupies actual storage.

Typical scenarios for using materialized views include: (1) supporting both detailed and fixed‑dimension queries, (2) queries that involve only a small subset of rows or columns, (3) expensive aggregations such as SUM or DISTINCT, and (4) combining with prefix indexes to accelerate queries.

Advantages of materialized views are significant query speed improvements and strong data consistency, because Doris writes to the base table and the view within the same transaction.

Creation syntax follows the standard CREATE MATERIALIZED VIEW view_name AS SELECT ... pattern. Example snippets:

CREATE MATERIALIZED VIEW k1_k2_sumk3 AS SELECT k1, k2, sum(k3) FROM duplicate_table GROUP BY k1, k2;
CREATE MATERIALIZED VIEW deduplicate AS SELECT k1, k2, k3, k4 FROM duplicate_table GROUP BY k1, k2, k3, k4;
CREATE MATERIALIZED VIEW mv_1 AS SELECT k3, k2, k1 FROM tableA ORDER BY k3;

Doris stores data in an SSTable‑like structure and provides a built‑in prefix index that stores the first 36 bytes of the sorted key columns. For VARCHAR columns, the index truncates to the configured length. This index enables fast lookups when queries filter on the leading columns of the primary key.

Query examples demonstrating index usage:

SELECT * FROM table WHERE user_id=1829239 AND age=20;
SELECT * FROM table WHERE age=20;

Real‑world application cases are presented:

Order analysis – a materialized view aggregating sales amount by warehouse reduces scan volume from billions of rows to a few thousand, achieving millisecond‑level response.

PV/UV statistics – a view that groups by advertiser and channel and uses bitmap_union_count(to_bitmap(user_id)) provides fast unique‑user counts.

Detail queries – reordering columns in a view allows a query that filters only on the third key column to hit the prefix index.

Operational tips include controlling write task frequency to avoid excessive compaction, limiting the number of materialized views in high‑throughput scenarios, adjusting tablet write speeds, and choosing performant disks (e.g., SSDs). The article also discusses latency‑sensitive workloads, high QPS requirements, and considerations for Doris’s rollup (materialized view) and colocation join features.

The Q&A section addresses concerns such as view explosion, Doris’s suitability for batch‑stream integration, performance differences between open‑source and commercial editions, and best practices for local joins.

Finally, the article thanks the audience and invites readers to join the DataFunTalk community for further big‑data and AI discussions.

Performance Optimizationbig dataOLAPIndexmaterialized viewDoris
DataFunSummit
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.