Introducing Yunqi Lakehouse: An Integrated Cloud‑Native Data Platform with Incremental Computing and Auto Materialized Views

This article introduces Yunqi's self‑developed Lakehouse product, explaining its cloud‑native, one‑stop data platform architecture, incremental computing that balances freshness, performance and cost, and the autoMV feature that automatically creates materialized views to boost query speed up to nine times.

DataFunSummit

Jan 9, 2024

The article presents Yunqi Technology's self‑built Lakehouse product, a cloud‑native enterprise‑grade data platform that unifies offline and real‑time processing through a single integrated architecture.

Key product features include multi‑cloud independence, compute‑storage separation with elastic scaling, a comprehensive data management suite (integration, development, asset management, monitoring), and a unified engine that supports both batch and streaming workloads.

Leveraging incremental computing, the Lakehouse can seamlessly shift from T+1 (daily) to T+0 (near‑real‑time) processing by adjusting materialized view refresh intervals from days down to minutes, balancing data freshness, query performance, and cost.
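The freshness side of this trade-off can be made concrete with simple arithmetic. The sketch below is illustrative only: `RefreshPolicy` and its methods are hypothetical names, not part of Yunqi Lakehouse, and it assumes refreshes run on a fixed schedule.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class RefreshPolicy:
    """Hypothetical refresh setting for one materialized view."""
    interval_minutes: int

    def refreshes_per_day(self) -> float:
        # A shorter interval means more refresh jobs per day.
        return MINUTES_PER_DAY / self.interval_minutes

    def worst_case_staleness_minutes(self) -> int:
        # A row written just after a refresh waits one full interval.
        return self.interval_minutes


daily = RefreshPolicy(interval_minutes=MINUTES_PER_DAY)  # T+1 style
near_real_time = RefreshPolicy(interval_minutes=5)       # T+0 style

for name, policy in [("daily", daily), ("5-minute", near_real_time)]:
    print(name, policy.refreshes_per_day(), policy.worst_case_staleness_minutes())
```

Moving from the daily policy to the 5-minute one multiplies the number of refreshes, which is why it matters that each refresh processes only the data changed since the previous one rather than the full table.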

The platform replaces the traditional Lambda architecture, an assembled design that requires separate offline and real‑time pipelines, with a single integrated pipeline that eliminates data duplication, reduces operational complexity, and lowers storage costs.

In a real e‑commerce scenario, the integrated engine simplifies the data processing chain: raw tables are ingested, cleaned, and transformed into ADS tables for downstream applications without maintaining separate batch and streaming paths.

Incremental computation is achieved via materialized views: developers write only full‑load SQL; the system automatically generates incremental updates, handling updates and deletes efficiently.
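To illustrate the idea, here is a minimal sketch of incremental view maintenance for a grouped SUM/COUNT query. It is not Yunqi's implementation: the function names, the row layout, and the choice to model an update as a delete of the old row plus an insert of the new one are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, float]      # (category, amount)
Change = Tuple[str, Row]     # ("insert" or "delete", row)
View = Dict[str, Tuple[float, int]]


def full_refresh(rows: Iterable[Row]) -> View:
    """Recompute SUM(amount), COUNT(*) grouped by category from scratch."""
    totals: Dict[str, List[float]] = defaultdict(lambda: [0.0, 0])
    for category, amount in rows:
        totals[category][0] += amount
        totals[category][1] += 1
    return {k: (v[0], int(v[1])) for k, v in totals.items()}


def incremental_refresh(view: View, changes: Iterable[Change]) -> View:
    """Apply only the changed rows to an existing view state."""
    state = {k: [s, c] for k, (s, c) in view.items()}
    for op, (category, amount) in changes:
        sign = 1 if op == "insert" else -1
        entry = state.setdefault(category, [0.0, 0])
        entry[0] += sign * amount
        entry[1] += sign
    # A group whose row count drops to zero disappears from the view.
    return {k: (s, c) for k, (s, c) in state.items() if c > 0}


base = [("books", 10.0), ("books", 5.0), ("toys", 7.0)]
view = full_refresh(base)

changes = [
    ("insert", ("toys", 3.0)),
    ("delete", ("books", 5.0)),
    ("delete", ("books", 10.0)),   # update 10.0 -> 12.0, old row out
    ("insert", ("books", 12.0)),   # update 10.0 -> 12.0, new row in
]
updated = incremental_refresh(view, changes)

new_base = [("books", 12.0), ("toys", 7.0), ("toys", 3.0)]
assert updated == full_refresh(new_base)
```

The final assertion is the property that matters: applying only the deltas yields the same result as rerunning the full query, while touching four changed rows instead of the whole table.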

The autoMV (automatic materialized view) feature uses AI to detect repetitive query patterns, automatically creates optimal materialized views, and rewrites queries to use them, achieving up to 9× performance improvement. For example, on the SSB (Star Schema Benchmark), runtime fell from ~16 s to ~2 s, QPS rose from 0.77 to 6.25, and CPU usage was reduced tenfold.

AutoMV selects only high‑value views, discarding those with low benefit, and operates transparently to users, with built‑in AI models handling view evaluation, creation, and query rewriting.
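The selection step can be sketched with a simple frequency-and-benefit heuristic. This stands in for the AI models mentioned above and is not how autoMV is actually implemented; `fingerprint`, `select_candidates`, the cost figures, and the threshold are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def fingerprint(sql: str) -> str:
    """Normalize a query so that runs differing only in literals match."""
    text = sql.strip().lower()
    text = re.sub(r"'[^']*'", "?", text)           # string literals
    text = re.sub(r"\b\d+(\.\d+)?\b", "?", text)   # numeric literals
    return re.sub(r"\s+", " ", text)               # collapse whitespace


def select_candidates(
    query_log: List[str],
    cost_seconds: Dict[str, Tuple[float, float]],
    min_benefit_seconds: float,
) -> List[str]:
    """Keep patterns whose estimated total saving clears a threshold.

    cost_seconds maps a fingerprint to (cost without view, cost with view).
    """
    counts = Counter(fingerprint(q) for q in query_log)
    chosen = []
    for pattern, runs in counts.items():
        before, after = cost_seconds[pattern]
        if runs * (before - after) >= min_benefit_seconds:
            chosen.append(pattern)
    return sorted(chosen)


log = [
    "SELECT SUM(revenue) FROM lineorder WHERE year = 1993",
    "SELECT SUM(revenue) FROM lineorder WHERE year = 1994",
    "select sum(revenue)  from lineorder where year = 1995",
    "SELECT COUNT(*) FROM customer WHERE region = 'ASIA'",
]
hot = "select sum(revenue) from lineorder where year = ?"
cold = "select count(*) from customer where region = ?"
costs = {hot: (12.0, 1.5), cold: (1.0, 0.8)}   # made-up estimates

selected = select_candidates(log, costs, min_benefit_seconds=10.0)
print(selected)
```

Here the revenue query recurs three times with a large estimated saving and is kept, while the one-off customer count is discarded as low benefit, mirroring the "high-value views only" behavior described above.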

Q&A highlights include: minute‑level refresh intervals, a self‑developed query engine, support for update/delete in materialized views, unlimited scalability on object storage, and automatic lifecycle management of materialized views.

Overall, Yunqi Lakehouse demonstrates how an integrated, cloud‑native architecture combined with incremental computing and AI‑driven autoMV can simplify data pipelines, improve developer productivity, and deliver faster, more cost‑effective analytics.
