Introducing Yunqi Lakehouse: An Integrated Cloud‑Native Data Platform with Incremental Computing and Auto Materialized Views
This article introduces Yunqi's self-developed Lakehouse product. It covers three things: the cloud-native, one-stop data platform architecture; incremental computing that balances freshness, performance, and cost; and the autoMV feature, which automatically creates materialized views and speeds up queries by as much as nine times.
Yunqi Technology positions the Lakehouse as a cloud-native, enterprise-grade data platform that unifies offline and real-time processing in a single integrated architecture.
Key product features include multi‑cloud independence, compute‑storage separation with elastic scaling, a comprehensive data management suite (integration, development, asset management, monitoring), and a unified engine that supports both batch and streaming workloads.
Leveraging incremental computing, the Lakehouse can seamlessly shift from T+1 (daily) to T+0 (near‑real‑time) processing by adjusting materialized view refresh intervals from days down to minutes, balancing data freshness, query performance, and cost.
The platform replaces the traditional Lambda architecture, an assembled stack that requires separate offline and real-time pipelines, with a single integrated pipeline that eliminates data duplication, reduces operational complexity, and lowers storage costs.
In a real e-commerce scenario, the integrated engine simplifies the data processing chain: raw tables are ingested, cleaned, and transformed into application-layer (ADS) tables for downstream applications, without maintaining separate batch and streaming paths.
Incremental computation is achieved via materialized views: developers write only the full-computation SQL that defines the view, and the system automatically derives the incremental refresh logic, efficiently handling upstream updates and deletes as well as inserts.
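To make the idea concrete, here is a minimal Python sketch of incremental maintenance for one grouped SUM. It is not Yunqi's implementation: the class and field names are hypothetical, and it assumes the engine can see row-level changes as signed deltas, with an update expressed as a delete of the old row plus an insert of the new one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One row-level change (hypothetical format): sign is +1 for insert, -1 for delete."""
    key: str      # grouping column, e.g. a shop id
    amount: int   # value being summed, in integer cents to avoid float drift
    sign: int


class SumByKeyView:
    """Illustrative stand-in for: SELECT key, SUM(amount) FROM orders GROUP BY key."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)  # live rows per group, so empty groups can be dropped

    def apply(self, changes):
        """Fold a batch of changes into the stored result without rescanning the base table."""
        for c in changes:
            self.totals[c.key] += c.sign * c.amount
            self.counts[c.key] += c.sign
            if self.counts[c.key] == 0:
                del self.totals[c.key]
                del self.counts[c.key]

    def rows(self):
        return dict(self.totals)


def full_recompute(base_rows):
    """The 'full-load' definition: aggregate every base row from scratch."""
    totals = defaultdict(int)
    for key, amount in base_rows:
        totals[key] += amount
    return dict(totals)


view = SumByKeyView()
view.apply([Change("a", 1000, +1), Change("a", 500, +1), Change("b", 700, +1)])
view.apply([Change("a", 500, -1), Change("a", 800, +1)])  # update: 500 becomes 800
view.apply([Change("b", 700, -1)])                        # delete b's only row
print(view.rows())  # {'a': 1800}
```

The incremental result matches what `full_recompute` returns on the final base rows, which is the property that lets a developer write only the full definition and leave the delta logic to the system.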
The autoMV (automatic materialized view) feature uses AI to detect repetitive query patterns, automatically creates suitable materialized views, and rewrites queries to use them, achieving up to 9× performance improvement (e.g., SSB benchmark runtime reduced from ~16 s to ~2 s, QPS increased from 0.77 to 6.25, and CPU usage reduced about tenfold).
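As a quick sanity check on the quoted figures, the ratios can be computed directly. The runtimes are approximate in the source, so the results are rough; both land near 8×, which sits under the stated "up to 9×" upper bound.

```python
# Figures quoted in the article for the SSB benchmark (runtimes are approximate).
runtime_before_s = 16.0
runtime_after_s = 2.0
qps_before = 0.77
qps_after = 6.25

runtime_speedup = runtime_before_s / runtime_after_s
qps_gain = qps_after / qps_before

print(f"runtime speedup: {runtime_speedup:.1f}x")  # runtime speedup: 8.0x
print(f"QPS gain: {qps_gain:.1f}x")                # QPS gain: 8.1x
```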
AutoMV keeps only high-value views and discards those whose benefit is low. It operates transparently to users: built-in AI models handle view evaluation, creation, and query rewriting.
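The selection step can be pictured with a simple frequency-and-benefit heuristic. The article attributes this work to built-in AI models, so the sketch below is a deliberately simpler stand-in, not the product's method: the function names, the cost parameters, and the sample queries are all assumptions made for illustration.

```python
import re
from collections import Counter


def normalize(sql: str) -> str:
    """Collapse a query to a pattern by lowercasing and replacing literals with '?'."""
    s = sql.strip().lower()
    s = re.sub(r"'[^']*'", "?", s)           # string literals
    s = re.sub(r"\b\d+(\.\d+)?\b", "?", s)   # numeric literals
    return re.sub(r"\s+", " ", s)            # collapse whitespace


def select_views(query_log, saved_cost_per_hit, maintenance_cost, min_benefit=0.0):
    """Keep patterns whose estimated benefit exceeds min_benefit (hypothetical cost model)."""
    counts = Counter(normalize(q) for q in query_log)
    selected = []
    for pattern, hits in counts.items():
        benefit = hits * saved_cost_per_hit - maintenance_cost
        if benefit > min_benefit:
            selected.append((pattern, hits, benefit))
    # Highest estimated benefit first.
    return sorted(selected, key=lambda item: item[2], reverse=True)


# Hypothetical query log: three queries share one shape, one is a one-off.
log = [
    "SELECT region, SUM(amount) FROM orders WHERE year = 1997 GROUP BY region",
    "SELECT region, SUM(amount) FROM orders WHERE year = 1998 GROUP BY region",
    "select region, sum(amount)  from orders where year = 1996 group by region",
    "SELECT COUNT(*) FROM customers WHERE city = 'Hangzhou'",
]

for pattern, hits, benefit in select_views(log, saved_cost_per_hit=2.0, maintenance_cost=3.0):
    print(hits, benefit, pattern)
```

With these made-up costs, the repeated aggregation pattern is kept (three hits outweigh the maintenance cost) and the one-off query is discarded, mirroring the "high-value only" behavior described above.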
Q&A highlights include: minute‑level refresh intervals, a self‑developed query engine, support for update/delete in materialized views, unlimited scalability on object storage, and automatic lifecycle management of materialized views.
Overall, Yunqi Lakehouse demonstrates how an integrated, cloud‑native architecture combined with incremental computing and AI‑driven autoMV can simplify data pipelines, improve developer productivity, and deliver faster, more cost‑effective analytics.
DataFunSummit