Big Data Technology & Architecture
Author


Wang Zhiwu is a big data expert dedicated to sharing big data technology.

Recent Articles

Oct 28, 2024 · Big Data

Key Considerations for Using Paimon Primary Key Tables

This article explains the characteristics of Paimon primary key tables, covering bucket selection, cross‑partition update issues, recommended record‑level expiration settings, and two approaches to handle file compaction, including configuration tweaks and dedicated compaction tasks.

Big Data · Bucket · Flink
6 min read
Oct 22, 2024 · Big Data

Key Frameworks and Characteristics of Lakehouse Architecture: A Ground‑Level Perspective

This article reviews the emerging lakehouse architecture, outlines its core frameworks such as Hudi, Iceberg, Paimon, Flink, and Doris, discusses their storage‑compute separation, read‑write optimizations, and highlights how companies of different sizes adopt these technologies based on cost, efficiency, and specific business scenarios.

Data Architecture · Flink · Lakehouse
6 min read
Oct 21, 2024 · Big Data

Key New Features of Apache Doris 3.0: Storage‑Compute Separation, Lakehouse Integration, Semi‑Structured Data, ETL Enhancements, Materialized Views, and Java UDTF

Apache Doris 3.0 introduces storage‑compute separation, native lakehouse write‑back, optimized Variant handling for semi‑structured data, stronger ETL transaction support, enhanced multi‑table materialized views, and Java UDTF capabilities, providing developers with more flexible, cost‑effective, and high‑performance analytics solutions.

Apache Doris · Data Warehouse · ETL
7 min read
Oct 16, 2024 · Databases

Kuaishou's Lakehouse‑Integrated OLAP Architecture with Apache Doris: Design, Migration, and Optimization

The article describes how Kuaishou transformed its high‑traffic OLAP system from a separated lake‑and‑warehouse architecture using Hive/Hudi and ClickHouse into a unified lakehouse solution powered by Apache Doris, detailing the challenges, design choices, caching and automatic materialization mechanisms, and the resulting performance and governance improvements.

Apache Doris · Big Data · Data Caching
18 min read
Sep 26, 2024 · Big Data

Key Features of Apache Paimon 0.9.0 Release

The Apache Paimon 0.9.0 release introduces production‑ready Branch support, native Iceberg compatibility, a caching catalog for faster OLAP queries, improved Bucketed Append tables with reduced small‑file issues, and full DELETE/UPDATE/MERGE‑INTO capabilities for Append tables, making the system more usable and efficient.

Apache Paimon · Big Data · Branch
5 min read
Sep 18, 2024 · Databases

Doris Performance Optimization: OLAP Query, Indexes, Vectorized Execution, and High‑Concurrency Point Queries

This article explains how Apache Doris achieves high‑concurrency OLAP and point‑query performance through MPP architecture, columnar storage, partition‑bucket pruning, various indexes, materialized views, vectorized execution, runtime filters, short‑circuit planning, and prepared‑statement caching.

Indexes · OLAP · Doris
12 min read