Apache Hudi Asia Summit Successfully Held

The first Apache Hudi Asia Summit, held in Beijing, drew more than 230 attendees and featured technical talks on data lake optimization alongside case studies from companies such as Fastly and Meituan.

Kuaishou Tech

Key topics included the upgrades in Apache Hudi 1.0: storage format optimization, a restructured index system, and incremental processing capabilities. Speakers from Fastly, Meituan, and other companies shared best practices and implementation details.

Fastly's data architecture team discussed how it supports AI and BI scenarios, highlighting end-to-end vectorization, real-time subscription, and column stitching for logical wide tables as ways to optimize data organization and reduce costs.
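To make the idea of column stitching concrete, here is a minimal sketch in plain Python. It assumes that several upstream sources each write a different subset of columns for the same primary key, and that the wide row is assembled by merging those partial rows. The function name `stitch_columns` and the sample data are hypothetical; the talk's actual implementation is not described in this recap.

```python
from typing import Any, Dict, Iterable, Tuple

Row = Dict[str, Any]


def stitch_columns(streams: Iterable[Iterable[Tuple[str, Row]]]) -> Dict[str, Row]:
    """Merge partial rows from several sources into one wide row per key."""
    wide: Dict[str, Row] = {}
    for stream in streams:
        for key, partial in stream:
            # Only the columns present in this partial row are updated;
            # columns written by other sources are left untouched.
            wide.setdefault(key, {}).update(partial)
    return wide


# Hypothetical sources: each one owns a different subset of columns.
profile = [("u1", {"age": 30}), ("u2", {"age": 41})]
clicks = [("u1", {"clicks_7d": 12})]
wide_table = stitch_columns([profile, clicks])
```

In this sketch no source ever rewrites a full row, which is what lets the wide table stay logical: each writer touches only its own columns and the merge happens by key.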

Meituan presented its Beluga architecture, built around a "one table, three modes" approach that combines row-based HFile for streaming writes with columnar Parquet for batch processing, improving data processing efficiency and reducing operational complexity.
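The general pattern of pairing a row-oriented write path with a column-oriented read path can be sketched as follows. This is a toy in-memory model, not Beluga itself: the class `HybridTable`, its methods, and the compaction step are all assumptions made for illustration, with a Python list standing in for the row-based log and per-column lists standing in for columnar files.

```python
from typing import Any, Dict, List


class HybridTable:
    """Toy table: rows are appended to a log, then pivoted into columns."""

    def __init__(self) -> None:
        self.row_log: List[Dict[str, Any]] = []   # write-optimized, row-oriented
        self.columns: Dict[str, List[Any]] = {}   # read-optimized, column-oriented
        self.compacted_rows = 0

    def append(self, row: Dict[str, Any]) -> None:
        """Streaming write: a cheap append, with no column reorganization."""
        self.row_log.append(row)

    def compact(self) -> None:
        """Pivot buffered rows into column arrays, then clear the log."""
        for row in self.row_log:
            for name in row:
                if name not in self.columns:
                    # Back-fill a column first seen after earlier rows were compacted.
                    self.columns[name] = [None] * self.compacted_rows
            for name, values in self.columns.items():
                values.append(row.get(name))
            self.compacted_rows += 1
        self.row_log.clear()

    def scan_column(self, name: str) -> List[Any]:
        """Batch read: compacted values plus anything still in the row log."""
        compacted = self.columns.get(name, [None] * self.compacted_rows)
        return compacted + [row.get(name) for row in self.row_log]


table = HybridTable()
table.append({"id": 1, "v": "a"})
table.append({"id": 2, "v": "b"})
table.compact()
table.append({"id": 3, "v": "c", "w": 9})
```

Readers see the same logical table whether or not compaction has run; only the physical layout behind `scan_column` changes.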

Douyin showcased its SampleCenter platform, which addresses the challenges of exabyte-scale recommendation data through unified lake storage, real-time joining of samples and labels, and dynamic bucketing strategies.

Huawei's optimizations included storing raw row data as byte arrays to reduce GC impact, improving streaming read performance, and introducing column families for storing sparse matrices.
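The byte-array idea can be illustrated with a short sketch. The optimization described in the talk concerns the JVM, where keeping a row as one byte array instead of many small field objects gives the garbage collector far fewer objects to track. The Python below only mirrors the data layout, under an assumed fixed schema; the `PackedRows` class and the schema are hypothetical.

```python
import struct

# Hypothetical fixed schema: int64 id, float64 score, int32 count (20 bytes per row).
ROW = struct.Struct("<qdi")


class PackedRows:
    """Keeps all rows in one bytearray instead of one object per field."""

    def __init__(self) -> None:
        self.buf = bytearray()

    def append(self, row_id: int, score: float, count: int) -> None:
        # Serialize the row once and append its bytes to the shared buffer.
        self.buf += ROW.pack(row_id, score, count)

    def __len__(self) -> int:
        return len(self.buf) // ROW.size

    def field(self, index: int, position: int):
        """Decode row `index` only when one of its fields is requested."""
        return ROW.unpack_from(self.buf, index * ROW.size)[position]


rows = PackedRows()
rows.append(1, 0.5, 3)
rows.append(2, 1.5, 7)
```

Fields are decoded lazily in `field`, so rows that are only passed through a pipeline never materialize as individual objects.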

Jingdong (JD.com) introduced a data lake architecture with multi-model storage, combining HDFS, Kafka/HBase, and other storage systems to integrate them seamlessly and extend data processing capabilities.

The summit concluded with a call for continued community collaboration and innovation in data lake technologies.
