Tagged articles
4 articles
DataFunSummit
May 22, 2026 · Big Data

How OPPO Accelerates Multimodal Data & AI Fusion with Gravitino and Curvine

OPPO tackles explosive multimodal data growth by unifying metadata with Gravitino and boosting I/O performance using the open‑source Curvine cache, delivering a four‑layer data‑lake architecture that resolves data islands, metadata chaos, and bandwidth bottlenecks while achieving near‑commercial query speeds.

Curvine · Distributed Cache · Gravitino
0 likes · 11 min read
Big Data Technology & Architecture
Apr 28, 2026 · Big Data

Inside Apache Paimon 1.4: Core Principles and Design of an AI Multimodal Data Lake

Apache Paimon 1.4 redefines itself as an AI multimodal data lake by introducing row tracking, data evolution, Blob and Vector tables, Variant shredding, and Lumina‑BTree global indexing, each explained with concrete examples, configuration flags, and storage layouts that illustrate how the new capabilities enable unified storage and efficient retrieval of diverse data types.

Apache Paimon · Blob Table · Data Evolution
0 likes · 8 min read
DataFunSummit
Apr 25, 2026 · Big Data

AI‑Era Multimodal Data Lake Infrastructure: TBDS Design, Storage, Compute, and Governance

The article analyzes how Tencent Cloud's TBDS platform tackles the AI era's multimodal data lake challenges through a native storage format (Lance), elastic Ray‑based compute, standardized metadata with Gravitino, and automated governance via Lakekeeper, citing architecture details, performance numbers, and real‑world deployments.

AI infrastructure · Big Data · Gravitino
0 likes · 13 min read
ByteDance Data Platform
Oct 29, 2025 · Big Data

How Volcano Engine’s Multimodal Data Lake Tackles AI Agent Challenges

The article explores how Volcano Engine’s multimodal data lake architecture addresses the storage, compute, and management challenges of AI agents by introducing new formats like Lance, upgrading engines such as Spark and Daft, and providing unified tools for processing, versioning, and querying massive multimodal datasets.

Big Data · Cloud Computing · Daft engine
0 likes · 13 min read