Tagged articles

19 articles

Mar 23, 2026 · Databases

How Baidu’s MEG Platform Revamped ClickHouse with a Lakehouse Architecture

This article analyzes the challenges of scaling ClickHouse within Baidu’s MEG data platform and details a lakehouse solution that decouples storage from compute, integrates a meta‑service for transparent data access, optimizes query performance through caching, data roll‑up, and layout tuning, and introduces a unified query gateway that falls back to Spark for complex workloads.

ClickHouse · Data Platform · Lakehouse

0 likes · 25 min read

Big Data Technology Tribe

Jul 28, 2025 · Fundamentals

How Speculative Path Resolution Cuts Metadata Latency in InfiniFS

This article explains InfiniFS's speculative path resolution, detailing how predictable directory IDs and parallel lookups transform traditional linear RPC-based path traversal into constant‑time operations, dramatically reducing metadata access latency in large, deep directory trees.

Distributed File System · InfiniFS · metadata service

0 likes · 8 min read

Big Data Technology Tribe

Jul 27, 2025 · Fundamentals

How InfiniFS Revolutionizes Metadata for Billion-File Distributed Filesystems

This article summarizes the InfiniFS paper, detailing how its access‑content decoupling, speculative path resolution, and optimistic metadata caching enable efficient metadata handling for data‑center‑scale file systems supporting billions of files.

InfiniFS · distributed file systems · filesystem design

0 likes · 15 min read

ByteDance Cloud Native

Mar 13, 2025 · Backend Development

Inside DeepSeek 3FS: Architecture of a High‑Performance Parallel File System

This article dissects DeepSeek's 3FS parallel file system, detailing its four‑component architecture, high‑throughput RDMA networking, metadata handling with FoundationDB, client access methods, chain replication (CRAQ), custom FFRecord format, and recovery mechanisms, offering a deep technical perspective for storage engineers.

Distributed File System · High-performance storage · RDMA

0 likes · 22 min read

Volcano Engine Developer Services

Mar 7, 2025 · Operations

Inside 3FS: How DeepSeek’s Parallel File System Powers AI Training

This article dives deep into DeepSeek's 3FS parallel file system, detailing its four-component architecture, RDMA‑based high‑speed networking, client options, metadata and storage services, replication protocols, dynamic stripe sizing, and recovery mechanisms that enable efficient AI model training and inference.

AI training · Distributed File System · RDMA

0 likes · 21 min read

AntData

Mar 4, 2025 · Big Data

Design and Analysis of 3FS: An AI‑Optimized Distributed File System

The article provides a comprehensive English overview of 3FS, an AI‑focused distributed file system that uses FoundationDB for metadata, CRAQ for chunk replication, and a hybrid FUSE/native client architecture. It covers the system's design, components, fault handling, and performance considerations for large‑scale training workloads.

AI storage · CRAQ replication · Distributed File System

0 likes · 25 min read

Didi Tech

Sep 5, 2024 · Industry Insights

How Didi Built a Multi‑Protocol, Petabyte‑Scale Storage System for AI Training

Facing petabyte‑scale data, billions of small files, and the need for POSIX, S3, and HDFS compatibility, Didi designed OrangeFS, a new generation of unstructured storage. The team analyzed its internal systems, combined multiple storage solutions, reused GIFT technology, and implemented a high‑performance metadata service, multi‑protocol fusion, and robust scalability features.

AI storage · Big Data · Cloud Native

0 likes · 27 min read

vivo Internet Technology

Sep 27, 2023 · Big Data

Horizontal Scaling of Hive Metastore Service at Vivo: Evaluation, TiDB Migration, and Lessons Learned

Vivo’s big‑data team horizontally scaled its Hive Metastore by evaluating MySQL sharding (Waggle‑Dance) against a TiDB migration and ultimately adopting TiDB. After a synchronized cut‑over, the migration delivered roughly 15% faster queries, an 80% reduction in DDL latency, linear scaling, and low resource use, and it yielded valuable operational lessons.

Big Data · Hive Metastore · SQL

0 likes · 19 min read

DataFunTalk

Sep 17, 2023 · Cloud Native

REDck: A Cloud‑Native Real‑Time Data Warehouse Built on ClickHouse

REDck is a cloud‑native, storage‑compute separated real‑time OLAP data warehouse derived from ClickHouse that addresses scalability, operational cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, and two‑phase commit transactions.

ClickHouse · Real-time OLAP · Storage Compute Separation

0 likes · 18 min read

DataFunTalk

Sep 15, 2023 · Cloud Computing

Design and Architecture of Baidu CFS Large‑Scale Distributed File System and Metadata Service

The talk from DataFun Summit 2023 explains how Baidu's CFS storage builds a trillion‑file‑scale distributed file system by revisiting file system fundamentals, POSIX limitations, historical storage architectures, and introducing a lock‑free metadata service with single‑shard primitives, data‑layout optimizations, and a simplified client‑centric architecture that achieves high scalability and performance.

CFS · Distributed File System · POSIX

0 likes · 31 min read

ITPUB

Sep 11, 2023 · Cloud Native

How REDck Transforms ClickHouse into a Scalable Cloud‑Native Real‑Time Data Warehouse

Xiaohongshu built REDck, a cloud‑native, storage‑compute separated real‑time OLAP warehouse on ClickHouse, addressing scaling, cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, bucketing, and exactly‑once transaction support.

Caching · ClickHouse · Object Storage

0 likes · 21 min read

Baidu Intelligent Cloud Tech Hub

Aug 29, 2023 · Cloud Computing

How Baidu CFS Scales to Billions of Files with a Lock‑Free Metadata Service

This article explains Baidu's CFS architecture for building a billion‑file‑scale distributed file system, covering basic file system concepts, POSIX limitations, metadata service modeling, performance metrics, evolution of metadata architectures, and CFS's lock‑free design that achieves high scalability, low latency, and balanced load in cloud storage.

Distributed File System · Scalability · cloud storage

0 likes · 32 min read

Baidu Geek Talk

May 29, 2023 · Backend Development

CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections - Baidu's Implementation Journey

Baidu’s CFS metadata service scales to billions of files by shrinking critical sections through a lock‑free Namespace 2.0 design that confines conflicts to single shards, uses field‑level atomic primitives, and integrates the proxy into the client, delivering up to 76× throughput gains and significant latency reductions in production.

Baidu CFS · Distributed File System · EuroSys 2023

0 likes · 40 min read

Baidu Intelligent Cloud Tech Hub

May 25, 2023 · Cloud Native

How Baidu’s CFS Achieved Billion‑File Scale with a Lock‑Free Metadata Service

This article explains the design and evolution of Baidu Cloud File System's (CFS) metadata service, detailing how a novel lock‑free architecture and strategic data layout enable POSIX‑compatible, highly scalable storage that can handle billions of files while maintaining high performance and consistency.

Distributed File System · Scalability · cloud storage

0 likes · 42 min read

Baidu Intelligent Cloud Tech Hub

May 5, 2023 · Cloud Native

How Baidu’s Cloud‑Native CFS Boosts Metadata Throughput up to 75×

The EuroSys 2023 paper on Baidu’s cloud‑native file storage CFS reveals a metadata service design that trims critical sections, achieving 1.76‑75.82× higher throughput and up to 91.71% lower latency compared to HopsFS and InfiniFS, and has been production‑stable for over three years.

Cloud Native Storage · Distributed File System · EuroSys

0 likes · 6 min read

UCloud Tech

Nov 21, 2022 · Cloud Computing

How UCloud Revamped US3 Metadata Service for 80% Cost Savings and Faster Performance

UCloud’s US3 object storage metadata service, originally built on a chained MongoDB architecture, faced scalability, performance, and cost challenges, prompting a redesign that introduces a high‑compatibility DB‑Gateway, a distributed KV store (UKV) with custom RocksDB, delivering faster reads, zero list‑service latency, 80% cost reduction, and simpler operations.

Distributed KV · Object Storage · Performance Optimization

0 likes · 8 min read

YunZhu Net Technology Team

Jan 26, 2022 · Databases

Graph Database Selection and NebulaGraph Architecture for a Knowledge‑Graph Platform

The article explains how the cloud‑construction platform evaluated graph‑database options on open‑source availability, scalability, latency, storage capacity, and import capabilities, ultimately choosing NebulaGraph. It then details NebulaGraph’s distributed meta, storage, and query services, the overall multi‑layer knowledge‑graph platform architecture, and future application scenarios.

Graph Database · NebulaGraph · Query Service

0 likes · 11 min read

Alibaba Cloud Native

Jan 25, 2021 · Cloud Native

How Dubbo Enables Kubernetes‑Native Service Discovery with Metadata and Revision

This article explains how Dubbo integrates with Kubernetes to provide application‑level service discovery using a metadata service and revision mechanism, detailing both API‑client and DNS‑client implementations, their workflows, advantages, drawbacks, and future directions toward cloud‑native adoption.

API Client · Cloud Native · DNS

0 likes · 12 min read

dbaplus Community

Aug 27, 2019 · Big Data

How eBay Scales Real‑Time Monitoring with Flink: Metadata‑Driven Streaming

This article explains how eBay’s Sherlock.IO monitoring platform processes billions of logs, events, and metrics daily using Flink Streaming jobs, detailing a metadata‑driven architecture, shared job strategies, Heartbeat‑based monitoring, job isolation, back‑pressure handling, and real‑world use cases such as Event Alerting, Eventzon, and Netmon.

Big Data · Flink · Real-time Processing

0 likes · 18 min read
