Tag

metadata service


ByteDance Cloud Native
Mar 13, 2025 · Backend Development

Inside DeepSeek 3FS: Architecture of a High‑Performance Parallel File System

This article dissects DeepSeek's 3FS parallel file system, detailing its four‑component architecture, high‑throughput RDMA networking, metadata handling with FoundationDB, client access methods, chain replication (CRAQ), custom FFRecord format, and recovery mechanisms, offering a deep technical perspective for storage engineers.

RDMA · chain replication · distributed file system
22 min read
AntData
Mar 4, 2025 · Big Data

Design and Analysis of 3FS: An AI‑Optimized Distributed File System

The article gives a comprehensive overview of 3FS, an AI‑focused distributed file system that uses FoundationDB for metadata, CRAQ for chunk replication, and a hybrid FUSE/native client architecture. It covers the system's design, components, fault handling, and performance considerations for large‑scale training workloads.

AI Storage · CRAQ replication · Cloud Native
25 min read
vivo Internet Technology
Sep 27, 2023 · Big Data

Horizontal Scaling of Hive Metastore Service at Vivo: Evaluation, TiDB Migration, and Lessons Learned

Vivo’s big‑data team needed to scale its Hive Metastore horizontally. It evaluated MySQL sharding (Waggle‑Dance) against a TiDB migration and adopted TiDB. After a synchronized cut‑over, queries ran about 15% faster, DDL latency fell by 80%, and the service scaled linearly with low resource use. The article also shares the operational lessons learned along the way.

Big Data · Hive Metastore · Performance Optimization
19 min read
DataFunTalk
Sep 17, 2023 · Cloud Native

REDck: A Cloud‑Native Real‑Time Data Warehouse Built on ClickHouse

REDck is a cloud‑native, storage‑compute separated real‑time OLAP data warehouse derived from ClickHouse that addresses scalability, operational cost, and reliability challenges through a unified metadata service, object‑storage optimizations, multi‑level caching, distributed task scheduling, and two‑phase commit transactions.

Caching · ClickHouse · Cloud Native
18 min read
DataFunTalk
Sep 15, 2023 · Cloud Computing

Design and Architecture of Baidu CFS Large‑Scale Distributed File System and Metadata Service

This talk from DataFun Summit 2023 explains how Baidu's CFS builds a distributed file system at trillion‑file scale. It revisits file system fundamentals, POSIX limitations, and historical storage architectures, then introduces a lock‑free metadata service built on single‑shard primitives, data‑layout optimizations, and a simplified client‑centric architecture designed for scalability and performance.

Big Data · CFS · POSIX
31 min read
Baidu Geek Talk
May 29, 2023 · Backend Development

CFS: Scaling Metadata Service for Distributed File System via Pruned Scope of Critical Sections - Baidu's Implementation Journey

Baidu’s CFS metadata service scales to billions of files by shrinking critical sections through a lock‑free Namespace 2.0 design that confines conflicts to single shards, uses field‑level atomic primitives, and integrates the proxy into the client, delivering up to 76× throughput gains and significant latency reductions in production.

Baidu CFS · EuroSys 2023 · POSIX compatibility
40 min read
YunZhu Net Technology Team
Jan 26, 2022 · Databases

Graph Database Selection and NebulaGraph Architecture for a Knowledge‑Graph Platform

The article explains how the cloud‑construction platform evaluated graph‑database options on open‑source availability, scalability, latency, storage capacity, and import capabilities, and ultimately chose NebulaGraph. It then details NebulaGraph’s distributed meta, storage, and query services, the overall multi‑layer knowledge‑graph platform architecture, and future application scenarios.

NebulaGraph · Query Service · distributed architecture
11 min read