
Design and Implementation of a Next‑Generation Multi‑Protocol Unstructured Storage System for Machine Learning

This article presents the challenges of storing massive machine‑learning datasets, evaluates existing storage solutions, and details the design of OrangeFS—a cloud‑native, multi‑protocol, multi‑tenant unstructured storage system that integrates object and file interfaces, optimizes metadata services, supports hot upgrades, and provides robust scalability and reliability for AI workloads.

DataFunSummit

Sep 12, 2024


With the rapid growth of artificial‑intelligence technologies, Didi faces unprecedented storage demands for machine‑learning training data. These workloads require petabyte‑scale capacity, high throughput, low‑latency metadata services, and support for multiple protocols (POSIX, S3, HDFS).

The existing in‑house solutions (GIFT object storage, Ceph, HDFS, GlusterFS) each lacked one or more required features such as multi‑protocol access, atomic rename, or efficient random writes, prompting an exploration of combined approaches and third‑party options like JuiceFS.

OrangeFS was designed by synthesizing lessons from GIFT and open‑source projects (Ceph, CubeFS, JuiceFS, SeaweedFS). Its architecture includes:

A unified entry service that parses S3 and GIFT V2 protocols.

A metadata service (MDS) built on RDS with multi‑Raft, in‑memory transactions, and dynamic configuration.

A storage engine that stores data blocks (Blocks) in self‑developed BS engines or public‑cloud object stores, using a Chunk‑Blob‑Block hierarchy to enable high concurrency.

Separate VFS (POSIX) and PathFS (S3/HDFS) layers to provide seamless multi‑protocol file operations.
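
The Chunk‑Blob‑Block hierarchy is not specified in detail here, so the following is only a minimal sketch under stated assumptions: a file is divided into fixed‑size chunks, each contiguous write inside a chunk is recorded as one blob, and each blob is cut into fixed‑size blocks that can be uploaded independently. The sizes, the `Blob` record, and the `split_write` helper are all hypothetical and are not taken from OrangeFS.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sizes chosen for illustration only.
CHUNK_SIZE = 64 * 1024 * 1024
BLOCK_SIZE = 4 * 1024 * 1024


@dataclass
class Blob:
    chunk_index: int          # which chunk of the file the blob belongs to
    offset_in_chunk: int      # where the blob starts inside that chunk
    length: int               # number of bytes covered by the blob
    block_lengths: List[int]  # sizes of the blocks the blob is cut into


def split_write(offset: int, length: int,
                chunk_size: int = CHUNK_SIZE,
                block_size: int = BLOCK_SIZE) -> List[Blob]:
    """Split one contiguous write into per-chunk blobs and per-blob blocks."""
    if offset < 0 or length < 0:
        raise ValueError("offset and length must be non-negative")
    blobs: List[Blob] = []
    pos, end = offset, offset + length
    while pos < end:
        chunk_index = pos // chunk_size
        chunk_end = (chunk_index + 1) * chunk_size
        # A blob never crosses a chunk boundary.
        blob_len = min(end, chunk_end) - pos
        full, rest = divmod(blob_len, block_size)
        block_lengths = [block_size] * full + ([rest] if rest else [])
        blobs.append(Blob(chunk_index, pos - chunk_index * chunk_size,
                          blob_len, block_lengths))
        pos += blob_len
    return blobs
```

Under these assumptions, the blocks of one write are independent units, so a client could upload them in parallel, which is one plausible way such a hierarchy supports high concurrency.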

Key innovations include:

Optimized metadata latency through single‑RPC operations, queue‑based processing, TCP transport, and Raft batching.

Correctness guarantees via optimistic in‑memory transactions and serialized writes for high‑conflict operations.

Scalability achieved with follower reads, learner nodes, and dynamic load‑balancing.

Stability improvements such as snapshot‑based recovery, log‑compaction control, and busy/slow queues.

POSIX client features: high throughput, read/write decoupling via memory snapshots, and hot upgrades that complete within seconds without service interruption.

Multi‑tenant isolation, TTL‑based automatic deletion, read‑only modes, and fine‑grained permission control.

A recycle‑bin mechanism that preserves deleted files for recovery while handling TTL expiration.
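
To illustrate the optimistic in‑memory transactions mentioned among the innovations above, here is a minimal sketch of version‑checked commits applied to an atomic rename. It is not the OrangeFS implementation: `MemStore`, `Txn`, `ConflictError`, and `rename` are hypothetical names, and a single lock stands in for whatever serialization the real metadata service applies at commit time.

```python
import threading


class ConflictError(Exception):
    """Raised when a key read by the transaction changed before commit."""


class MemStore:
    def __init__(self):
        self._data = {}  # key -> (version, value); value None means deleted
        self._lock = threading.Lock()

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self._store = store
        self._reads = {}   # key -> version observed at first read
        self._writes = {}  # key -> buffered new value (None = delete)

    def get(self, key):
        if key in self._writes:
            return self._writes[key]
        with self._store._lock:
            version, value = self._store._data.get(key, (0, None))
        self._reads.setdefault(key, version)
        return value

    def put(self, key, value):
        self._writes[key] = value

    def delete(self, key):
        self._writes[key] = None

    def commit(self):
        with self._store._lock:
            # Validate: every key we read must still be at the version we saw.
            for key, seen in self._reads.items():
                current, _ = self._store._data.get(key, (0, None))
                if current != seen:
                    raise ConflictError(key)
            # Apply all buffered writes together, bumping each version.
            for key, value in self._writes.items():
                version, _ = self._store._data.get(key, (0, None))
                self._store._data[key] = (version + 1, value)


def rename(store, src, dst, retries=3):
    """Move the entry at src to dst atomically, retrying on conflicts."""
    for _ in range(retries):
        txn = store.begin()
        inode = txn.get(src)
        if inode is None:
            raise FileNotFoundError(src)
        if txn.get(dst) is not None:
            raise FileExistsError(dst)
        txn.delete(src)
        txn.put(dst, inode)
        try:
            txn.commit()
            return
        except ConflictError:
            continue
    raise ConflictError(f"rename {src} -> {dst} kept conflicting")
```

Retrying works well when conflicts are rare. For operations that conflict often, the article describes serializing the writes instead, which avoids repeated aborted commits.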

OrangeFS now supports tens of petabytes across dozens of teams, offering lossless multi‑protocol access, robust QoS, cloud‑native deployment via CSI, and seamless integration with both private and public cloud environments.
