
High‑Performance Computing Storage Challenges and Baidu Canghai Storage Solutions

This article explains the storage problems faced by traditional HPC, AI‑driven HPC and high‑performance data analysis, describes Baidu's internal high‑performance storage practices, and introduces the Baidu Canghai solution—including object storage BOS, parallel file system PFS, RapidFS, data‑flow mechanisms and a customer case—demonstrating how these technologies meet the demanding throughput, latency and cost requirements of modern high‑performance workloads.

DataFunTalk

1. Storage Issues in High‑Performance Computing

High‑Performance Computing (HPC) encompasses traditional supercomputing, AI‑driven HPC and high‑performance data analysis (HPDA), each presenting distinct storage challenges such as random I/O inefficiency, process coordination, massive small‑file metadata overhead, and stringent throughput and latency demands.

1.1 What Is HPC?

HPC refers to computing systems, typically supercomputers, that deliver performance one to two orders of magnitude beyond contemporary personal computers; it is used in scientific simulation, weather forecasting, and increasingly across industry.

1.2 Traditional HPC Storage Problems

Random small I/O caused by scattered matrix fragments leads to poor I/O efficiency.

Process coordination is required to ensure all nodes finish before data can be persisted.

Two‑phase I/O aggregates many small requests into large sequential I/O, relying on POSIX file interfaces and MPI‑IO.
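The aggregation step can be illustrated in miniature. This is a hedged sketch of the two‑phase idea only, not MPI‑IO itself: the offsets, byte strings, and the single in‑memory buffer are all hypothetical.

```python
def aggregate_writes(fragments):
    """Merge scattered (offset, data) fragments into one contiguous buffer.

    fragments: list of (offset, bytes) pairs, possibly out of order.
    Returns (base_offset, merged_bytes), assuming the fragments tile the
    target range without gaps -- the precondition two-phase I/O arranges.
    """
    fragments = sorted(fragments, key=lambda f: f[0])
    base = fragments[0][0]
    buf = bytearray()
    for offset, data in fragments:
        # Each fragment must start exactly where the previous one ended.
        assert offset == base + len(buf), "gap between fragments"
        buf.extend(data)
    return base, bytes(buf)

# Phase 1: processes exchange their scattered matrix fragments.
frags = [(8, b"WORLD"), (0, b"HELLO"), (5, b"..."), (13, b"!")]
# Phase 2: one aggregator issues a single large sequential write.
base, merged = aggregate_writes(frags)
```

In real MPI‑IO, collective calls such as `MPI_File_write_all` perform this exchange and aggregation transparently across processes.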

1.3 AI HPC Storage Problems

AI training workloads generate massive read I/O for small files, require checkpointing, and benefit from POSIX, K8s CSI, and optionally GPU Direct Storage interfaces; metadata performance is critical due to the prevalence of tiny files.
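The checkpointing pattern mentioned above can be sketched as follows, assuming a POSIX‑compatible file system (the JSON state and file names are illustrative, not any framework's real checkpoint format). Writing to a temporary file and renaming atomically ensures a crash mid‑write never leaves a torn checkpoint.

```python
import json
import os
import tempfile

def save_checkpoint(state: dict, path: str) -> None:
    """Write state to a temp file, fsync, then atomically rename into place."""
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())   # persist bytes before making them visible
    os.replace(tmp, path)      # atomic rename on POSIX

def load_checkpoint(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

The same durability discipline applies whether the checkpoint lands on a local disk, PFS, or (via a cache layer) object storage.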

1.4 HPDA Storage Problems

HPDA workloads are dominated by large files, demanding very high throughput but tolerating higher latency; they typically use Hadoop‑compatible HCFS interfaces.
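For large‑file, throughput‑bound access, the basic client‑side pattern is simple: read sequentially in large blocks so the storage system can stream. A minimal sketch (the 8 MiB block size is an illustrative choice, not a Baidu or Hadoop default):

```python
def stream_large_file(path, block_size=8 * 1024 * 1024):
    """Yield a large file as big sequential chunks to maximize throughput."""
    with open(path, "rb") as f:
        while chunk := f.read(block_size):
            yield chunk
```

Large sequential reads amortize per‑request overhead, which is why HPDA systems tolerate higher latency per operation while still achieving very high aggregate throughput.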

1.5 Summary of HPC Storage Requirements

Common needs include high throughput, low latency (for HPC/AI HPC), support for massive small‑file metadata, POSIX compatibility, MPI‑I/O for traditional HPC, and cost‑effective, reliable storage.

2. Baidu’s Internal High‑Performance Storage Practice

Baidu operates a unified storage base that provides high reliability, low cost, and high throughput, supporting POSIX and HCFS interfaces and offering SDKs for custom development.

Two runtime storage solutions address different scenarios:

Local‑disk or parallel file system (PFS) for small‑file intensive AI training, delivering fast metadata and I/O performance.

Direct access to the storage base for long‑running, throughput‑critical jobs.

A distributed training platform abstracts mounting, capacity allocation, and data movement, simplifying user experience.

3. Baidu Canghai High‑Performance Storage Solution

The solution combines the object storage service BOS (with tiered storage and lifecycle management) as the storage base with two runtime systems: PFS (a Lustre‑like parallel file system) and RapidFS (a cache‑accelerated system).

3.1 Parallel File System PFS

PFS provides a short I/O path via dedicated metadata (MDS) and data (OSS) nodes, deployed close to compute nodes using RDMA or high‑speed TCP.

3.2 Distributed Cache Accelerated RapidFS

RapidFS leverages idle memory and disk on compute nodes to form a P2P cache, offering hierarchical namespace caching and data caching to accelerate access to BOS.
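The caching idea can be sketched with a small LRU cache in front of a slow backing store. This shows only the hit/miss/eviction logic; RapidFS's real P2P protocol and hierarchical namespace are not modeled, and `backing` here is just a dict standing in for BOS.

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used cache over a slow backing store (toy model)."""

    def __init__(self, backing, capacity=2):
        self.backing = backing
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark most recently used
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.backing[key]            # slow path: fetch from object store
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used
        return value
```

Repeated epochs over the same training data are exactly the access pattern that makes such a cache pay off: the first epoch is misses, subsequent epochs are mostly hits served from compute‑node memory and disk.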

3.3 Efficient Data Transfer

Lifecycle policies automatically migrate cold data from PFS to BOS, reducing cost.

Bucket Link binds PFS/RapidFS namespaces to BOS paths, enabling seamless data pre‑loading and hot‑data caching.
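A lifecycle rule of the kind described above reduces, at its core, to selecting files whose last access is older than a cutoff. The sketch below is illustrative only: the field names and the 30‑day threshold are assumptions, not Baidu's lifecycle API.

```python
SECONDS_PER_DAY = 86400

def cold_candidates(files, now, max_idle_days=30):
    """Return paths of files idle longer than max_idle_days.

    files: list of dicts with "path" and "atime" (epoch seconds).
    These are the candidates a lifecycle policy would migrate PFS -> BOS.
    """
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    return [f["path"] for f in files if f["atime"] < cutoff]
```

The real system would run such a scan periodically and pair each migration with the Bucket Link namespace mapping so the file remains visible at its PFS path after its bytes move to BOS.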

3.4 Unified Scheduling

Bucket Link is integrated into Kubernetes via the open‑source Fluid project, allowing data‑preload pipelines to run in parallel with GPU training, improving GPU utilization.
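Why running preload in parallel with training helps can be shown with a minimal pipelining sketch: the next batch is fetched on a background thread while the current one is being consumed. Both `fetch` and `train` are stand‑ins, not Fluid or Bucket Link APIs.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(batch_id):
    """Stand-in for a Bucket Link preload of one batch from BOS."""
    return f"batch-{batch_id}"

def train(batch):
    """Stand-in for one GPU training step; returns a dummy metric."""
    return len(batch)

def pipeline(num_batches):
    """Overlap fetching batch i+1 with training on batch i."""
    steps = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch, 0)
        for i in range(num_batches):
            batch = future.result()                 # wait for prefetched data
            if i + 1 < num_batches:
                future = pool.submit(fetch, i + 1)  # prefetch overlaps training
            steps.append(train(batch))
    return steps
```

When fetch time is hidden behind compute like this, the GPU never idles waiting on I/O, which is the utilization gain the Fluid integration targets.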

3.5 Test Results

Experiments show that using RapidFS or PFS with Bucket Link achieves near‑100% GPU utilization, whereas direct BOS access leaves GPUs under‑utilized due to I/O bottlenecks.

3.6 Customer Case

A leading autonomous‑driving client collects petabyte‑scale road‑test data, uploads it via Baidu’s “Moonlight Box” hardware to BOS, and uses PFS to feed massive GPU clusters for model training, completing the data‑collect‑train‑iterate loop efficiently.

—END—

Tags: AI, High‑Performance Computing, storage, cloud storage, Baidu, parallel file system
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
