
Kingsoft Cloud's Big Data Compute‑Storage Separation Practices and KS3‑HDFS Solution

This article presents Kingsoft Cloud's comprehensive practice on big data compute‑storage separation, detailing the challenges of modern data platforms, comparing HDFS with object storage, describing three separation modes, and explaining the architecture, core modules, performance optimizations, and advantages of the KS3‑HDFS solution.


The article shares Kingsoft Cloud's practical experience with big data compute‑storage separation, organized into three main parts: an introduction to the concept, the company's specific separation solution, and the KS3‑HDFS implementation.

It first outlines the challenges faced by large‑scale data platforms—rising compute workloads, declining efficiency, and ever‑growing storage demands—while emphasizing cost reduction through elastic compute, data governance, and low‑frequency storage.

A comparison between traditional HDFS and object storage highlights that object storage uses erasure coding (e.g., 8+4) for a lower per‑GB cost and offers superior elasticity and cross‑region durability (up to eleven nines), but lacks true directory semantics and atomic rename, so renames must be emulated with copy‑and‑delete workflows.
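The cost gap follows from the encoding: 8+4 erasure coding stores 12 fragments for every 8 fragments of data (1.5× overhead), versus 3× for HDFS triple replication. The rename gap can be made concrete with a small sketch; `ObjectStore` below is a hypothetical in‑memory stand‑in for an S3‑style API, not the real KS3 client, whose method names may differ.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for an S3-style flat key space."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def copy(self, src, dst):
        self._objects[dst] = self._objects[src]

    def delete(self, key):
        del self._objects[key]

    def list(self, prefix):
        return sorted(k for k in self._objects if k.startswith(prefix))


def rename(store, src_prefix, dst_prefix):
    """Emulate an HDFS directory rename as copy-then-delete.

    On HDFS a rename is a single metadata update; here the cost grows
    with the number and size of objects under the prefix, which is
    exactly the gap the article describes.
    """
    for key in store.list(src_prefix):
        store.copy(key, dst_prefix + key[len(src_prefix):])
        store.delete(key)
```

For example, renaming `logs/` to `archive/` over two objects issues two copies and two deletes rather than one metadata update.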

The evolution of big data architecture is traced from Hadoop 1.x (compute and storage fully coupled), through the Yarn‑enabled separation of resource management in Hadoop 2.x, to cloud‑native containerized engines (Spark, Flink, Presto) and the emergence of data‑lake architectures that favor object storage.

Kingsoft Cloud proposes three compute‑storage separation modes: Direct mode (SDK‑based S3 connector translating HDFS APIs), Object mode (adds metadata and client caching to improve performance), and Block mode (splits large objects into blocks with independent metadata to support POSIX‑like semantics).
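Direct mode is the simplest of the three: with no extra metadata layer, the connector maps an HDFS‑style URI straight onto a bucket and key. A minimal sketch of that mapping follows; the `ks3://` scheme handling and function name are illustrative, not the real Gaea SDK API.

```python
def hdfs_uri_to_object(uri: str):
    """Map an HDFS-style URI such as 'ks3://bucket/db/table/part-0'
    to the (bucket, key) pair used by the object store."""
    scheme = "ks3://"
    if not uri.startswith(scheme):
        raise ValueError(f"expected a {scheme} URI, got {uri!r}")
    # the first path segment is the bucket; the rest is the flat object key
    bucket, _, key = uri[len(scheme):].partition("/")
    return bucket, key
```

Because every "directory" operation then becomes a prefix listing over flat keys, Direct mode is easy to deploy but slow on metadata‑heavy workloads, which is what the Object and Block modes address.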

The KS3‑HDFS solution features a logical architecture with Raft‑based metadata nodes backed by RocksDB, a unified client that first contacts the metadata cluster before accessing data, and a caching layer (metadata cache and client cache) that unifies HDFS and KS3 under a single namespace.
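The metadata‑first read path can be sketched as below. All three stores are modeled as plain dicts for brevity; in the real system the metadata lookup goes to the Raft cluster and the data fetch to the cache layer or KS3 over the network, and none of these names come from the actual client.

```python
def read_file(metadata, cache, object_store, path):
    """Resolve a path through the metadata service, then fetch bytes
    from the cache layer or, on a miss, from the object store."""
    key = metadata[path]       # 1. metadata cluster: path -> object key
    if key in cache:           # 2. cache layer: serve hot data locally
        return cache[key]
    data = object_store[key]   # 3. fall back to the KS3 backend
    cache[key] = data          # populate the cache for future reads
    return data
```

The key design point is that the client never lists or probes the object store to resolve a path; directory semantics live entirely in the metadata cluster, which is what lets HDFS and KS3 share one namespace.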

Core modules include a three‑node Master for high‑availability metadata, the Gaea SDK that bridges Hadoop APIs to KS3 while optimizing list operations, a cache subsystem for hot data, and a two‑phase commit workflow that separates metadata and data streams to ensure consistency and enable graceful failure recovery.
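The two‑phase write can be sketched as follows: data is streamed to the object store first, and the file only becomes visible once the metadata commit succeeds, so a mid‑write failure leaves the namespace untouched. Class and method names here are illustrative assumptions, not the real service API.

```python
class MetadataService:
    """Toy stand-in for the Raft-backed metadata cluster: tracks
    in-flight transactions and the committed namespace."""

    def __init__(self):
        self.pending = {}    # txn_id -> path, visible to no reader
        self.namespace = {}  # path -> object key, visible to all

    def begin(self, txn_id, path):
        self.pending[txn_id] = path

    def commit(self, txn_id, object_key):
        path = self.pending.pop(txn_id)
        self.namespace[path] = object_key   # file becomes visible here

    def abort(self, txn_id):
        # orphaned data under the temp key can be garbage-collected later
        self.pending.pop(txn_id, None)


def write_file(meta, store, path, data, txn_id):
    meta.begin(txn_id, path)   # phase 1: record intent in metadata
    key = f"tmp/{txn_id}"
    try:
        store[key] = data      # stream the bytes to the object store
    except Exception:
        meta.abort(txn_id)     # failure leaves the namespace untouched
        raise
    meta.commit(txn_id, key)   # phase 2: publish metadata atomically
```

Separating the two streams this way means a crashed writer costs only some garbage data to collect, never a half‑visible file.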

The SDK also supports real‑time streaming (e.g., Flink) by leveraging multipart upload to achieve exactly‑once semantics, and implements shadow‑copy techniques to accelerate rename‑like operations.
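The exactly‑once pattern rests on one property of multipart upload: uploaded parts stay invisible until `CompleteMultipartUpload`, and can be discarded with `AbortMultipartUpload`. The toy model below aligns those calls with Flink checkpoints; it is a sketch of the pattern, not the SDK's actual sink implementation.

```python
class CheckpointedMultipartSink:
    """Records accumulate as uploaded-but-uncommitted multipart parts;
    completion runs only when a checkpoint succeeds, and abort discards
    the rest on failure, so replay from the last checkpoint cannot
    duplicate visible output."""

    def __init__(self):
        self.open_parts = []  # uploaded parts, invisible to readers
        self.committed = []   # durable, visible output

    def write(self, record):
        self.open_parts.append(record)

    def on_checkpoint_complete(self):
        # corresponds to CompleteMultipartUpload
        self.committed.extend(self.open_parts)
        self.open_parts = []

    def on_failure(self):
        # corresponds to AbortMultipartUpload before replay
        self.open_parts = []
```

A failure between checkpoints discards the open parts, and the replayed records land in a fresh upload, so each record appears in the committed output exactly once.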

Performance enhancements comprise single‑connection acceleration (up to 100 MB/s), server‑side block caching for files larger than 5 MB, shadow‑copy for metadata operations, and asynchronous handling of large‑scale metadata changes, resulting in high QPS, low latency, and multi‑GB/s bandwidth.

Product advantages include one‑click enablement, compatibility with standard HDFS semantics, multi‑replica metadata for high reliability, cross‑region disaster recovery, and lower storage costs compared to traditional HDFS.

The Q&A section emphasizes KS3's cross‑region resilience and cost benefits, differences from Alluxio (removal of worker nodes), challenges of single‑connection acceleration (memory pressure and potential OOM), and primary use cases such as migrating PB‑scale workloads from HDFS to object storage while preserving seamless access.

Tags: performance optimization, big data, cloud computing, object storage, compute‑storage separation, KS3‑HDFS
Written by DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
