Research on the Unified Storage Platform for the Supercomputing Internet

This article presents a comprehensive overview of the challenges, key technologies, and future applications of a unified storage platform built on Alluxio for China's national supercomputing internet, detailing its architecture, data flow strategies, deployment status, and industry use cases across multiple sectors.

DataFunTalk

The speaker, Wang Chunxiao, an associate researcher at the Shandong Provincial Computing Center, introduces his work since 2022 on the supercomputing internet project, focusing on the development of a unified storage platform built on Alluxio.

The presentation covers three main topics: (1) problems and challenges in building the supercomputing internet, such as heterogeneous compute platforms, resource heterogeneity, and uneven distribution of compute capacity; (2) key technologies of the unified storage platform, including Alluxio-based storage adapters, service bus design, and three data flow strategies (real‑time, scheduled, and automatic migration); (3) applications and future development of the supercomputing internet in sectors such as oceanography, remote sensing, digital government, healthcare, and education.

The platform is currently being piloted in Shandong, where it connects 16 cities with core nodes in Jinan and Qingdao, integrates 28 compute clusters and 45 storage systems, and exposes around 130 APIs for access via the command line, a client, and a service portal. All storage access is routed through an Alluxio master, and distributed master deployment is planned.
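To make the idea of routing many storage systems through one entry point concrete, the following is a minimal sketch of a mount table that maps paths in a single namespace onto underlying storage systems by longest-prefix match. It is an illustration only, not Alluxio's API and not the platform's actual implementation; the class names, mount points, and backend URIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Mount:
    """One underlying storage system attached to the unified namespace."""
    mount_point: str  # path in the unified namespace (hypothetical)
    backend_uri: str  # address of the underlying storage (hypothetical)


def _normalize(path: str) -> str:
    """Return an absolute path with no trailing slash ("/" for the root)."""
    return "/" + path.strip("/")


class MountTable:
    """Illustrative mount table; real systems also handle auth, caching, etc."""

    def __init__(self) -> None:
        self._mounts: Dict[str, Mount] = {}

    def mount(self, mount_point: str, backend_uri: str) -> None:
        mp = _normalize(mount_point)
        self._mounts[mp] = Mount(mp, backend_uri.rstrip("/"))

    def resolve(self, path: str) -> Tuple[Mount, str]:
        """Map a unified path to (mount, backend path) via longest-prefix match."""
        path = _normalize(path)
        best: Optional[Mount] = None
        for mp, candidate in self._mounts.items():
            base = "" if mp == "/" else mp
            if path == mp or path.startswith(base + "/"):
                if best is None or len(mp) > len(best.mount_point):
                    best = candidate
        if best is None:
            raise KeyError(f"no storage system mounted for {path}")
        base = "" if best.mount_point == "/" else best.mount_point
        return best, best.backend_uri + path[len(base):]


table = MountTable()
table.mount("/jinan/lustre", "lustre://jinan-hpc/scratch")  # made-up backend
table.mount("/jinan", "s3://jinan-archive")                 # made-up backend
_, target = table.resolve("/jinan/lustre/job1/out.dat")
print(target)
```

With a table like this, users and jobs see one path space regardless of which storage system holds the data, which is the property the platform relies on when it fronts dozens of storage systems.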

Key technical achievements include unified storage auto‑mount, multi‑mode data access, user‑level data isolation, and optimized data migration based on intelligent models. The system supports real‑time migration for user‑specified transfers, scheduled migration for large edge datasets, and rule‑based automatic migration for compute‑storage separation scenarios.
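The choice among the three migration strategies can be sketched as a small decision function. This is a hedged illustration under stated assumptions: the request fields, the size threshold, and the order in which the rules are checked are hypothetical, since the talk summary does not specify how the platform decides.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    REAL_TIME = "real_time"  # user-specified transfers
    SCHEDULED = "scheduled"  # large datasets at edge sites
    AUTOMATIC = "automatic"  # rule-based, compute-storage separation


@dataclass
class TransferRequest:
    size_gb: float
    user_initiated: bool  # the user explicitly asked for this transfer
    source_is_edge: bool  # the data currently lives at an edge site


# Hypothetical cutoff for a "large" dataset; the source gives no figure.
LARGE_DATASET_GB = 1024.0


def choose_strategy(req: TransferRequest) -> Strategy:
    """Pick a migration strategy. The rule order here is an assumption."""
    if req.source_is_edge and req.size_gb >= LARGE_DATASET_GB:
        # Large edge datasets are deferred to a scheduled window.
        return Strategy.SCHEDULED
    if req.user_initiated:
        # Explicit user requests are served immediately.
        return Strategy.REAL_TIME
    # Everything else is left to rule-based automatic migration.
    return Strategy.AUTOMATIC


request = TransferRequest(size_gb=5.0, user_initiated=True, source_is_edge=False)
print(choose_strategy(request).value)
```

Checking the large-edge-dataset rule first means a user request for a very large edge dataset is scheduled rather than run immediately; a real deployment could reasonably order the rules differently.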

Future work aims to distribute the Alluxio master nodes, implement unified scheduling, improve data prefetching and caching, and develop hierarchical storage. The platform has already been applied in ocean‑atmosphere coupled modeling, remote sensing data processing, digital government services, and smart campus projects.

Tags: Cloud Computing, High Performance Computing, Data Flow, Alluxio, Unified Storage, Supercomputing