
How Alibaba’s Qi Tian Platform Secures Large-Scale Cloud Networks

This article examines Alibaba Cloud’s Qi Tian integrated operation‑management platform, detailing the challenges of massive cloud network management and the innovative data‑fusion, automated change, intent‑aware monitoring, and multi‑plane self‑healing technologies that enable secure, high‑performance operation at million‑device scale.

Alibaba Cloud Infrastructure

Introduction

To implement the Ministry of Industry and Information Technology’s 2025 network security plan, the China Academy of Information and Communications Technology hosted a cloud‑service security conference in Hangzhou. There, Qi Tian, Alibaba Cloud’s integrated platform for ultra‑large‑scale cloud computing networks, won the “Cloud Service Operation Security Innovation Award”, and its team leader received a “Full‑Stack” expert certification.

Core Challenges

Large‑scale cloud networks must handle massive data volumes, inventories of millions of devices, high‑frequency topology changes, and heterogeneous equipment while maintaining real‑time monitoring and rapid fault recovery. Four challenges follow from this:

Balancing fine‑grained decision data needs with storage and compute costs.

Managing millions of devices with limited human resources.

Meeting sub‑millisecond monitoring requirements amid highly dynamic network topologies.

Detecting and repairing faults across diverse, multi‑plane device architectures efficiently.

Key Technologies

1. High‑Performance Data Management through Intelligence‑Fusion

Qi Tian unifies multi‑modal network data storage, employs a stateless cloud‑native analysis engine, and builds spatiotemporal knowledge graphs, achieving petabyte‑scale storage, million‑level virtual network modeling, and millisecond‑level data analysis.
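The article does not describe how the spatiotemporal knowledge graph is stored. As a minimal sketch of the general idea, assuming each relationship simply carries a validity interval, the following Python models a topology that can be queried "as of" a given time. `Link`, `TemporalTopology`, and the device names are hypothetical and are not part of Qi Tian.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Link:
    """A directed link valid during [start, end); end=None means still active."""
    src: str
    dst: str
    start: float
    end: Optional[float] = None

    def active_at(self, t: float) -> bool:
        return self.start <= t and (self.end is None or t < self.end)


class TemporalTopology:
    """Hypothetical time-versioned topology store (illustrative only)."""

    def __init__(self) -> None:
        self._links: List[Link] = []

    def add_link(self, src: str, dst: str, start: float,
                 end: Optional[float] = None) -> None:
        self._links.append(Link(src, dst, start, end))

    def neighbors(self, node: str, t: float) -> Set[str]:
        """Return the nodes that `node` was linked to at time t."""
        return {l.dst for l in self._links
                if l.src == node and l.active_at(t)}


# Example: a virtual machine moves from one virtual switch to another at t=100.
topo = TemporalTopology()
topo.add_link("vm-a", "vswitch-1", start=0.0, end=100.0)
topo.add_link("vm-a", "vswitch-2", start=100.0)
```

Keeping the interval on each edge lets a fault investigation ask what the topology looked like when an alert fired, rather than what it looks like now. A linear scan is used here only for brevity; a store at the scale the article describes would need indexing.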

2. Unattended Multi‑Tenant Dynamic Change

By orchestrating ultra‑high‑dimensional tasks, leveraging micro‑cluster caching, and applying collaborative multi‑metric evaluation, the system performs zero‑loss, zero‑downtime changes on million‑scale devices, dramatically reducing manual effort.
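One common way to automate changes safely is to roll them out in batches and gate each batch on several health metrics at once. The sketch below illustrates that pattern under stated assumptions: the metric names, thresholds, and the `apply`, `rollback`, and `measure` callbacks are hypothetical, and the article does not confirm that Qi Tian works this way.

```python
from typing import Callable, Dict, List

# Hypothetical per-metric upper limits; real values would be workload-specific.
THRESHOLDS: Dict[str, float] = {"packet_loss": 0.001, "latency_ms": 5.0}


def healthy(metrics: Dict[str, float]) -> bool:
    """A batch passes only if every metric is present and within its limit."""
    return all(metrics.get(name, float("inf")) <= limit
               for name, limit in THRESHOLDS.items())


def staged_change(devices: List[str],
                  apply: Callable[[str], None],
                  rollback: Callable[[str], None],
                  measure: Callable[[List[str]], Dict[str, float]],
                  batch_size: int = 2) -> List[str]:
    """Apply a change batch by batch.

    At the first unhealthy batch, undo that batch and stop.
    Returns the devices that remain changed.
    """
    changed: List[str] = []
    for i in range(0, len(devices), batch_size):
        batch = devices[i:i + batch_size]
        for device in batch:
            apply(device)
        if not healthy(measure(batch)):
            for device in reversed(batch):
                rollback(device)
            break
        changed.extend(batch)
    return changed
```

Requiring all metrics to pass, instead of any single one, is what makes the evaluation "multi‑metric": a change that keeps latency flat but raises packet loss is still stopped. This sketch limits the blast radius to one batch; it does not by itself guarantee zero loss.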

3. Intent‑Aware Adaptive High‑Precision Monitoring

Using user‑intent‑driven virtual network measurement and machine‑learning prediction, the platform attains packet‑level accuracy, millisecond timing, instance‑level traffic visibility, and user‑level alert precision.
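To make the idea of intent‑driven, adaptive measurement concrete, here is a small sketch in which a user intent is reduced to a latency target for one path, and the probing interval tightens as predicted latency approaches that target. The exponentially weighted moving average stands in for the machine‑learning predictor, which the article does not specify; `Intent`, `AdaptiveProbe`, and all parameter values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intent:
    """Hypothetical user intent: a latency target for one virtual-network path."""
    path: str
    latency_slo_ms: float


class AdaptiveProbe:
    """Probe more often when predicted latency nears the intent's target."""

    def __init__(self, intent: Intent, base_interval_ms: float = 1000.0,
                 min_interval_ms: float = 10.0, alpha: float = 0.5) -> None:
        self.intent = intent
        self.base_interval_ms = base_interval_ms
        self.min_interval_ms = min_interval_ms
        self.alpha = alpha  # EWMA weight given to the newest sample
        self.predicted_ms: Optional[float] = None
        self.interval_ms = base_interval_ms

    def observe(self, latency_ms: float) -> float:
        """Update the prediction and return the next probe interval in ms."""
        if self.predicted_ms is None:
            self.predicted_ms = latency_ms
        else:
            self.predicted_ms = (self.alpha * latency_ms
                                 + (1 - self.alpha) * self.predicted_ms)
        # Tighten probing once the prediction exceeds 80% of the target.
        if self.predicted_ms > 0.8 * self.intent.latency_slo_ms:
            self.interval_ms = max(self.min_interval_ms, self.interval_ms / 2)
        else:
            self.interval_ms = self.base_interval_ms
        return self.interval_ms


probe = AdaptiveProbe(Intent(path="vpc-1:a->b", latency_slo_ms=10.0))
```

The design choice illustrated is that measurement cost follows user intent: paths with no stated target, or comfortably within it, are probed at the base rate, and only paths at risk receive dense probing.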

4. Multi‑Plane Anomaly Detection and Full‑Link Self‑Healing

Combining formal verification, visual diagnostics, and a trained anomaly library, the system rapidly classifies and isolates faults across physical, virtual, and tenant planes, employing programmable NIC back‑pressure and software‑controlled traffic scheduling for swift recovery.
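The article does not say how the anomaly library is queried. As one simple illustration, assuming a library of named symptom signatures tagged by plane, the sketch below classifies an observed symptom set by Jaccard similarity against each signature. The signature names, symptoms, and the 0.5 cutoff are invented for the example, and set matching is a stand‑in for whatever trained model the platform uses.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

# Hypothetical library: signature name -> (plane, expected symptoms).
LIBRARY: Dict[str, Tuple[str, FrozenSet[str]]] = {
    "link_flap": ("physical", frozenset({"crc_errors", "port_down"})),
    "vswitch_overload": ("virtual", frozenset({"cpu_high", "queue_drops"})),
    "tenant_acl_block": ("tenant", frozenset({"syn_timeout", "acl_deny"})),
}


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify(symptoms: Iterable[str],
             min_score: float = 0.5) -> Optional[Tuple[str, str]]:
    """Return (signature, plane) for the best match, or None if too weak."""
    observed = frozenset(symptoms)
    name, (plane, signature) = max(
        LIBRARY.items(), key=lambda item: jaccard(observed, item[1][1]))
    if jaccard(observed, signature) < min_score:
        return None
    return name, plane
```

Tagging each signature with its plane is what lets a match drive isolation: a `physical` result points at links and ports, while a `tenant` result points at customer configuration, so the recovery action can be scoped accordingly.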

Conclusion & Outlook

After a decade of development, Qi Tian now powers Alibaba Cloud’s commercial network services for millions of customers, supporting major events such as the 20th Party Congress and the Paris Olympics. With over 40 patents and 20 high‑impact papers, the platform has been recognized by Gartner for unique network performance visualization. Future work will deepen the “intelligence‑fusion, operation‑as‑one” strategy, integrating AI to achieve autonomous, closed‑loop network management from perception to self‑optimizing policy execution.
