
Understanding Data Middle Platform: Concepts, Architecture, and Real‑Time Implementation

The article explains the data middle platform concept, its distinction from traditional big‑data platforms, the architectural principles behind Alibaba's implementation, and how real‑time ingestion, processing, and service layers enable efficient, collaborative, and scalable data-driven applications.

Architecture Digest

The data middle platform is often described as the next step for big data. The idea originated in Alibaba's "big middle platform, small front end" strategy in 2015 and was revived by Tencent in 2018.

Although the term is widely discussed, it is often misunderstood. A data middle platform is not a platform or system that can be bought off the shelf; it is a middle-layer concept that bridges data development and application development.

Data Middle Platform Is Not a Big Data Platform!

It is not a product; it is a technical middle layer. Gartner's Pace-Layered Application Strategy helps clarify its role: core data models change slowly, while business demands evolve rapidly, and the middle platform exists to absorb that mismatch.

Key challenges addressed include:

Efficiency: Reducing the long lead time for adding reports or real-time recommendations.

Collaboration: Avoiding duplicated data development across teams.

Capability: Providing specialized data engineering resources for data-centric tasks.

The solution aggregates and governs cross‑domain data, exposing it as services (Data API) rather than raw databases, thus decoupling front‑end development speed from back‑end data changes.
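The decoupling described above can be illustrated with a minimal sketch. Everything here is hypothetical: the in-memory table, its column names, and the `get_user_spend` function are assumptions made for illustration, not part of any real platform. The point is only that callers depend on a stable return type while the storage layout behind it stays private.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical physical table as the data team stores it today.
# Column names are illustrative and not taken from any real schema.
_ORDERS: List[Dict[str, object]] = [
    {"uid": "u1", "amt_cents": 1250, "biz": "taobao"},
    {"uid": "u1", "amt_cents": 800, "biz": "tmall"},
    {"uid": "u2", "amt_cents": 4300, "biz": "hema"},
]


@dataclass(frozen=True)
class UserSpend:
    """Stable contract that front-end applications code against."""
    user_id: str
    total_amount: float  # currency units, not cents
    order_count: int


def get_user_spend(user_id: str) -> UserSpend:
    """Data API: callers never see table or column names."""
    rows = [r for r in _ORDERS if r["uid"] == user_id]
    total_cents = sum(int(r["amt_cents"]) for r in rows)
    return UserSpend(
        user_id=user_id,
        total_amount=total_cents / 100,
        order_count=len(rows),
    )


if __name__ == "__main__":
    print(get_user_spend("u1"))
```

If the data team later renames `amt_cents` or splits the table by business line, only the body of `get_user_spend` changes; applications that consume `UserSpend` are unaffected.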

Alibaba’s Data Middle Platform Details

Business‑wide Data Landscape

Data is collected from various business lines (e.g., Taobao, Tmall, Hema) into a unified "OneData" layer, forming public data centers for consumer, enterprise, and content domains, which are then processed and served via the "OneService" middleware.

Three Core Systems

Alibaba’s cloud data middle platform is built on three pillars:

OneData: Standardizes data as assets.

OneEntity: Unifies entities to eliminate data silos.

OneService: Provides reusable data services.
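To make the OneEntity idea concrete, the sketch below links identifiers from different business lines into one entity using a union-find structure. This is one common technique for identity resolution and is an assumption here; the article does not say how Alibaba implements it. The class name, identifier formats, and sample values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class EntityResolver:
    """Union-find over identifiers; linked identifiers share one entity."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, x: str) -> str:
        self._parent.setdefault(x, x)
        root = x
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path at the root.
        while self._parent[x] != root:
            self._parent[x], x = root, self._parent[x]
        return root

    def link(self, a: str, b: str) -> None:
        """Record that identifiers a and b belong to the same entity."""
        ra, rb = self._find(a), self._find(b)
        if ra != rb:
            # Keep the lexicographically smaller root so IDs are deterministic.
            low, high = sorted((ra, rb))
            self._parent[high] = low

    def entity_id(self, x: str) -> str:
        """Return the canonical identifier of the entity that owns x."""
        return self._find(x)

    def groups(self) -> List[Set[str]]:
        """Return all known identifiers grouped by entity."""
        out: Dict[str, Set[str]] = defaultdict(set)
        for x in list(self._parent):
            out[self._find(x)].add(x)
        return list(out.values())


if __name__ == "__main__":
    resolver = EntityResolver()
    resolver.link("taobao:alice01", "phone:555-0100")
    resolver.link("tmall:a.li", "phone:555-0100")
    resolver.link("hema:bob", "phone:555-0199")
    print(resolver.groups())
```

Two accounts that share a phone number end up with the same `entity_id`, so downstream models can join on one key instead of reconciling per-business-line IDs.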

Six Data‑Technology Domains

The platform originally defined six domains: data model, storage governance, data quality, security & permission, platform operation, and R&D engineering. Over time, these evolved into broader areas such as data asset management and data trust, with ongoing work in model and quality domains and emerging intelligent black‑box capabilities.

How to Build a Real‑Time Data Middle Platform

The logical architecture of a real-time implementation has five layers, described below, with particular emphasis on the real-time model layer.

1. Real‑Time Ingestion

Each data type uses a suitable ingestion method: Flume + Kafka is the default path, with file and database connectors available for other sources.
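The routing rule above can be sketched as a small lookup with a default. The channel names and source kinds below are hypothetical labels chosen for illustration; the sketch does not call Flume or Kafka and only shows how a source type might select an ingestion path.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Source:
    name: str
    kind: str  # e.g. "log", "file", "database" (illustrative values)


# Hypothetical routing table: source kinds with a dedicated connector.
CHANNELS: Dict[str, str] = {
    "file": "file-connector",
    "database": "database-connector",
}
# Anything without a dedicated connector takes the default path.
DEFAULT_CHANNEL = "flume+kafka"


def choose_channel(source: Source) -> str:
    """Pick the ingestion channel for one source."""
    return CHANNELS.get(source.kind, DEFAULT_CHANNEL)


def plan_ingestion(sources: Iterable[Source]) -> Dict[str, List[str]]:
    """Group source names by the channel that will ingest them."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for source in sources:
        plan[choose_channel(source)].append(source.name)
    return dict(plan)


if __name__ == "__main__":
    print(plan_ingestion([
        Source("app-clicks", "log"),
        Source("daily-export.csv", "file"),
        Source("orders-db", "database"),
    ]))
```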

2. Computing Framework

A Kappa architecture treats all data as streams, so a single pipeline serves both batch and stream processing. Flink fits this role because it offers high throughput, low latency, and unified batch and stream processing.
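The core Kappa idea, one code path for both live processing and reprocessing, can be shown without any streaming framework. The sketch below is plain Python, not Flink: `EventLog` is an in-memory stand-in for an append-only log such as a Kafka topic, and `PageViewJob` is a hypothetical job that counts page views per one-minute window. A "batch" recomputation is simply a replay of the log from offset 0 through the same job.

```python
from collections import Counter
from typing import Iterator, List, Tuple

Event = Tuple[int, str]  # (event time in seconds, page name)


class EventLog:
    """In-memory stand-in for an append-only log such as a Kafka topic."""

    def __init__(self) -> None:
        self._records: List[Event] = []

    def append(self, event: Event) -> int:
        self._records.append(event)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        return iter(self._records[offset:])


class PageViewJob:
    """Counts page views per tumbling window, one event at a time."""

    def __init__(self, window_s: int = 60) -> None:
        self.window_s = window_s
        self.counts: Counter = Counter()

    def process(self, event: Event) -> None:
        ts, page = event
        window_start = ts - ts % self.window_s
        self.counts[(window_start, page)] += 1


def replay(log: EventLog, window_s: int = 60) -> Counter:
    """Recompute results from scratch by replaying the whole log."""
    job = PageViewJob(window_s)
    for event in log.read_from(0):
        job.process(event)
    return job.counts


if __name__ == "__main__":
    log, live = EventLog(), PageViewJob()
    for e in [(5, "home"), (42, "home"), (61, "cart"), (75, "home")]:
        log.append(e)
        live.process(e)  # live path
    print(live.counts == replay(log))  # replay path uses the same logic
```

Because the live job and the replay share `PageViewJob.process`, a logic change is deployed once and historical results are rebuilt by replaying the log, rather than by maintaining a separate batch pipeline.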

3. Real‑Time Model

Like data-warehouse models, real-time models are business-oriented. They consist of a DWD layer, which holds standardized and filtered data, and a DW layer, which holds dynamic, event, and time-series models. Each model type is stored in a suitable system: Kafka/HBase for dynamic models, MQ/Redis for event models, and HBase/TSDB for time-series models.
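The layering above can be sketched in a few lines. The interpretation of the three DW model types is an assumption made for illustration: the dynamic model is taken to mean the latest state per key, the event model a stream of selected business events, and the time-series model per-minute aggregates. Plain dictionaries and lists stand in for the storage systems, and all field names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Record = Dict[str, object]


def to_dwd(raw: Record) -> Optional[Record]:
    """DWD step: standardize fields and filter out unusable records."""
    user = str(raw.get("user", "")).strip().lower()
    if not user:
        return None  # filtered: no user identifier
    return {
        "user": user,
        "type": str(raw["type"]),
        "ts": int(raw["ts"]),
        "value": float(raw["value"]),
    }


class RealTimeModels:
    """DW step: three views maintained from the same DWD stream."""

    def __init__(self) -> None:
        self.dynamic: Dict[str, Record] = {}        # latest state per user
        self.events: List[Record] = []              # payment events only
        self.series: Dict[int, float] = defaultdict(float)  # minute -> sum

    def apply(self, row: Record) -> None:
        self.dynamic[row["user"]] = {"last_type": row["type"], "last_ts": row["ts"]}
        if row["type"] == "pay":
            self.events.append(row)
        minute_start = int(row["ts"]) // 60 * 60
        self.series[minute_start] += float(row["value"])


def build_models(raw_records: List[Record]) -> RealTimeModels:
    models = RealTimeModels()
    for raw in raw_records:
        row = to_dwd(raw)
        if row is not None:
            models.apply(row)
    return models


if __name__ == "__main__":
    sample = [
        {"user": "u1", "type": "click", "ts": 100, "value": 1},
        {"user": "", "type": "click", "ts": 101, "value": 1},
        {"user": "U2 ", "type": "pay", "ts": 130, "value": 25},
        {"user": "u1", "type": "pay", "ts": 165, "value": 40},
    ]
    m = build_models(sample)
    print(m.dynamic, len(m.events), dict(m.series))
```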

4. Real‑Time Service

A unified data‑development platform provides graphical, workflow‑driven tools to manage both offline and real‑time data, avoiding isolated stream‑processing scripts and reducing development overhead.
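A workflow-driven tool ultimately has to run tasks in dependency order, whether the tasks are offline or real-time. The sketch below shows that core step using Python's standard `graphlib.TopologicalSorter` (Python 3.9+). The task names and the `run_workflow` helper are hypothetical; a real platform would add scheduling, retries, and monitoring on top.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_workflow(
    tasks: Dict[str, Callable[[], None]],
    deps: Dict[str, Set[str]],
) -> List[str]:
    """Run every task after the tasks it depends on; return the run order."""
    # Map each task to its predecessors; tasks without an entry have none.
    graph = {name: deps.get(name, set()) for name in tasks}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        tasks[name]()
    return order


if __name__ == "__main__":
    executed: List[str] = []

    def step(name: str) -> Callable[[], None]:
        return lambda: executed.append(name)

    # Hypothetical workflow mixing a stream source and an offline table.
    names = ["ingest_stream", "load_offline_dim", "join", "publish_api"]
    tasks = {n: step(n) for n in names}
    deps = {
        "join": {"ingest_stream", "load_offline_dim"},
        "publish_api": {"join"},
    }
    print(run_workflow(tasks, deps))
```

Declaring dependencies as data rather than as hand-written scripts is what lets a graphical editor generate, validate, and reorder the workflow.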

5. Real‑Time Application

Rapid orchestration shrinks development cycles from weeks to days for high-impact real-time services. Alibaba processes exabyte-scale data and handles 94 million events per second during peak events, with an end-to-end latency of 2.5 seconds.

Overall, the growing demand for real‑time data capabilities makes building a real‑time data middle platform essential for modern enterprises.

Author: 数据分析不是个事儿 https://www.jianshu.com/p/05a8db84e454