
Fundamentals of Data Middle Platform: Logic, Principles, and Practice

This article explains what a data middle platform is, why organizations need it, its core principles, technical architecture, and practical implementation guidelines, highlighting how it solves issues like inconsistent metrics, duplicate data construction, low query efficiency, poor data quality, and high development costs.

DataFunTalk

Feb 19, 2022


Introduction

The term “data middle platform” has been popular since 2015, and in 2020 the debate at Alibaba over dismantling the middle platform drew fresh attention to it. This talk explores the concept from both a popular‑science and an enterprise‑practice perspective.

What Is a Data Middle Platform?

The middle platform idea was inspired by Supercell’s small‑team, high‑revenue model. A data middle platform sits between data sources and business applications and provides shared data services across the organization. Alibaba defines it as a combination of methodology, organization, and tools: a methodology of OneID + OneModel + OneService; a talent structure that includes data product managers, data engineers, and data scientists; and a suite of tools for data collection, construction, management, and service.

Why Is a Data Middle Platform Needed?

Inconsistent metric definitions across thousands of indicators.

Duplicate data construction by different teams or projects.

Low data‑retrieval efficiency due to fragmented tables and layers.

Poor data quality caused by lack of end‑to‑end lineage.

High construction and maintenance costs.

These problems can be mitigated by adopting a unified data middle platform.
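The first two problems can be made concrete with a small check over metric definitions gathered from several teams. This is a minimal sketch, not part of the original talk: the team names, metric names, and formulas are hypothetical, and a real platform would read them from a metadata store rather than a hard-coded list.

```python
from collections import defaultdict

# Hypothetical metric definitions collected from different teams.
# All names and formulas here are made up for illustration.
definitions = [
    {"team": "marketing", "metric": "active_users",
     "formula": "count(distinct user_id) where logins >= 1"},
    {"team": "growth", "metric": "active_users",
     "formula": "count(distinct user_id) where orders >= 1"},
    {"team": "finance", "metric": "gmv", "formula": "sum(order_amount)"},
    {"team": "sales", "metric": "gmv", "formula": "sum(order_amount)"},
]


def find_conflicts(defs):
    """Return metrics that share a name but have more than one formula."""
    formulas_by_metric = defaultdict(dict)
    for d in defs:
        formulas_by_metric[d["metric"]].setdefault(d["formula"], []).append(d["team"])
    return {m: f for m, f in formulas_by_metric.items() if len(f) > 1}


def find_duplicates(defs):
    """Return (metric, formula) pairs built independently by several teams."""
    teams_by_definition = defaultdict(list)
    for d in defs:
        teams_by_definition[(d["metric"], d["formula"])].append(d["team"])
    return {k: teams for k, teams in teams_by_definition.items() if len(teams) > 1}


conflicts = find_conflicts(definitions)    # inconsistent definitions
duplicates = find_duplicates(definitions)  # duplicate construction
```

Here `active_users` is flagged as a conflict because two teams compute it differently, while `gmv` is flagged as a duplicate because two teams maintain the same definition separately.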

Principles of a Data Middle Platform

Allocate core resources to core projects rather than running a “horse race” in which several teams compete on the same problem.

Build a generic platform instead of business‑specific “BP” solutions.

Avoid short‑term, rapid‑change tactics; focus on steady, long‑term development.

Technical Principles

The platform typically follows a “three‑horizontal, one‑vertical” architecture. The three horizontal layers are data ingestion, data development, and data application. The vertical is data management, which cuts across all three layers and covers metadata, resource, and asset management, as well as governance and security.
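One way to picture this layout is as three pipeline stages that all report to a shared management component. The sketch below is illustrative only: the `Management` class, the function names, and the table names (`ods_orders`, `dws_sales_by_item`) are assumptions, and the only management concern it models is lineage.

```python
from dataclasses import dataclass, field


@dataclass
class Management:
    """The 'vertical': bookkeeping shared by every horizontal layer."""
    lineage: list = field(default_factory=list)

    def record(self, layer, source, target):
        self.lineage.append((layer, source, target))


def ingest(raw_rows, mgmt):
    """Horizontal 1: land source data unchanged."""
    mgmt.record("ingestion", "orders_source", "ods_orders")
    return list(raw_rows)


def develop(rows, mgmt):
    """Horizontal 2: clean and aggregate into a reusable table."""
    mgmt.record("development", "ods_orders", "dws_sales_by_item")
    totals = {}
    for row in rows:
        if row["amount"] > 0:  # drop rows with non-positive amounts
            totals[row["item"]] = totals.get(row["item"], 0) + row["amount"]
    return totals


def serve(totals, item, mgmt):
    """Horizontal 3: expose the result to an application."""
    mgmt.record("application", "dws_sales_by_item", f"api:sales/{item}")
    return totals.get(item, 0)


mgmt = Management()
rows = [
    {"item": "A", "amount": 10},
    {"item": "A", "amount": 5},
    {"item": "B", "amount": -1},
]
result = serve(develop(ingest(rows, mgmt), mgmt), "A", mgmt)
layers = [layer for layer, _, _ in mgmt.lineage]
```

Because every stage writes to the same `Management` instance, the end‑to‑end lineage can be read from one place, which is the point of treating management as a vertical rather than as a fourth layer.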

Practical Implementation

Common pain points in OneData include unclear data sources, inconsistent metric definitions, and lack of standards. Solutions involve:

Standardizing naming conventions for atomic and derived metrics.

Defining clear production, review, authorization, and governance processes.

Mapping business lines to domains, topics, and metric dimensions.

Example: In an e‑commerce scenario, the transaction domain defines atomic metrics (e.g., sales count) and derived metrics (e.g., 7‑day hot‑item rate) with consistent naming and dimension definitions.
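A naming convention of this kind can be enforced in code. The following is a minimal sketch under stated assumptions: the `domain_modifier_atomic_period` name order, the `trade` domain code, and the `sales_cnt` abbreviation are hypothetical choices, not the convention used in the talk.

```python
import re
from dataclasses import dataclass

# Assumed rule: lowercase words joined by underscores.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
# Assumed rule: a period is a number plus a unit, e.g. "7d" or "4w".
PERIOD_PATTERN = re.compile(r"^\d+[dwm]$")


@dataclass(frozen=True)
class AtomicMetric:
    domain: str  # e.g. "trade" for the transaction domain
    name: str    # e.g. "sales_cnt" for sales count

    def __post_init__(self):
        for part in (self.domain, self.name):
            if not NAME_PATTERN.match(part):
                raise ValueError(f"invalid name part: {part!r}")


@dataclass(frozen=True)
class DerivedMetric:
    atomic: AtomicMetric
    period: str         # time window, e.g. "7d"
    modifier: str = ""  # optional business qualifier, e.g. "hot_item"

    def __post_init__(self):
        if not PERIOD_PATTERN.match(self.period):
            raise ValueError(f"invalid period: {self.period!r}")
        if self.modifier and not NAME_PATTERN.match(self.modifier):
            raise ValueError(f"invalid modifier: {self.modifier!r}")

    @property
    def full_name(self):
        parts = [self.atomic.domain, self.modifier, self.atomic.name, self.period]
        return "_".join(p for p in parts if p)


sales = AtomicMetric("trade", "sales_cnt")
hot_item_sales_7d = DerivedMetric(sales, "7d", "hot_item")
name = hot_item_sales_7d.full_name  # "trade_hot_item_sales_cnt_7d"
```

Deriving every name from the same atomic metric, period, and modifier means two teams that describe the same metric get the same name, and a free‑form name such as `"Sales Count"` is rejected when the metric is created.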

Summary

A data middle platform unifies data collection, computation, storage, and processing while standardizing metrics.

It addresses metric inconsistency, duplicate construction, low query efficiency, data quality issues, and high costs.

It is suitable for companies with multiple business lines, cost‑reduction or efficiency‑improvement needs, and a willingness to invest in long‑term development.

Key organizational and methodological principles include centralized resources, generic platform design, and a patient, steady‑state approach.

