Master Data Management Architecture and Practices for Baidu Smart Mini Programs
This article gives an overview of master data management (MDM) concepts and maturity levels and of the challenges faced by Baidu smart mini-programs. It then walks through a practical architecture design covering domain modeling, high-availability microservice implementation, performance optimization, and data synchronization, and closes with future extensions and team capability building.
1. Master Data Concept
Master data (MD) is data shared across systems, such as customers, accounts, and organizations; managing it centrally is meant to solve data-sharing and consistency problems. The maturity model has six levels, of which four (Level 0 to Level 3) are described here: Level 0 has no MDM, Level 1 provides manual list registration, Level 2 enables unified access through interfaces, and Level 3 introduces a centralized bus for strong data governance.
Why Master Data? Four main problems are identified: fragmented and redundant data, inconsistent data that is hard to reconcile, inefficient collaboration between business teams, and data lost through frequent business changes. All four stem from a lack of top-down data governance.
2. Master Data Architecture Practice Summary
2.1 Business Background Analysis
Rapid growth of Baidu mini‑program modules creates high data change and retrieval demand.
SLA standards across services are inconsistent, hindering high‑availability delivery.
Mesh-style data storage, where each system keeps and exchanges its own copies, leads to redundancy, inconsistency, security risks, and data islands.
Cross‑system data interaction is difficult due to divergent data models.
2.2 Overall Design Approach
The design follows an analysis‑to‑solution flow: use Data Flow Diagrams (DFD) for requirement analysis, apply event‑storming for domain boundary identification, and then model domains (Customer, Product, User, Base Data) with separate sub‑services.
2.3 Detailed Design and Implementation
Transaction/Compensation: limit transaction size, commit in batches of 100-500 rows, and handle failures with retry and compensation.
High-Performance Read/Write: multi-level caching (distributed plus local), database optimizations (indexing, pagination, sub-query tuning), and Elasticsearch for fuzzy and multi-table searches.
Availability: microservice architecture with service governance, load balancing, retry, rate limiting, circuit breaking, and automatic self-healing after faults.
Process Mechanisms: a dedicated MDM team for design review, coding standards, code reviews, comprehensive system testing, gray release with automated rollback, and operational monitoring.
Real-Time Data Sync: binlog listening, concurrent MQ writes, a version-controlled distribution service, and compensation mechanisms to ensure timely and reliable data propagation.
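The Transaction/Compensation guidance above can be illustrated with a minimal sketch. It is not the article's implementation: it assumes an SQLite table named customer, a default batch size of 200 (inside the 100-500 range quoted above), and a simple retry loop; the helper names chunked and write_in_batches are hypothetical.

```python
import sqlite3
import time
from typing import Iterator, List, Sequence, Tuple

Row = Tuple[str, str]  # (id, name); an assumed, simplified master-data row


def chunked(rows: Sequence[Row], size: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_in_batches(
    conn: sqlite3.Connection,
    rows: Sequence[Row],
    batch_size: int = 200,
    max_attempts: int = 3,
) -> List[Sequence[Row]]:
    """Commit rows in small transactions and retry each failed batch.

    Returns the batches that still failed after `max_attempts`, so a
    separate compensation job can pick them up later.
    """
    failed: List[Sequence[Row]] = []
    for batch in chunked(rows, batch_size):
        for attempt in range(1, max_attempts + 1):
            try:
                # `with conn` commits on success and rolls back on an exception,
                # so each batch is its own small transaction.
                with conn:
                    conn.executemany(
                        "INSERT OR REPLACE INTO customer (id, name) VALUES (?, ?)",
                        batch,
                    )
                break
            except sqlite3.OperationalError:
                if attempt == max_attempts:
                    failed.append(batch)  # hand over to compensation
                else:
                    time.sleep(0.01 * attempt)  # short linear backoff
    return failed
```

Keeping each commit small bounds lock time and rollback cost, and returning the failed batches rather than raising lets the caller decide how to compensate.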
The architecture diagram (not shown) illustrates the master‑data service deployment and data‑sync distribution model.
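As a rough illustration of the version-controlled distribution and compensation described under Real-Time Data Sync, the sketch below shows why a per-entity version check makes re-delivery safe. Everything in it is an assumption for illustration: the ChangeEvent and Subscriber types, the distribute helper, and the use of ConnectionError to stand in for a failed delivery are hypothetical and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ChangeEvent:
    """Hypothetical change event derived from a binlog row."""
    entity_id: str
    version: int   # assumed to increase with every change to the entity
    payload: dict


@dataclass
class Subscriber:
    """Hypothetical downstream store keeping the latest version per entity."""
    rows: Dict[str, ChangeEvent] = field(default_factory=dict)

    def apply(self, event: ChangeEvent) -> bool:
        """Apply the event unless an equal or newer version is already stored."""
        current = self.rows.get(event.entity_id)
        if current is not None and current.version >= event.version:
            return False  # stale or duplicate delivery, safely ignored
        self.rows[event.entity_id] = event
        return True


def distribute(
    events: List[ChangeEvent],
    deliver: Callable[[ChangeEvent], None],
) -> List[ChangeEvent]:
    """Deliver each event; return the ones that failed for later compensation."""
    failed: List[ChangeEvent] = []
    for event in events:
        try:
            deliver(event)
        except ConnectionError:
            failed.append(event)
    return failed
```

In this sketch a compensation pass simply calls distribute again with the failed events. If version 1 of an entity failed while version 2 got through, the late re-delivery of version 1 is rejected by the version check, so the subscriber never moves backwards.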
3. Master Data Extension Thoughts
Beyond real‑time online services, master data can support data‑asset auditing, monitoring, and business profiling, improving decision‑making and reducing profiling costs.
3.2 Enhancing Team MDM Capability
Promote mandatory master‑data learning and alignment across business units.
Establish an independent, neutral master‑data team to ensure professional, standardized construction.
Strengthen model and architecture review, share summaries, and conduct regular knowledge sharing.
Overall, Baidu's master-data service, launched in 2019, supports over 9,000 QPS, achieves an SLA above 99.99%, and keeps data consistency at the four-nines level through monitoring and compensation mechanisms.
High Availability Architecture
Official account for High Availability Architecture.