Big Data 14 min read

Tencent Music's Data Asset Management and Governance Practices

The article details Tencent Music's data governance journey, describing the background of rapid resource growth, challenges in cost management, a multi‑layered governance methodology—including metadata, tiered storage, and a Lego metadata platform—and the resulting improvements in resource utilization and data quality.

DataFunSummit

Overview: This article shares Tencent Music's practical experience in data asset management and governance, highlighting how systematic data governance enables internal resource and cost management, leading to reduced expenses and increased efficiency.

1. Background of Data Governance

Rapid business growth at Tencent Music has caused a sharp increase in data, storage, and compute resources. Without governance, resource costs could become uncontrollable. The company follows a staged model where, after reaching the fifth stage of rapid growth, data governance begins to stabilize resource usage.

Key Challenges

Massive data volume involving many teams, leading to high communication costs.

Lack of data‑warehouse construction standards, resulting in siloed ("chimney") development.

Data developers often lack cost awareness, causing unnecessary storage waste.

Insufficient tooling for resource optimization and data quality monitoring.

2. Governance Solution and Practice

2.1 Macro Governance – Methodology: The team introduced a layered responsibility model. Resources are split first by business line (e.g., QQ Music and 全民K歌, known in English as WeSing), then by data domain (traffic, membership, live), then by department, and finally by storage mode. Each resulting slice is assigned to a specific owner or center.
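The layered split can be pictured as a path of four keys per asset, with cost rolled up at any level and ownership resolved from the most specific level that has an owner. The following is a minimal sketch under stated assumptions: the department names, storage modes, sizes, and owner names are all hypothetical, and the article does not describe how the real system stores this hierarchy.

```python
from collections import defaultdict

# Each row: (business_line, data_domain, department, storage_mode, size_tb).
# All values below are made-up illustration data, not figures from the talk.
ASSETS = [
    ("QQ Music", "traffic", "dept_a", "hot", 120.0),
    ("QQ Music", "membership", "dept_b", "hot", 30.0),
    ("全民K歌", "live", "dept_c", "cold", 50.0),
    ("全民K歌", "traffic", "dept_c", "hot", 25.0),
]

# Hypothetical ownership table keyed by a prefix of the four-level path.
OWNERS = {
    ("QQ Music", "traffic"): "owner_x",
    ("全民K歌",): "center_y",
}


def rollup(assets, depth):
    """Sum storage per path prefix; depth=1 is business line, depth=4 is storage mode."""
    totals = defaultdict(float)
    for *path, size_tb in assets:
        totals[tuple(path[:depth])] += size_tb
    return dict(totals)


def find_owner(path, owners):
    """Return the owner registered at the most specific level of the path, if any."""
    for depth in range(len(path), 0, -1):
        owner = owners.get(tuple(path[:depth]))
        if owner is not None:
            return owner
    return None


if __name__ == "__main__":
    print(rollup(ASSETS, 1))
    print(find_owner(("全民K歌", "live", "dept_c", "cold"), OWNERS))
```

Resolving owners by longest matching prefix means a business-line center acts as the fallback owner for any slice that has no narrower assignment.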

Data health is evaluated on two dimensions: whether each table's lifecycle (retention) setting is reasonable, and total table size. Storage tiering is based on access frequency, recent usage, and data value.

Table Classification (5 levels):

Disabled tables – long‑inactive.

Zombie tables – still updated, but never accessed and with no downstream dependents.

Normal tables – low access frequency and low fan‑out.

Core tables – high access frequency or high fan‑out.

Key tables – the most critical assets, with both high access frequency and high fan‑out.
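The five levels above can be expressed as a small rule-based classifier. This is only a sketch: the statistics collected per table, the 90-day window, and the numeric thresholds are assumptions made for illustration, since the article does not publish the real cut-offs.

```python
from dataclasses import dataclass
from enum import Enum


class TableLevel(Enum):
    DISABLED = "disabled"
    ZOMBIE = "zombie"
    NORMAL = "normal"
    CORE = "core"
    KEY = "key"


@dataclass
class TableStats:
    days_since_last_update: int  # days since the table was last written
    reads_in_window: int         # read count in the observation window
    downstream_count: int        # fan-out: number of dependent tables or jobs


# Hypothetical thresholds, chosen only for this example.
INACTIVE_DAYS = 90
HIGH_READS = 1000
HIGH_FANOUT = 10


def classify(stats: TableStats) -> TableLevel:
    recently_updated = stats.days_since_last_update <= INACTIVE_DAYS
    unused = stats.reads_in_window == 0 and stats.downstream_count == 0
    if unused:
        # Still written but never consumed is a zombie; neither written nor read is disabled.
        return TableLevel.ZOMBIE if recently_updated else TableLevel.DISABLED
    high_access = stats.reads_in_window >= HIGH_READS
    high_fanout = stats.downstream_count >= HIGH_FANOUT
    if high_access and high_fanout:
        return TableLevel.KEY
    if high_access or high_fanout:
        return TableLevel.CORE
    return TableLevel.NORMAL


if __name__ == "__main__":
    print(classify(TableStats(1, 0, 0)))
    print(classify(TableStats(1, 5000, 50)))
```

Checking the unused cases first keeps the zombie and disabled levels from being confused with low-traffic normal tables.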

2.2 Improving Resource Utilization: An analysis of about 70 application groups showed average daily utilization below 10%. With a resource time‑rental system that allocates compute resources dynamically by the hour, the team raised average utilization to about 80%.
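To see why hourly allocation helps, compare reserving a group's daily peak for all 24 hours with leasing only what each hour needs plus some headroom. The sketch below uses a made-up demand curve and a hypothetical 25% headroom; it is not the article's rental system, and the numbers it produces are not the article's measurements.

```python
def utilization(used, allocated):
    """Fraction of allocated resource-hours that were actually used."""
    total_allocated = sum(allocated)
    return sum(used) / total_allocated if total_allocated else 0.0


def static_allocation(demand):
    """Reserve the daily peak for every hour of the day."""
    peak = max(demand)
    return [peak] * len(demand)


def hourly_allocation(demand, headroom=0.25):
    """Lease each hour's demand plus a safety margin (headroom is an assumed parameter)."""
    return [d * (1 + headroom) for d in demand]


if __name__ == "__main__":
    # Hypothetical batch-style group: 100 cores for 2 hours, 5 cores for the other 22.
    demand = [100, 100] + [5] * 22
    print(round(utilization(demand, static_allocation(demand)), 3))
    print(round(utilization(demand, hourly_allocation(demand)), 3))
```

With a fixed headroom h, hourly leasing yields a utilization of 1 / (1 + h) by construction, so the choice of headroom directly sets the ceiling; a real scheduler would also have to deal with forecasting error and contention between groups.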

2.3 Tiered Storage: Data is divided into three categories:

Sample data – large volume, long retention; replica factor reduced from 3 to 2.

Streaming data – usually retained for up to 3 years; older data moved to cheaper cold storage.

Metric data – managed by lifecycle, recent access, and hotness to cut storage cost.
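Two of the rules above reduce to simple arithmetic: lowering the replica factor from 3 to 2 cuts raw storage for that data by one third, and an age cut-off decides when a partition moves to cold storage. The sketch below assumes a fixed 3-year threshold and uses hypothetical tier names "hot" and "cold"; the article does not describe the actual migration mechanism.

```python
from datetime import date

# Assumed cut-off, taken from the "up to 3 years" retention mentioned above.
COLD_AFTER_DAYS = 3 * 365


def raw_size(logical_size: float, replicas: int) -> float:
    """Physical storage consumed by data kept with the given replica factor."""
    return logical_size * replicas


def replica_saving(old_replicas: int, new_replicas: int) -> float:
    """Fraction of raw storage saved by lowering the replica factor."""
    return (old_replicas - new_replicas) / old_replicas


def storage_tier(partition_date: date, today: date) -> str:
    """Return "cold" for partitions older than the cut-off, else "hot"."""
    age_days = (today - partition_date).days
    return "cold" if age_days > COLD_AFTER_DAYS else "hot"


if __name__ == "__main__":
    print(replica_saving(3, 2))
    print(storage_tier(date(2020, 1, 1), date(2024, 1, 1)))
```

The trade-off behind a lower replica factor is reduced redundancy, which is presumably why the article applies it only to sample data.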

2.4 Governance Scope: Governance covers the data map, the data warehouse, data quality, and security, with emphasis on metadata management, lineage, and value assessment.

2.5 Lego Metadata Platform: A data bus continuously pulls and backfills metadata and enriches it with cost and security information. The platform recommends a lifecycle for each table based on recent access, enables data sampling for non‑sensitive data, and integrates with visualization tools (e.g., Superset) for lineage inspection.
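One plausible way to recommend a lifecycle from recent access is to look at how old the partitions read by recent queries were, then round the oldest age up to a standard retention period. This is a sketch of that idea only; the bucket values and the function name are hypothetical, and the article does not say how the Lego platform computes its recommendation.

```python
import bisect
from typing import Optional, Sequence

# Hypothetical standard retention periods, in days.
RETENTION_BUCKETS = [7, 30, 90, 180, 365, 1095]


def recommend_lifecycle(accessed_partition_ages: Sequence[int]) -> Optional[int]:
    """Suggest a retention period (days) covering the oldest partition recently read.

    Returns None when no reads were observed, leaving the decision to the table owner.
    Ages beyond the largest bucket are capped at that bucket.
    """
    if not accessed_partition_ages:
        return None
    oldest = max(accessed_partition_ages)
    idx = bisect.bisect_left(RETENTION_BUCKETS, oldest)
    return RETENTION_BUCKETS[min(idx, len(RETENTION_BUCKETS) - 1)]


if __name__ == "__main__":
    # Recent queries touched partitions that were 1, 3, and 45 days old.
    print(recommend_lifecycle([1, 3, 45]))
```

Comparing the recommendation with the table's configured lifecycle gives a simple signal for the "lifecycle rationality" check: a table retained far longer than anything recently read is a candidate for a shorter retention.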

3. Effects of Data Governance

After governance, both compute and storage resources decreased despite business growth, utilization rose sharply, and the total number of tables dropped dramatically due to removal of invalid metadata.

4. Q&A

Q1: Are table classifications and tiered storage defined before data standards and quality management?

A1: They are developed in parallel; standards evolve alongside governance practices.

Thank you for reading.

Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
