
Data Governance Practices and Product Strategy at NetEase: Challenges, Solutions, and Future Plans

This article presents NetEase's internal data governance experience: past challenges, current pain points, and a product strategy covering scope, value quantification, and feature implementation. It also shares initial results and plans for an automated, end‑to‑end big‑data optimization platform.

DataFunTalk

Introduction

NetEase's data product team describes the internal dilemmas it commonly faces around data availability, quality, and cost, and introduces a product‑driven approach to data governance that focuses on computing, storage, and overall value.

1. Past Data Governance Review

This section reviews data governance efforts across business lines such as Yanxuan, Media, and Music, and highlights recurring issues: resource bottlenecks, unclear ownership, a lack of standards, and difficulty assessing data value.

2. Current Governance Pain Points

Key pain points include non‑standard development practices, low motivation for governance, lack of a closed‑loop process, and coarse quantification of governance impact.

3. Product Overall Strategy

The strategy is divided into three stages: defining the governance scope, quantifying governance value through a health‑score system, and systematizing governance to create a long‑term closed loop.

3.1 Define Governance Scope

Governance covers the entire data lifecycle—production, consumption, and management—addressing cost, standards, quality, security, and value at each stage.

3.2 Quantify Governance Value

A health‑score system evaluates five dimensions: cost, value, standards, quality, and security, each with measurable indicators.
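As a rough illustration of how such a score could be combined, the sketch below averages the five dimensions into a single number. The summary does not give the actual formula, so the 0-100 scale, the equal weights, and the names `health_score` and `ASSUMED_WEIGHTS` are assumptions made for illustration only.

```python
from typing import Mapping

# The five dimensions named in the talk.
DIMENSIONS = ("cost", "value", "standards", "quality", "security")

# Assumption: equal weights. The real system may weight dimensions differently.
ASSUMED_WEIGHTS = {name: 0.2 for name in DIMENSIONS}


def health_score(dimension_scores: Mapping[str, float],
                 weights: Mapping[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine per-dimension scores (assumed 0-100) into one weighted average."""
    missing = set(DIMENSIONS) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total_weight = sum(weights[name] for name in DIMENSIONS)
    weighted_sum = sum(dimension_scores[name] * weights[name] for name in DIMENSIONS)
    return weighted_sum / total_weight


# Hypothetical scores for one table owner, not figures from the talk.
example_scores = {
    "cost": 60.0,
    "value": 80.0,
    "standards": 70.0,
    "quality": 90.0,
    "security": 100.0,
}
overall = health_score(example_scores)
print(round(overall, 1))
```

Normalizing by the total weight keeps the result on the same scale as the inputs even if the weights are later changed so that they no longer sum to 1.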

3.3 Systematic Governance

Tools provide problem discovery, automated recommendations, asset billing, red‑black lists, and notifications to drive continuous improvement.
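A red‑black list can be sketched as a ranking of owners by score. The version below assumes the "red" list honors the highest‑scoring owners and the "black" list flags the lowest‑scoring ones; the function name `red_black_lists`, the list size, and the sample owners are hypothetical and not taken from the talk.

```python
from typing import Dict, List, Tuple


def red_black_lists(owner_scores: Dict[str, float],
                    size: int = 3) -> Tuple[List[str], List[str]]:
    """Return (red, black): the best- and worst-scoring owners.

    Assumption: a higher score means better governance health.
    """
    # Sort by score descending; break ties by name so the output is stable.
    ranked = sorted(owner_scores, key=lambda owner: (-owner_scores[owner], owner))
    red = ranked[:size]
    # Walk the ranking from the bottom and skip anyone already on the red list,
    # which matters when there are fewer owners than 2 * size.
    black = [owner for owner in reversed(ranked) if owner not in red][:size]
    return red, black


# Hypothetical owners and scores, for illustration only.
scores = {"alice": 92.0, "bob": 45.0, "carol": 78.0, "dave": 30.0}
red, black = red_black_lists(scores, size=1)
print("red:", red, "black:", black)
```

The resulting lists could then feed the notification step, for example as the body of a periodic email to each business line.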

4. Product Feature Implementation

Task/Table ownership assignment

Useless data deprecation

Table lifecycle management

Computation cost analysis

Owner red‑black ranking

Cost and deprecation metrics

Email & internal tool notifications
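Several of the features above, in particular ownership assignment and useless data deprecation, amount to simple checks over table metadata. The following is a minimal sketch under stated assumptions: the `TableMeta` fields, the 90‑day idle threshold, and the sample tables are hypothetical and do not describe NetEase's actual schema or rules.

```python
from datetime import date, timedelta
from typing import List, NamedTuple, Optional


class TableMeta(NamedTuple):
    """Hypothetical metadata record for one table."""
    name: str
    owner: Optional[str]   # None means no owner has been assigned
    last_accessed: date
    size_gb: float


def deprecation_candidates(tables: List[TableMeta], today: date,
                           idle_days: int = 90) -> List[TableMeta]:
    """Tables not read for more than `idle_days` (assumed threshold)."""
    cutoff = today - timedelta(days=idle_days)
    return [t for t in tables if t.last_accessed < cutoff]


def unowned_tables(tables: List[TableMeta]) -> List[TableMeta]:
    """Tables that still need an owner before governance can be enforced."""
    return [t for t in tables if t.owner is None]


# Hypothetical inventory, for illustration only.
inventory = [
    TableMeta("dw.orders_daily", "alice", date(2020, 12, 20), 120.0),
    TableMeta("tmp.export_old", "bob", date(2020, 6, 1), 40.0),
    TableMeta("tmp.orphan_snapshot", None, date(2020, 1, 1), 300.0),
]
today = date(2021, 1, 1)

stale = deprecation_candidates(inventory, today)
reclaimable_gb = sum(t.size_gb for t in stale)
print([t.name for t in stale], reclaimable_gb)
print([t.name for t in unowned_tables(inventory)])
```

Summing the size of the stale tables gives a simple storage figure that could back the cost and deprecation metrics listed above.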

5. Initial Results

In 2020, optimization reduced table count by 47.6% for Cloud Music and 61% for Yanxuan, and saved ~38% of compute resources for Media, demonstrating tangible cost and efficiency gains.

6. Future Planning

The vision is a fully automated, end‑to‑end big‑data evaluation and optimization tool that integrates health scores, notification mechanisms, and a layered support system to continuously improve data quality, security, and standards.

7. Q&A Highlights

The answers cover how to embed governance into development, what makes up the health score, who is responsible for governance, how the scope extends beyond raw tables, why governance is an ongoing effort, and how compute costs are calculated.

Conclusion

The session emphasizes that data governance is a continuous, product‑driven effort that requires clear ownership, measurable value, and systematic tooling to achieve cost reduction and quality improvement.
