
Design and Evolution of an R&D Measurement Platform: Architecture, Data Governance, and Interactive Analytics

This article details the purpose, technical evolution, architecture, data‑source unification, dimensional modeling, data‑warehouse layering, SQL‑as‑metric approach, and interactive design of a measurement platform built to improve R&D efficiency through systematic data collection and visualization.

Zhuanzhuan Tech

1 What the Measurement Platform Does

The platform systematically collects key R&D data and presents it intuitively, helping users understand delivery value, efficiency, and quality, while providing a reliable observation system to identify problems and support business improvements.

2 Technical Construction History

2.1 Platform V0.1 (2019‑2022)

The first version started in 2019. It collected requirement, bug, project‑delay, and release data for efficiency analysis, and received only simple maintenance with no continuous updates.

2.2 Platform V0.5 (2022‑2023)

As usage grew, performance bottlenecks appeared: queries took too long, metric accuracy was low, and producing new metrics was slow. In 2022 the team decided on a complete rebuild.

2.2.1 Reconstruction Idea

Data Model and Business Logic Decoupling

The original system tightly coupled data models with business logic, making changes costly. The redesign separates them, records detailed business data, and introduces four table types: detail, dimension, bridge, and summary tables.

Detail tables store raw, granular data (e.g., individual requirements, work hours, bugs).

Dimension tables hold attribute information (project, personnel, module).

Bridge tables manage many‑to‑many relationships.

Summary tables aggregate data for fast queries.

Glue code moves data between these tables. The V0.5 architecture added data‑dependency management and pre‑computed metrics, but the way metrics were produced stayed unchanged.
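The four table types and the glue step between them can be sketched with an in‑memory SQLite database. This is a minimal illustration, not the platform's actual schema: every table name, column name, and sample row below is an assumption made for the example.

```python
import sqlite3

# All table and column names are illustrative assumptions.
SCHEMA = """
-- Dimension tables: attribute information.
CREATE TABLE dim_person (person_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_project (project_id INTEGER PRIMARY KEY, name TEXT);
-- Detail table: one row per requirement, the finest grain recorded.
CREATE TABLE detail_requirement (
    req_id INTEGER PRIMARY KEY, project_id INTEGER, work_hours REAL);
-- Bridge table: a requirement can have several owners and vice versa.
CREATE TABLE bridge_requirement_person (req_id INTEGER, person_id INTEGER);
-- Summary table: pre-aggregated per project for fast reads.
CREATE TABLE summary_project_hours (
    project_id INTEGER PRIMARY KEY, req_count INTEGER, total_hours REAL);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the schema and load a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_person VALUES (?, ?)",
                     [(1, "alice"), (2, "bob")])
    conn.executemany("INSERT INTO dim_project VALUES (?, ?)",
                     [(10, "checkout"), (20, "search")])
    conn.executemany("INSERT INTO detail_requirement VALUES (?, ?, ?)",
                     [(100, 10, 8.0), (101, 10, 4.5), (102, 20, 3.0)])
    conn.executemany("INSERT INTO bridge_requirement_person VALUES (?, ?)",
                     [(100, 1), (100, 2), (101, 1), (102, 2)])
    return conn


def refresh_summary(conn: sqlite3.Connection) -> None:
    """Glue step: rebuild the summary table from the detail rows."""
    conn.execute("DELETE FROM summary_project_hours")
    conn.execute("""
        INSERT INTO summary_project_hours
        SELECT project_id, COUNT(*), SUM(work_hours)
        FROM detail_requirement
        GROUP BY project_id
    """)


def requirements_per_person(conn: sqlite3.Connection) -> dict:
    """Resolve the many-to-many relationship through the bridge table."""
    rows = conn.execute("""
        SELECT p.name, COUNT(*)
        FROM bridge_requirement_person b
        JOIN dim_person p ON p.person_id = b.person_id
        GROUP BY p.name
        ORDER BY p.name
    """).fetchall()
    return dict(rows)


conn = build_demo_db()
refresh_summary(conn)
summary = conn.execute(
    "SELECT * FROM summary_project_hours ORDER BY project_id").fetchall()
```

Reads against `summary_project_hours` never touch the detail rows, which is the point of the summary type; the cost is that `refresh_summary` has to run whenever the detail data changes.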

2.2.2 Basic Data Governance

Data cleaning and processing were required because most source data were new. Project‑process data from TAPD were standardized through two measures: promoting a unified project‑process standard and embedding the process into the development workflow.

After governance, the platform achieved a "one standard, one platform" goal, improving data uniformity and collection efficiency.

2.2.3 Achieved Effects

Data pre‑computation solved slow queries.

Metric accuracy and traceability improved, but the overall effect fell short of expectations because calculation logic remained scattered across the code.

2.3 Platform V1 (2023‑Present)

V0.5 computed data one day after collection (T+1). V1 focuses on data construction so that data becomes available within one hour, and it rests on three elements:

Unified data source to ensure consistency.

SQL‑as‑metric for traceable, maintainable indicators.

Interactive UI with cards, chart linking, auxiliary fields, correlation analysis, and drill‑down.

3 Platform V1 Details

3.1 Technical Architecture

Offline processing uses Hive, StarRocks, and Spark, together with two internal platforms, 星河 (Xinghe) and 星火 (Xinghuo).

星河 is a self‑developed PaaS providing data ingestion, development, quality, assets, services, and metric management. 星火 is an intelligent BI platform for low‑code data analysis.

3.2 Unified Data Source (Data‑Warehouse Construction)

3.2.1 Preparation – Bottom‑up Inventory

Identify all existing data sources and fields, map them to metrics, and build a metric‑to‑source matrix.
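One way to picture the inventory is as a small lookup structure that also reports which metrics cannot yet be computed. The sketch below is only an assumption about how such a matrix might be represented; the source names, field names, and metric names are invented for illustration (TAPD is the only system named in this article).

```python
from collections import defaultdict

# Hypothetical inventory: source system -> fields it exposes.
SOURCE_FIELDS = {
    "tapd": {"requirement_id", "created_at", "released_at", "bug_id", "severity"},
    "release_system": {"release_id", "released_at", "rollback"},
}

# Hypothetical metrics -> (source, field) pairs each one needs.
METRIC_NEEDS = {
    "requirement_lead_time": [("tapd", "created_at"), ("tapd", "released_at")],
    "rollback_rate": [("release_system", "release_id"),
                      ("release_system", "rollback")],
    "bugs_per_release": [("tapd", "bug_id"),
                         ("release_system", "release_id"),
                         ("ci", "build_id")],
}


def build_matrix(metric_needs):
    """Return metric -> {source -> set of fields}, the metric-to-source matrix."""
    matrix = {}
    for metric, needs in metric_needs.items():
        per_source = defaultdict(set)
        for source, field in needs:
            per_source[source].add(field)
        matrix[metric] = dict(per_source)
    return matrix


def find_gaps(metric_needs, source_fields):
    """Return metric -> list of (source, field) pairs missing from the inventory."""
    gaps = {}
    for metric, needs in metric_needs.items():
        missing = [(s, f) for s, f in needs
                   if f not in source_fields.get(s, set())]
        if missing:
            gaps[metric] = missing
    return gaps


matrix = build_matrix(METRIC_NEEDS)
gaps = find_gaps(METRIC_NEEDS, SOURCE_FIELDS)
```

In this made‑up inventory, `gaps` flags `bugs_per_release` because no `ci` source has been catalogued, which is the kind of finding a bottom‑up inventory is meant to surface before modeling starts.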

3.2.2 Dimensional Modeling

Adopt the four‑step dimensional modeling process to keep the data warehouse compliant: select the business process, declare the grain, identify the dimensions, and identify the facts.
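The four modeling decisions can be written down as a small record, and the declared grain can then be checked against real rows. The following is a hedged sketch: the `FactTableDesign` class, its field names, and the sample bug rows are assumptions for illustration, not part of the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactTableDesign:
    business_process: str          # step 1: what is being measured
    grain: tuple[str, ...]         # step 2: columns that identify one row
    dimensions: tuple[str, ...]    # step 3: descriptive context
    facts: tuple[str, ...]         # step 4: numeric measures

    def grain_violations(self, rows):
        """Return grain keys that appear more than once in rows."""
        seen, dupes = set(), []
        for row in rows:
            key = tuple(row[col] for col in self.grain)
            if key in seen and key not in dupes:
                dupes.append(key)
            seen.add(key)
        return dupes


# Hypothetical design: one row per bug.
design = FactTableDesign(
    business_process="bug fixing",
    grain=("bug_id",),
    dimensions=("project", "assignee", "severity"),
    facts=("hours_to_fix",),
)

rows = [
    {"bug_id": 1, "project": "checkout", "hours_to_fix": 3.0},
    {"bug_id": 2, "project": "search", "hours_to_fix": 1.5},
    {"bug_id": 1, "project": "checkout", "hours_to_fix": 3.0},  # duplicate
]
violations = design.grain_violations(rows)
```

A non‑empty `violations` list means the data does not match the declared grain, so any sum over `facts` would double count.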

3.2.3 Data‑Warehouse Layers

Four layers: ODS (raw), DW (wide tables), DWS (light aggregation), ADS (OLAP layer). This structure supports efficient querying and analysis.
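To make the layering concrete, here is a minimal sketch that builds each layer from the one below it in an in‑memory SQLite database. SQLite stands in for the real engines, and the table names, columns, cleaning rule, and sample rows are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# ODS: raw rows as they arrive from the source systems (made-up data).
conn.executescript("""
CREATE TABLE ods_bug (bug_id INTEGER, project_id INTEGER,
                      created_day TEXT, status TEXT);
CREATE TABLE ods_project (project_id INTEGER, project_name TEXT);
""")
conn.executemany("INSERT INTO ods_bug VALUES (?, ?, ?, ?)", [
    (1, 10, "2024-01-01", " Closed"),   # untidy value, cleaned in DW
    (2, 10, "2024-01-01", "open"),
    (3, 10, "2024-01-02", "closed"),
    (4, 20, "2024-01-01", "open"),
])
conn.executemany("INSERT INTO ods_project VALUES (?, ?)",
                 [(10, "checkout"), (20, "search")])


def build_layers(conn: sqlite3.Connection) -> None:
    """Derive DW, DWS, and ADS tables, each only from the layer below."""
    conn.executescript("""
    -- DW: cleaned wide table, detail rows joined with dimension attributes.
    CREATE TABLE dw_bug_wide AS
        SELECT b.bug_id, b.created_day,
               LOWER(TRIM(b.status)) AS status, p.project_name
        FROM ods_bug b JOIN ods_project p USING (project_id);

    -- DWS: light aggregation per project and day.
    CREATE TABLE dws_bug_daily AS
        SELECT project_name, created_day,
               COUNT(*) AS bug_count,
               SUM(status = 'closed') AS closed_count
        FROM dw_bug_wide
        GROUP BY project_name, created_day;

    -- ADS: the shape the analytics front end queries directly.
    CREATE TABLE ads_bug_close_rate AS
        SELECT project_name,
               1.0 * SUM(closed_count) / SUM(bug_count) AS close_rate
        FROM dws_bug_daily
        GROUP BY project_name;
    """)


build_layers(conn)
rates = dict(conn.execute(
    "SELECT project_name, close_rate FROM ads_bug_close_rate").fetchall())
```

Because each layer reads only from the one beneath it, a fix to the cleaning rule in `dw_bug_wide` flows to every downstream table on the next rebuild.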

3.3 SQL‑as‑Metric

Metrics are defined directly by SQL statements, ensuring transparency, traceability, and rapid one‑hour data delivery.
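A small registry shows what "the SQL is the metric" could look like in practice. This is a sketch under stated assumptions: the `Metric` class, the `dw_requirement` table, its columns, and both metric definitions are hypothetical, and SQLite stands in for the platform's query engine.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    sql: str  # must return one row with one column named "value"


# Hypothetical metric definitions; the SQL text is the single source of truth.
METRICS = {m.name: m for m in [
    Metric(
        "avg_lead_time_days",
        "Average days from creation to release, released requirements only",
        "SELECT AVG(julianday(released_day) - julianday(created_day)) AS value "
        "FROM dw_requirement WHERE released_day IS NOT NULL",
    ),
    Metric(
        "delayed_ratio",
        "Share of requirements flagged as delayed",
        "SELECT 1.0 * SUM(is_delayed) / COUNT(*) AS value FROM dw_requirement",
    ),
]}


def evaluate(conn: sqlite3.Connection, name: str) -> dict:
    """Run a metric and return its value together with the SQL that produced it."""
    metric = METRICS[name]
    (value,) = conn.execute(metric.sql).fetchone()
    return {"metric": metric.name, "value": value, "sql": metric.sql}


# Made-up warehouse table for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE dw_requirement (
    req_id INTEGER, created_day TEXT, released_day TEXT, is_delayed INTEGER)""")
conn.executemany("INSERT INTO dw_requirement VALUES (?, ?, ?, ?)", [
    (1, "2024-01-01", "2024-01-05", 0),
    (2, "2024-01-02", "2024-01-04", 1),
    (3, "2024-01-03", None, 0),
    (4, "2024-01-03", "2024-01-06", 1),
])

lead_time = evaluate(conn, "avg_lead_time_days")
delayed = evaluate(conn, "delayed_ratio")
```

Returning the SQL alongside the value is what makes a number traceable: anyone who doubts a figure can read the exact statement that produced it, and changing a metric means editing one string rather than hunting through application code.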

3.4 Efficient Interactive Design

Charts are linked and drillable; a theme‑domain approach groups related data, enabling correlation analysis and one‑click navigation to related charts.
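The linking and drill‑down behaviour can be approximated with two lookups: one from a chart to the other charts in its theme domain, and one from a selection to the next level of detail. The domains, chart ids, drill path, and rows below are assumptions invented for this sketch.

```python
from collections import Counter

# Hypothetical theme domains: each groups charts that are analysed together.
THEME_DOMAINS = {
    "quality": ["bug_trend", "bug_by_module", "reopen_rate"],
    "efficiency": ["lead_time", "throughput"],
}

# Hypothetical drill path, from coarse to fine.
DRILL_PATH = ["department", "team", "person"]


def related_charts(chart_id: str) -> list:
    """Charts in the same theme domain, used for one-click navigation."""
    for charts in THEME_DOMAINS.values():
        if chart_id in charts:
            return [c for c in charts if c != chart_id]
    return []


def drill_down(rows: list, selected: dict):
    """Count rows at the next level below the values selected so far.

    selected maps already-chosen levels to values and is assumed to follow
    DRILL_PATH order, e.g. {"department": "trade"}.
    """
    if len(selected) >= len(DRILL_PATH):
        raise ValueError("already at the finest level")
    level = DRILL_PATH[len(selected)]
    matching = [r for r in rows
                if all(r[k] == v for k, v in selected.items())]
    return level, dict(Counter(r[level] for r in matching))


# Made-up bug records, one dict per bug.
bugs = [
    {"department": "trade", "team": "order", "person": "alice"},
    {"department": "trade", "team": "order", "person": "bob"},
    {"department": "trade", "team": "pay", "person": "carol"},
    {"department": "search", "team": "rank", "person": "dave"},
]

top = drill_down(bugs, {})
by_team = drill_down(bugs, {"department": "trade"})
```

Clicking the "trade" bar in the top‑level chart corresponds to the second call, and `related_charts("bug_trend")` gives the charts a user could jump to for correlation analysis.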

4 Summary

By building on a data warehouse and a BI system, the platform delivers more accurate data, faster data availability, and quicker analysis, which frees the team to focus on metric management and interactive visualization.

5 Future Plans

Improve backend configuration UI to lower the learning curve.

Provide training and guides to reduce user onboarding cost.

For teams without commercial DW/BI solutions, open‑source alternatives such as Superset, Apache DevLake, Druid, ClickHouse, and Pinot are worth considering.
