Data Service Platform Architecture and Design
The article outlines a standardized data-service platform built on top of a data warehouse. It details the platform's construction, query, and gateway layers, which support model definition, model acceleration, reusable APIs, unified DSL/SQL interfaces, and observability, and which address problems of data ingestion, inconsistent definitions, and unclear lineage. Reported results are 500+ APIs, API creation in under a day, and a cost reduction of roughly 18%.
Background: With growing business data demands, the company seeks a standardized, middle‑platform data service solution (data SaaS) to provide data via APIs, RPC, files, etc. Pain points include diverse data ingestion methods, inconsistent data definitions, lack of sharing, unclear data lineage, and uneven service quality.
Key functional requirements: standardized interface definitions, a generic data service gateway, manageable data lineage, observable and operable services, and reusability.
Desired characteristics: flexibility (downstream unaffected by upstream changes), convenience (high integration efficiency), and low cost (reuse instead of duplication).
Architecture Design: The platform sits on top of a data warehouse and consists of a data construction layer, data query layer, service interfaces, service gateway, and supporting modules such as data standard management, security, and monitoring. The core service chain is Data Construction → Data Query → Service Interface & Gateway.
2.1 Data Construction: Transforms raw Hive tables into business‑oriented data models. It serves data producers (warehouse developers) and provides model definition, model acceleration, and API construction capabilities.
2.1.1 Model Definition: Supports classic dimensional modeling (single, star, snowflake, constellation) to satisfy various analytical scenarios.
2.1.2 Model Acceleration: Provides two acceleration modes—detail acceleration (mirroring data from cold to hot engine) and pre‑calculation acceleration (aggregating to target granularity). Recommended engine combinations:
Online: pre‑calc + KV store
Near‑online: pre‑calc + TiDB/MySQL
OLAP: detail + ClickHouse or Iceberg
Offline: direct Hive access
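The recommended combinations above can be read as a lookup from serving scenario to acceleration mode and candidate engines. The sketch below is illustrative only: `AccelerationPlan`, `RECOMMENDED`, `recommend`, and the lowercase scenario keys are hypothetical names, not the platform's actual API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AccelerationPlan:
    mode: str                 # "pre-calc", "detail", or "none" (direct access)
    engines: Tuple[str, ...]  # candidate serving engines


# Mirrors the list in the text; the keys are hypothetical scenario labels.
RECOMMENDED = {
    "online": AccelerationPlan("pre-calc", ("KV store",)),
    "near-online": AccelerationPlan("pre-calc", ("TiDB", "MySQL")),
    "olap": AccelerationPlan("detail", ("ClickHouse", "Iceberg")),
    "offline": AccelerationPlan("none", ("Hive",)),
}


def recommend(scenario: str) -> AccelerationPlan:
    """Return the recommended plan for a scenario, or raise on unknown input."""
    key = scenario.strip().lower()
    if key not in RECOMMENDED:
        raise ValueError(f"unknown scenario: {scenario!r}")
    return RECOMMENDED[key]
```

Keeping the mapping as data rather than branching logic makes it easy to review against the recommendations and to extend when a new engine is introduced.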
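2.1.3 API Construction: Standardizes API elements (ID, name, method, path, parameters, QoS, etc.). Supports three ways to define data retrieval logic: SQL-based construction, model-based construction, and metric-dimension construction. In SQL-based construction, the developer writes a query with typed input placeholders, for example: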
SELECT a.field1 AS alias_1,
       a.field2 AS alias_2,
       a.field3 AS alias_3,
       b.field1 AS alias_4
FROM fact_table a
LEFT OUTER JOIN table_dim b ON a.id = b.id
WHERE a.field = ${ input_1,type = number }
  AND b.field = ${ input_2,type = number }
Model‑based construction (visual configuration) and metric‑dimension construction (auto‑recommend models) are also provided.
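To illustrate how typed placeholders such as `${ input_1,type = number }` might be resolved at request time, here is a minimal sketch. It assumes the platform binds inputs as query parameters rather than splicing them into the SQL text; the source does not say how binding is done. The `bind` function, the `?` marker style, and the `string` type are assumptions; only `number` appears in the original template.

```python
import re
from typing import Dict, List, Tuple

# Matches placeholders of the form ${ name,type = typename }
PLACEHOLDER = re.compile(r"\$\{\s*(\w+)\s*,\s*type\s*=\s*(\w+)\s*\}")

# Hypothetical type table; only "number" is shown in the source template.
CASTERS = {"number": float, "string": str}


def bind(template: str, inputs: Dict[str, object]) -> Tuple[str, List[object]]:
    """Replace each placeholder with '?' and collect typed values in order."""
    values: List[object] = []

    def _substitute(match: "re.Match") -> str:
        name, type_name = match.group(1), match.group(2)
        if name not in inputs:
            raise KeyError(f"missing input: {name}")
        if type_name not in CASTERS:
            raise ValueError(f"unsupported type: {type_name}")
        # Casting doubles as validation: float("abc") raises ValueError.
        values.append(CASTERS[type_name](inputs[name]))
        return "?"

    sql = PLACEHOLDER.sub(_substitute, template)
    return sql, values
```

Returning the SQL and the values separately lets the caller hand both to a database driver, which avoids quoting user input by hand.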
2.2 Data Query: Acts as the middle layer between service interfaces and data models. Supports atomic computation (single‑engine queries) and composite computation (post‑processing such as trend, ratio, correlation).
Atomic computation workflow includes scheduling, translation, and engine execution. Scheduling parses a DSL, matches APIs to models, splits tasks, and merges results. Translation builds an abstract syntax tree (AST) and generates engine‑specific SQL.
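The translation step can be pictured as rendering one small query tree into different SQL dialects. The sketch below is a simplified illustration, not the platform's translator: `Query`, `Filter`, `to_sql`, the `SUM` aggregation, and the `sum_` alias prefix are all hypothetical, and the only dialect difference modeled is identifier quoting (backticks for MySQL, double quotes for ClickHouse).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Filter:
    column: str
    op: str
    value: object


@dataclass
class Query:
    """A tiny AST: one aggregated SELECT over a single table."""
    table: str
    metrics: List[str]
    dims: List[str]
    filters: List[Filter] = field(default_factory=list)


QUOTE = {"mysql": "`", "clickhouse": '"'}  # identifier quote per dialect
ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}


def to_sql(query: Query, dialect: str) -> Tuple[str, List[object]]:
    """Render the AST as SQL for one dialect, returning (sql, parameters)."""
    quote = QUOTE[dialect]

    def ident(name: str) -> str:
        if not name.isidentifier():
            raise ValueError(f"invalid identifier: {name!r}")
        return f"{quote}{name}{quote}"

    columns = [ident(d) for d in query.dims]
    columns += [f"SUM({ident(m)}) AS {ident('sum_' + m)}" for m in query.metrics]
    parts = [f"SELECT {', '.join(columns)}", f"FROM {ident(query.table)}"]

    params: List[object] = []
    conditions = []
    for flt in query.filters:
        if flt.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {flt.op!r}")
        conditions.append(f"{ident(flt.column)} {flt.op} ?")
        params.append(flt.value)
    if conditions:
        parts.append("WHERE " + " AND ".join(conditions))
    if query.dims:
        parts.append("GROUP BY " + ", ".join(ident(d) for d in query.dims))
    return " ".join(parts), params
```

Because the tree is built once and rendered per engine, the scheduler can route the same request to whichever engine backs the matched model.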
Composite computation performs secondary calculations on the atomic result matrix (e.g., year‑over‑year, funnel, covariance).
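As a concrete case of secondary calculation, year-over-year growth can be derived from an atomic result without querying the engine again. This is a minimal sketch that assumes the atomic result has already been reduced to one value per year; the function name and input shape are illustrative.

```python
from typing import Dict, Optional


def year_over_year(values: Dict[int, float]) -> Dict[int, Optional[float]]:
    """Return (current - previous) / previous per year.

    The ratio is None when the previous year is missing or zero.
    """
    result: Dict[int, Optional[float]] = {}
    for year, current in values.items():
        previous = values.get(year - 1)
        result[year] = None if not previous else (current - previous) / previous
    return result
```

For example, values of 100, 120, and 90 for three consecutive years give growth of 20% in the second year and a decline of 25% in the third, with no ratio for the first.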
2.3 Service Gateway & Interface: Provides a unified entry point with authentication, rate-limiting, and monitoring. Supports both synchronous and asynchronous queries through three interface types: DSL, template SQL, and raw SQL.
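The source does not say which rate-limiting algorithm the gateway uses. A token bucket is one common choice, sketched here under that assumption; the class name and parameters are illustrative, and the injectable `clock` exists only to make the behavior testable.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per caller or per API, so that a noisy consumer cannot exhaust the quota of others.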
Example DSL request definition:
message OpenApiReq {
  OsHeader osHeader = 1;
  repeated OperatorVo filters = 2;
  repeated string metrics = 3;
  repeated string dims = 4;
  repeated string orderFields = 5;
  PageVo pageVo = 6;
  repeated OperatorVo metricFilters = 7;
}
Template SQL request:
message SqlQueryReq {
  OsHeader osHeader = 1;
  repeated OperatorVo filters = 2;
}
Asynchronous SQL request:
message AsyncSqlQueryReq {
  string appKey = 1;
  string secret = 2;
  string engine = 3;
  string sql = 4;
}
General Solutions: Unified metric definition and export, cost reduction through model and API reuse, and high availability via service isolation, active‑active deployment, and caching (local and distributed).
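The combination of local and distributed caching can be sketched as a two-level read-through cache. Everything here is an assumption for illustration: `DictStore` is an in-memory stand-in for a distributed cache client, and `TwoLevelCache`, `loader`, and `local_ttl` are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple


class DictStore:
    """In-memory stand-in for a distributed cache client (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, object] = {}

    def get(self, key: str):
        return self._data.get(key)

    def set(self, key: str, value: object) -> None:
        self._data[key] = value


class TwoLevelCache:
    """Check the local cache, then the remote cache, then the loader."""

    def __init__(self, remote, loader: Callable[[str], object],
                 local_ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.remote = remote
        self.loader = loader
        self.local_ttl = local_ttl
        self.clock = clock
        self.local: Dict[str, Tuple[object, float]] = {}  # key -> (value, expiry)

    def get(self, key: str):
        now = self.clock()
        hit = self.local.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        value = self.remote.get(key)
        if value is None:  # None is treated as a miss in this sketch
            value = self.loader(key)
            self.remote.set(key, value)
        self.local[key] = (value, now + self.local_ttl)
        return value
```

The short local TTL bounds how stale a single instance can be, while the shared remote layer keeps repeated queries from reaching the storage engines.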
Results: After one year, the platform hosts over 500 APIs with tens of thousands of QPS, reduces API creation time from ~5 days to <1 day, and cuts production cost by ~18%.
Future Plans: Improve stability, introduce more automation and intelligence, and strengthen service governance.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.