
Alibaba Cloud DataWorks Intelligent Data Modeling: Practices and Insights

This article introduces Alibaba Cloud DataWorks' intelligent data modeling tool, outlines the data demand flow, shares best practices and practical demonstrations of data warehouse modeling, discusses model application and data asset management, and answers common questions while highlighting its commercial availability.

DataFunSummit

DataWorks is Alibaba Cloud's big data development and governance platform. It has been in development for 14 years, and its Intelligent Data Modeling tool was launched at the 2021 Cloud Expo. The tool draws on contributions from Alibaba's internal data warehouse teams, including Cainiao, Taobao, and Tmall.

1. Alibaba Data Demand Flow

Roles involved in data warehouse construction include data demand owners (operations, BI, product managers), data product managers who translate business needs into data requirements, and data development engineers responsible for designing data models and metrics.

2. Data Warehouse Modeling Best Practices

Business classification: the approach builds on Kimball dimensional modeling and adds a "business classification" layer that separates models by business team.

Data domains: define domains by aggregating business processes and key entities.

Data marts: the application layer includes business, product, and public marts; whether to model them is optional and depends on business needs.

Standardization of naming conventions and storage strategies is enforced through built‑in templates and validation checks.
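To make the idea of a naming validation check concrete, here is a minimal sketch in Python. It is not DataWorks' built-in check: the `<layer>_<domain>_<subject>` pattern, the layer prefixes, and the function name `check_table_name` are all assumptions chosen for illustration, and a real team would substitute its own convention.

```python
import re

# Hypothetical layer prefixes and naming pattern; real conventions are team-specific.
LAYERS = ("ods", "dwd", "dws", "dim", "ads")
TABLE_NAME = re.compile(
    r"^(?P<layer>[a-z]+)_(?P<domain>[a-z0-9]+)_(?P<subject>[a-z0-9_]+)$"
)


def check_table_name(name, known_domains):
    """Return a list of rule violations for a table name (empty list means valid)."""
    match = TABLE_NAME.match(name)
    if match is None:
        return [f"{name!r} does not match <layer>_<domain>_<subject>"]

    problems = []
    if match["layer"] not in LAYERS:
        problems.append(f"unknown layer prefix {match['layer']!r}")
    if match["domain"] not in known_domains:
        problems.append(f"unknown data domain {match['domain']!r}")
    return problems


if __name__ == "__main__":
    domains = {"trade", "logistics"}
    for candidate in ("dwd_trade_order_detail", "tmp_trade_order", "OrderDetail"):
        print(candidate, check_table_name(candidate, domains))
```

Returning a list of violations rather than a boolean lets a template report every broken rule at once, which is usually what a modeler wants when fixing a table name.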

3. Practical Demonstration of Data Warehouse Modeling

The demo covers four aspects: warehouse planning, data standards, metric design, and dimensional modeling. It shows how to batch generate derived metrics, create DWD tables by importing ODS structures, and handle field redundancy for efficient model design.
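Batch generation of derived metrics can be pictured as a cross product. The sketch below assumes that a derived metric is composed of an atomic metric, a time period, and a modifier; that composition rule, the naming scheme, and the function `derive_metrics` are illustrative assumptions, not a description of how the tool works internally.

```python
from itertools import product


def derive_metrics(atomic_metrics, periods, modifiers):
    """Cross atomic metrics with time periods and modifiers to name derived metrics.

    The <metric>_<period>_<modifier> naming scheme is a hypothetical convention.
    """
    return [
        f"{metric}_{period}_{modifier}"
        for metric, period, modifier in product(atomic_metrics, periods, modifiers)
    ]


if __name__ == "__main__":
    names = derive_metrics(
        atomic_metrics=["pay_amt", "order_cnt"],
        periods=["1d", "7d"],
        modifiers=["app", "web"],
    )
    for name in names:
        print(name)
```

With two atomic metrics, two periods, and two modifiers, this yields eight candidate names, which shows why batch generation saves effort compared with defining each derived metric by hand.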

Code mode supports MaxCompute DDL and Hive DDL, and can generate ETL code from SELECT statements.
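As a rough illustration of how one model definition might map to both DDL dialects, the following Python sketch renders a CREATE TABLE statement from a list of columns. The function `render_ddl`, the `ds` partition column, the 365-day lifecycle, and the ORC storage format are assumptions made for the example; the tool's actual generated DDL may differ.

```python
def render_ddl(table, columns, partition="ds", dialect="maxcompute", lifecycle_days=365):
    """Render a CREATE TABLE statement from (name, type, comment) column tuples.

    A sketch only: comments are not escaped, so they must not contain quotes.
    """
    cols = ",\n".join(
        f"  {name} {ctype} COMMENT '{comment}'" for name, ctype, comment in columns
    )
    ddl = (
        f"CREATE TABLE IF NOT EXISTS {table} (\n{cols}\n)\n"
        f"PARTITIONED BY ({partition} STRING)"
    )
    if dialect == "maxcompute":
        # MaxCompute tables can declare a retention period in days.
        ddl += f"\nLIFECYCLE {lifecycle_days}"
    elif dialect == "hive":
        # Assumed storage format for the Hive variant.
        ddl += "\nSTORED AS ORC"
    else:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    return ddl + ";"


if __name__ == "__main__":
    # Hypothetical DWD table definition.
    order_columns = [
        ("order_id", "STRING", "Order ID"),
        ("buyer_id", "STRING", "Buyer ID"),
        ("pay_amt", "DOUBLE", "Payment amount"),
    ]
    print(render_ddl("dwd_trade_order_detail", order_columns))
    print(render_ddl("dwd_trade_order_detail", order_columns, dialect="hive"))
```

Keeping the column list separate from the dialect-specific clauses means the same model can be materialized on either engine by changing a single argument.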

4. Data Model Application and Data Asset

After models are materialized, they can be published to the Data Asset catalog, enabling zero‑code SQL analysis and field selection. The Data Asset 3D panorama visualizes the enterprise's data assets for better governance.

5. Q&A

Q: Does DataWorks support generating slowly changing dimension (SCD) tables?
A: Automatic SCD generation is not yet publicly available.

Q: How is data asset sharing handled?
A: Administrators publish assets to the Data Asset module; sharing between business users is currently done via direct product links.

DataWorks Intelligent Data Modeling is commercially available on Alibaba Cloud, with a personal version priced at 60 CNY for six months, including a retail e‑commerce template and tutorials.

For more details, visit: https://www.aliyun.com/product/bigdata/ide
