How Alibaba Generates Frontend Code Automatically with AI Design2Code
This article explains Alibaba's front‑end intelligence system, which automatically generates UI code by extracting design metadata, recognizing basic components with a YOLO‑based model, and refining the resulting predictions. It details the pipeline from sample creation to model evaluation and outlines future enhancements.
Background
As one of the four technical directions of Alibaba's front‑end committee, the front‑end intelligence project proved its value during the 2019 Double‑11 event, with 79.34% of the new modules' code generated automatically.
Design2Code Approach
We initially used design‑tool plugins to extract images, text, shapes, etc., from design files. However, many core UI components such as forms, tables, and switches are not represented in design tools, so we turned to deep‑learning methods to recognize these components accurately.
Capability Layer: Basic Component Recognition
The capability layer focuses on identifying predefined basic components in images, enabling downstream optimization such as generating compliant component trees and improving semantic representation.
Overall Solution
The end‑to‑end pipeline includes sample acquisition, sample evaluation, model training, model evaluation, and model prediction.
Sample Acquisition & Evaluation
High‑quality samples are crucial. We generate samples by programmatically rendering widely used UI component libraries, ensuring balanced data across component types, constructing special scenarios (e.g., text over image), and adding padding around line‑based components so the model does not learn the crop edges as features.
Data types are diverse and balanced, covering various attributes and styles.
Special scenarios such as overlaid text or images are also created.
For line‑based components like Input, surrounding padding is added to prevent edge bias.
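As a concrete illustration of the padding step, here is a minimal sketch using NumPy; `pad_component`, the padding width, and the background value are illustrative assumptions, not the team's actual sample‑generation tooling.

```python
import numpy as np

def pad_component(img: np.ndarray, pad: int = 12, bg: int = 255) -> np.ndarray:
    """Surround a rendered component crop (H, W, 3) with `pad` pixels of
    background, so the detector learns the component's own border rather
    than the edge of the crop itself."""
    return np.pad(img, ((pad, pad), (pad, pad), (0, 0)),
                  mode="constant", constant_values=bg)

# A 32x200 rendered Input field becomes a 56x224 training crop.
crop = np.full((32, 200, 3), 250, dtype=np.uint8)
padded = pad_component(crop)
```

The bounding‑box label for the sample then covers only the inner component region, not the padded canvas.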
Model Selection
Based on recent PASCAL VOC benchmarks, we chose the top‑ranking YOLO one‑stage algorithm for transfer learning.
YOLO (You Only Look Once) works in three steps:
Resize the image to 416 × 416.
Run a single convolutional network that predicts bounding boxes, confidence scores, and class probabilities in one pass.
Apply non‑maximum suppression to filter bounding boxes.
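The last step can be sketched in a few lines. This greedy non‑maximum suppression is the generic textbook version, not Alibaba's implementation:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, then drop any remaining
    box that overlaps it by more than `thresh`; repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= thresh]
    return keep

# Two overlapping detections of the same button collapse to the stronger one.
boxes = [(10, 10, 110, 50), (14, 12, 114, 52), (200, 10, 300, 50)]
scores = [0.9, 0.6, 0.8]
kept = nms(boxes, scores)  # → [0, 2]
```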
Model Evaluation
We use mean Average Precision (mAP) based on COCO to measure accuracy, comparing predictions with ground truth. Results show lower precision for small targets (e.g., text elements), indicating a need for additional preprocessing for such cases.
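To make the comparison with ground truth concrete, here is a sketch of the greedy prediction‑to‑ground‑truth matching that underlies precision and recall at a fixed IoU threshold. The `evaluate` helper and the data layout are illustrative; full COCO mAP additionally averages precision over recall levels, classes, and IoU thresholds. The example also shows why small targets score poorly: a few pixels of error on a tiny text box is enough to drop its IoU below the matching threshold.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def evaluate(preds, gts, iou_thresh=0.5):
    """Match each prediction (highest confidence first) to at most one
    unmatched ground-truth box of the same class; return precision, recall."""
    matched, tp = set(), 0
    for p in sorted(preds, key=lambda p: p["score"], reverse=True):
        for j, g in enumerate(gts):
            if j in matched or g["cls"] != p["cls"]:
                continue
            if iou(p["box"], g["box"]) >= iou_thresh:
                matched.add(j)
                tp += 1
                break
    return (tp / len(preds) if preds else 0.0,
            tp / len(gts) if gts else 0.0)

gts = [{"cls": "button", "box": (0, 0, 100, 40)},
       {"cls": "text",   "box": (0, 50, 40, 60)}]   # small text element
preds = [{"cls": "button", "box": (2, 1, 101, 41), "score": 0.95},  # matches
         {"cls": "text",   "box": (5, 52, 30, 58), "score": 0.40}]  # IoU 0.375, misses
precision, recall = evaluate(preds, gts)  # → (0.5, 0.5)
```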
Prediction Processing
During inference we resize images to 416 × 416, then refine the predicted boxes using OpenCV gradient edge detection, which improves IoU by over 10% on the test set.
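The refinement idea can be sketched as snapping each predicted edge to the strongest nearby intensity gradient. This sketch uses NumPy's gradient as a stand‑in for OpenCV's Sobel operator, and `refine_box` and `search` are illustrative names, not the production code:

```python
import numpy as np

def refine_box(gray: np.ndarray, box, search: int = 8):
    """Snap each edge of a predicted box (x1, y1, x2, y2) to the strongest
    intensity gradient within `search` pixels of the predicted position."""
    gx = np.abs(np.gradient(gray.astype(float), axis=1)).sum(axis=0)
    gy = np.abs(np.gradient(gray.astype(float), axis=0)).sum(axis=1)

    def snap(profile, pos):
        lo, hi = max(0, pos - search), min(len(profile), pos + search + 1)
        return lo + int(np.argmax(profile[lo:hi]))

    x1, y1, x2, y2 = box
    return snap(gx, x1), snap(gy, y1), snap(gx, x2), snap(gy, y2)

# Synthetic check: a component border at (20, 30)-(80, 70) on a white page,
# with a prediction that is a few pixels off on every side.
page = np.full((100, 100), 255.0)
page[30:71, 20] = page[30:71, 80] = 0   # vertical edges
page[30, 20:81] = page[70, 20:81] = 0   # horizontal edges
refined = refine_box(page, (17, 27, 83, 73))
```

Snapping each edge to a gradient peak pulls the loose prediction back to within a pixel or two of the true border, which is where the reported IoU gain comes from.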
Conclusion
The front‑end intelligence team currently supports more than 20 basic component types. Future work will focus on finer‑grained sample classification, attribute measurement, and standardized data management, enabling external developers to leverage the open‑source sample set for customized component processing.
Taobao Frontend Technology
The frontend landscape is constantly evolving, with rapid innovation across familiar languages, and our understanding of the frontend is continually refreshed along with it. Join us at Taobao, a vibrant, all‑encompassing platform, to uncover limitless potential.