From UI Sketch to Code: Frontend Intelligence Generates 79% of Double‑11 Modules
This article explains how Alibaba's Front‑End Intelligent project automatically converts UI design images into production‑ready code, covering layout analysis, background and foreground processing, a fusion of traditional image algorithms with deep‑learning detection, GAN‑based complex‑background extraction, experimental results and real‑world deployment.
Overview
The Front‑End Intelligent project, one of the four technical directions of Alibaba's Front‑End Committee, proved its value during the 2019 Double‑11 event, automatically generating 79.34% of the online code for new modules on Tmall and Taobao. This series shares the techniques and thoughts behind that achievement.
Why Use Images as Input
Images are the final deliverable, intuitive and deterministic, without upstream constraints.
Layout differences (e.g., listview, gridview) do not exist in visual drafts.
Image‑based pipelines support broader scenarios such as automated testing and competitor‑image reuse.
Layer stacking issues in design drafts are easier to handle when starting from images.
Layer Processing
In the D2C technical stack, the layer-processing layer identifies element categories and extracts their styles, providing data for the subsequent layout-algorithm layer.
Layout Analysis
Layout analysis splits UI images into foreground and background. Background analysis uses machine‑vision algorithms to detect color, gradient direction, and connected regions, while foreground analysis employs deep‑learning models to merge and recognize GUI fragments.
Background Analysis
Step 1: Detect background blocks with edge detectors (Sobel, Laplacian, Canny) to separate solid‑color and gradient regions. The Laplacian template is illustrated below.
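As an illustration, the Laplacian step can be sketched with a plain NumPy convolution (the function name here is illustrative, not from the project's codebase): a solid-color region yields a zero response, while edges and gradients respond strongly.

```python
import numpy as np

# Standard 3x3 Laplacian template (4-neighbour form)
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=np.float64)

def laplacian_response(gray):
    """Convolve a 2-D grayscale image with the Laplacian template.

    A solid-colour background gives a zero response everywhere;
    edges and gradient boundaries produce non-zero values.
    """
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Replicate the border so the output keeps the input shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * LAPLACIAN)
    return out

# A flat image has no edges; a step edge responds strongly.
flat = np.full((5, 5), 128)
step = np.zeros((5, 5)); step[:, 3:] = 255
```

In practice `cv2.Laplacian` (or Sobel/Canny) does the same work far faster; the loop form just makes the template explicit.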
If a gradient background is detected, step 2 applies a flood‑fill algorithm to refine it.
<code>import cv2
import numpy as np

def fill_color_diffuse_water_from_img(task_out_dir, image, x, y,
                                      thres_up=(10, 10, 10),
                                      thres_down=(10, 10, 10),
                                      fill_color=(255, 255, 255)):
    # Obtain image height and width
    h, w = image.shape[:2]
    # Create the (h+2, w+2) single-channel mask required by cv2.floodFill
    mask = np.zeros([h + 2, w + 2], np.uint8)
    # Flood-fill from the seed point (x, y) with the given lower/upper
    # thresholds in fixed-range mode
    cv2.floodFill(image, mask, (x, y), fill_color,
                  thres_down, thres_up, cv2.FLOODFILL_FIXED_RANGE)
    cv2.imwrite(task_out_dir + "/ui/tmp2.png", image)
    return image, mask</code>
The original image and the processed output are shown for comparison.
Foreground Analysis
Foreground processing focuses on component integrity: connected-component analysis keeps each component from being fragmented, followed by machine-learning classification and merging until no residual fragments remain. An example is a complete item card in a waterfall-flow layout.
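The connected-component step can be sketched in plain Python (a minimal 4-connected BFS labeller; the function name and mask format are illustrative, not the project's actual implementation):

```python
from collections import deque

def connected_components(mask):
    """Label 4-connected regions of truthy cells in a 2-D binary mask.

    Returns a dict mapping label -> list of (row, col) cells. Grouping
    foreground pixels this way keeps each GUI fragment intact before
    classification and merging.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    components, next_label = {}, 1
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not labels[r][c]:
                # Breadth-first flood from this unlabelled foreground cell
                queue = deque([(r, c)])
                labels[r][c] = next_label
                cells = []
                while queue:
                    y, x = queue.popleft()
                    cells.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                components[next_label] = cells
                next_label += 1
    return components

# Two separate fragments in a tiny mask
mask = [[1, 1, 0],
        [0, 0, 0],
        [0, 1, 1]]
```

Production code would use `cv2.connectedComponents` instead, but the logic is the same.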
Traditional edge‑gradient methods (CLAHE, Canny, morphological dilation, Douglas‑Peucker) are compared with deep‑learning detectors (Faster‑RCNN, YOLO, SSD). The fusion of both approaches yields high precision, recall, and localization (IOU).
Fusion Process
Run traditional image processing and deep-learning detection in parallel, obtaining two box sets: trbox and dlbox.
Filter trbox: keep boxes whose IOU with some dlbox exceeds a threshold (e.g., 0.8).
Filter dlbox: discard boxes whose IOU with a retained trbox exceeds the threshold.
Adjust each remaining dlbox edge to the nearest straight line within a pixel limit, without crossing trbox boundaries.
Output the union of the filtered trbox and adjusted dlbox sets as the final result.
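The IOU-based filtering steps above can be sketched as follows (the box format `(x1, y1, x2, y2)`, the 0.8 threshold, and the function names are assumptions for illustration; the edge-adjustment step is omitted):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def fuse(trboxes, dlboxes, thresh=0.8):
    """Keep trboxes confirmed by a dlbox; drop dlboxes already covered."""
    # Step 2: keep trbox entries that some dlbox agrees with
    kept_tr = [t for t in trboxes
               if any(iou(t, d) > thresh for d in dlboxes)]
    # Step 3: discard dlbox entries duplicated by a retained trbox
    kept_dl = [d for d in dlboxes
               if all(iou(d, t) <= thresh for t in kept_tr)]
    # Step 5: the union of both filtered sets is the final result
    return kept_tr + kept_dl
```

Traditional boxes survive only when deep learning confirms them, while deep-learning boxes fill in whatever the traditional pass missed.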
Metrics
True Positive (TP), True Negative (TN), False Positive (FP), False Negative (FN) are defined, and the standard formulas for Precision = TP/(TP+FP), Recall = TP/(TP+FN), and IOU = intersection/union are used to evaluate the methods.
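These formulas translate directly into code (a trivial sketch; guard clauses for empty denominators are an addition, not part of the source):

```python
def precision(tp, fp):
    # Fraction of predicted boxes that are correct: TP / (TP + FP)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    # Fraction of ground-truth boxes that were found: TP / (TP + FN)
    return tp / (tp + fn) if (tp + fn) else 0.0
```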
Experimental Results
On 50 randomly sampled Xianyu waterfall‑flow images (96 cards total), traditional methods detected 65 cards, deep‑learning detected 97, and the fused approach detected 98 with superior precision, recall, and IOU. Detailed tables and charts illustrate the comparison.
Complex Background Content Extraction
Extracting specific content from complex backgrounds is challenging for both traditional image processing (low recall) and semantic segmentation (no pixel‑level restoration). The proposed pipeline uses a detection network for content recall, gradient‑based region judgment, and a SR‑GAN to restore elements in complex regions.
Why GAN?
The SR‑GAN incorporates a feature‑map loss to preserve high‑frequency details, an adversarial loss to reduce false detections, and can reconstruct pixel values behind semi‑transparent overlays—something pure segmentation cannot achieve.
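The combined objective can be sketched schematically, following the SRGAN-style losses the text names (the weighting factor λ and the exact norm are assumptions):

```latex
\mathcal{L} =
\underbrace{\bigl\lVert \phi(I_{\text{gen}}) - \phi(I_{\text{gt}}) \bigr\rVert_2^2}_{\text{feature-map (content) loss}}
\;+\;
\lambda \underbrace{\bigl(-\log D(I_{\text{gen}})\bigr)}_{\text{adversarial loss}}
```

where $\phi$ is a fixed feature extractor (e.g., VGG feature maps), $D$ is the discriminator, $I_{\text{gen}}$ is the restored element, and $I_{\text{gt}}$ the ground truth. The feature-map term preserves high-frequency detail; the adversarial term penalizes implausible reconstructions.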
Training Flow
Business Deployments
The solution is already used in the imgcook image pipeline (≈73% accuracy for generic scenes, >92% for specific card layouts) and in Taobao’s automated testing for major promotions, achieving >97% precision and recall.
Future Work
Planned improvements include richer layout identification (listview, gridview, waterfall), higher accuracy for small objects via Feature Pyramid Networks and Cascade R‑CNN, broader page coverage beyond Xianyu and Taobao, and an image‑sample generator to lower onboarding effort.
Taobao Frontend Technology
The frontend landscape is constantly evolving, with rapid innovation across familiar languages, and our understanding of the frontend is continually refreshed along with it. Join us at Taobao, a vibrant, all-encompassing platform, to explore its limitless potential.