Frontend Development

Image-to-UI Code Generation Demo and Architecture Overview

The Xianyu team’s new “black‑tech” system automatically transforms UI mockup images into production‑ready code. It detects components with deep‑learning models, extracts layouts via OpenCV, and runs on a modular stream‑oriented architecture of unit, task, and server layers that enables rapid testing, flexible composition, and future enhancements such as improved container recognition and semantic understanding.

Xianyu Technology

For a long time, converting visual mockups into exact UI code has been a labor‑intensive task for front‑end developers, requiring extensive communication with designers.

The Xianyu team built a "black‑tech" system that translates images directly into UI code, and a demo video showcases the result.

Selection Background – Images are chosen as input because they are the final, deterministic artifact of design, avoid upstream constraints, and enable broader scenarios such as automated testing and competitor screenshot analysis.

Process Overview – The pipeline first uses deep‑learning models to detect UI elements (basic components like ImageView, TextView, custom BI components, and business components). Detected elements are then extracted using OpenCV‑based rendering analysis.
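To make the extraction step concrete, here is a minimal sketch of finding element bounding boxes in a binarized mockup. The article only says the system uses OpenCV‑based analysis; this pure‑Python flood fill approximates what OpenCV's `findContours` plus `boundingRect` would do, and the function name and sample grid are illustrative, not the team's actual code.

```python
from collections import deque

def extract_boxes(grid):
    """Find bounding boxes of connected foreground regions (1s) in a
    binary image, approximating cv2.findContours + cv2.boundingRect."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if grid[y][x] and not seen[y][x]:
                # BFS flood fill to collect one connected component
                q = deque([(y, x)])
                seen[y][x] = True
                min_x = max_x = x
                min_y = max_y = y
                while q:
                    cy, cx = q.popleft()
                    min_x, max_x = min(min_x, cx), max(max_x, cx)
                    min_y, max_y = min(min_y, cy), max(max_y, cy)
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx),
                                   (cy, cx + 1), (cy, cx - 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                # (x, y, width, height), matching cv2.boundingRect's shape
                boxes.append((min_x, min_y,
                              max_x - min_x + 1, max_y - min_y + 1))
    return boxes

mockup = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
print(extract_boxes(mockup))  # → [(0, 0, 2, 2), (4, 0, 1, 2)]
```

In the real pipeline each box would then be matched against the deep‑learning detector's output to label it as an ImageView, TextView, or business component.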

The overall workflow is illustrated by the accompanying diagrams.

Architecture Evolution – The original linear pipeline suffered from tight coupling; a new stream‑oriented architecture introduces three layers: unit (fine‑grained functions), tasks (combinations of units), and server (service provision). This modular design enables mock‑based testing, rapid composition, and reduced upstream/downstream impact.
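The unit/task layering can be sketched as plain function composition. Everything below is hypothetical (the article does not show the team's code): two stub units stand in for detection and layout extraction, and `compose` builds a task by chaining them, which is what makes mock‑based testing and rapid recomposition cheap.

```python
# Hypothetical units: fine-grained functions with one responsibility each.
def detect_components(image):
    # Stub standing in for the deep-learning detector.
    return [{"type": "TextView", "box": (0, 0, 100, 20)}]

def extract_layout(components):
    # Stub standing in for the OpenCV layout-extraction step.
    return {"root": {"children": components}}

def compose(*units):
    """Task layer: chain units so each output feeds the next input."""
    def task(data):
        for unit in units:
            data = unit(data)
        return data
    return task

# A task is just a composition; the server layer would expose it as a service.
image_to_dsl = compose(detect_components, extract_layout)
result = image_to_dsl("mockup.png")
print(result)
```

Because each unit is an ordinary function, any stage can be swapped for a mock in tests, and reordering or inserting units does not disturb the stages upstream or downstream of it.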

Additionally, a unified client‑server execution model allows developers to focus on UI‑service communication without worrying about deployment details.

Layout Problem Analysis – The system converts a static DSL into a layout‑property tree by analyzing element positions, spacing, and container locations, referencing Flex and Grid standards. Semantic similarity is used to resolve ambiguous cases and produce more human‑like code.
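A toy version of that position‑to‑properties step: given sibling bounding boxes, infer a flex‑like direction and the gaps between elements. The heuristics and names here are assumptions for illustration; the real system handles far more cases (containers, backgrounds, Grid layouts, semantic disambiguation).

```python
def infer_layout(boxes):
    """Infer a flex-like direction and gaps from sibling bounding
    boxes given as (x, y, w, h) tuples in absolute coordinates."""
    boxes = sorted(boxes, key=lambda b: (b[1], b[0]))  # reading order
    xs = {b[0] for b in boxes}
    ys = {b[1] for b in boxes}
    if len(ys) == 1:
        # All elements share one top edge -> horizontal row;
        # gap = next element's left edge minus this one's right edge.
        direction = "row"
        gaps = [boxes[i + 1][0] - (boxes[i][0] + boxes[i][2])
                for i in range(len(boxes) - 1)]
    elif len(xs) == 1:
        # All elements share one left edge -> vertical column.
        direction = "column"
        gaps = [boxes[i + 1][1] - (boxes[i][1] + boxes[i][3])
                for i in range(len(boxes) - 1)]
    else:
        # Mixed rows and columns: fall back to a grid-style container.
        direction, gaps = "grid", []
    return {"direction": direction, "gaps": gaps}

# Three equally spaced elements on one baseline
print(infer_layout([(0, 0, 40, 20), (50, 0, 40, 20), (100, 0, 40, 20)]))
# → {'direction': 'row', 'gaps': [10, 10]}
```

Emitting `direction` and `gaps` rather than absolute coordinates is what makes the generated code resemble what a human would write with Flex properties.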

Current Status and Future – The project is already running in production, has attracted attention from Google, and aims to further improve container and background recognition, semantic understanding, and eventually replace manual slicing entirely. Future work includes extending to weak‑interaction, high‑visual scenarios and collaborating with D2C projects.

Tags: frontend, deep learning, image analysis, layout detection, UI generation
Written by Xianyu Technology

Official account of the Xianyu technology team