How Web Front‑End Teams Build Next‑Gen Digital Humans for the Metaverse

This article, originally presented at the 16th D2 Front‑End Technology Forum, explores the current state, market potential, and core technologies behind virtual digital humans—covering their appearance, behavior, and intelligence—while detailing practical pipelines, rendering techniques, and future challenges for web‑based implementations.

Taobao Frontend Technology

This article, based on a talk delivered at the 16th D2 Front-End Technology Forum, outlines the state of the virtual digital human industry, showcases notable examples, introduces the much-discussed metaverse concept, and examines the core technologies before looking ahead to future business value and strategy.

Digital Human Overview

Digital humans are defined by three core elements: Shape – a human‑like appearance with distinct facial features; Motion – behavior similar to humans expressed through language, facial expressions, and body movements; Intelligence – the ability to perceive the environment and interact intelligently.

Market Status

Virtual idols have grown rapidly across e-commerce, finance, film, and gaming. For example, China's virtual idol market reached ¥3.46 billion in 2020 and was projected to reach ¥6.22 billion in 2021. The market has moved through three phases: Startup (high entry barriers, unproven technology), Growth (more competitors, maturing technology, lower barriers), and Platform (a crowded "red ocean" market with mature platforms, where a few leading players coexist with niche ones).

Solution Layers

Infrastructure Layer: Provides hardware (displays, optics, sensors, chips) and software (modeling tools, rendering engines). Only a few top tech firms have strong capabilities.

Platform Layer: Offers integrated hardware-software systems, production service platforms, and AI capability platforms for virtual avatar creation.

Application Layer: Enterprises with strong marketing and operation skills apply these platforms to create innovative experiences.

Team Structure

In early 2021, Alibaba's Front-End Committee formed a Virtual Character Group made up of the Taobao Interactive Team, the DAMO Academy Digital Human Team, the Youku Digital Human Production Team, the Kaola Interactive & Content Shopping Team, and the Ant Financial Content Community Team. The group focuses on three scenarios: games, video, and live streaming.

Shape (形) Pipeline

The shape stage defines the avatar’s basic body proportions (e.g., 7‑head or 5‑head models) and gender or anthropomorphic style. Production occurs mainly in 3D DCC tools, requiring close collaboration between artists and engineers.

Create a white-box model and rig it in Maya, store the assets in OSS, and provide preview tools.

Generate textures in Photoshop and upload them to a CDN.

Develop a custom glTF exporter plugin for Maya that exports the model, skeleton, materials, and textures.

Adjust materials in a web-embedded material editor.

Import the glTF file into the EVA Figure engine, customize shaders, and render.
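
A small check between the export and import steps can catch an incomplete export early. The sketch below is not part of the EVA Figure engine or the Maya plugin; it only reads the standard top-level glTF 2.0 JSON arrays, and the names `summarizeAvatarAsset` and `AvatarAssetSummary` are hypothetical.

```typescript
// Minimal subset of the glTF 2.0 JSON schema: only the arrays this sketch inspects.
interface GltfDocument {
  asset: { version: string };
  meshes?: { name?: string }[];
  skins?: { joints: number[] }[]; // joints are indices into the node list
  materials?: { name?: string }[];
  images?: { uri?: string }[];    // images may also be embedded, with no uri
}

interface AvatarAssetSummary {
  meshCount: number;
  jointCount: number;
  materialCount: number;
  textureUris: string[];
}

// Hypothetical pre-import check: count the four asset kinds the exporter
// is expected to write (model, skeleton, materials, textures).
function summarizeAvatarAsset(json: string): AvatarAssetSummary {
  const doc = JSON.parse(json) as GltfDocument;
  const version = doc.asset?.version;
  if (typeof version !== "string" || !version.startsWith("2.")) {
    throw new Error("expected a glTF 2.x document");
  }

  // A joint shared by several skins is counted once.
  const joints = new Set<number>();
  for (const skin of doc.skins ?? []) {
    for (const joint of skin.joints) joints.add(joint);
  }

  return {
    meshCount: (doc.meshes ?? []).length,
    jointCount: joints.size,
    materialCount: (doc.materials ?? []).length,
    textureUris: (doc.images ?? [])
      .map((image) => image.uri)
      .filter((uri): uri is string => typeof uri === "string"),
  };
}
```

A summary with zero joints or zero materials would suggest the exporter dropped the skeleton or the material data before the file reached the engine.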

Face Editing (捏脸) Techniques

Bone Skinning

Uses a skeleton to drive vertex transformations (translation, rotation, scaling). Approximately 20 facial bones allow adjustments of head size, eye position, cheekbones, etc.
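
To make the mechanism concrete, here is a minimal CPU-side sketch of linear blend skinning, the usual way bones drive vertices. It assumes each joint matrix already combines the bone's current transform with its inverse bind matrix; the helper names (`skinVertex`, `translation`, `scaling`) are illustrative and not taken from any engine.

```typescript
type Vec3 = [number, number, number];
type Mat4 = number[]; // 16 numbers, column-major as in WebGL

// Apply a 4x4 affine matrix to a point (w is assumed to be 1).
function transformPoint(m: Mat4, p: Vec3): Vec3 {
  const [x, y, z] = p;
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12],
    m[1] * x + m[5] * y + m[9] * z + m[13],
    m[2] * x + m[6] * y + m[10] * z + m[14],
  ];
}

// Linear blend skinning: transform the vertex by every influencing bone
// and mix the results by the per-vertex weights (expected to sum to 1).
function skinVertex(
  position: Vec3,
  joints: number[],
  weights: number[],
  jointMatrices: Mat4[],
): Vec3 {
  const out: Vec3 = [0, 0, 0];
  for (let i = 0; i < joints.length; i++) {
    const w = weights[i];
    if (w === 0) continue;
    const p = transformPoint(jointMatrices[joints[i]], position);
    out[0] += w * p[0];
    out[1] += w * p[1];
    out[2] += w * p[2];
  }
  return out;
}

// Helpers for building test matrices.
function translation(tx: number, ty: number, tz: number): Mat4 {
  return [1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, tx, ty, tz, 1];
}

function scaling(s: number): Mat4 {
  return [s, 0, 0, 0, 0, s, 0, 0, 0, 0, s, 0, 0, 0, 0, 1];
}
```

For face editing, a slider such as "eye position" would change one bone's translation, and every vertex weighted to that bone follows in proportion to its weight. In production the same blend normally runs in the vertex shader rather than in TypeScript.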

Morph Targets (Blend Shapes)

Provides fine‑grained vertex deformation by defining a base shape and an extreme target, then interpolating with a weight. This enables detailed facial expressions but incurs higher computational cost.
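
The interpolation can be sketched in a few lines. This example assumes positions are stored as flat `[x, y, z, x, y, z, ...]` arrays and that each target has the same vertex count as the base; `applyMorphTargets` is a hypothetical name.

```typescript
// Blend a base mesh toward one or more morph targets.
// result = base + sum over targets of weight * (target - base)
function applyMorphTargets(
  base: number[],
  targets: number[][],
  weights: number[],
): number[] {
  const out = base.slice();
  targets.forEach((target, t) => {
    const w = weights[t] ?? 0;
    if (w === 0) return; // inactive targets cost nothing
    for (let i = 0; i < base.length; i++) {
      out[i] += w * (target[i] - base[i]);
    }
  });
  return out;
}
```

Every active target touches every vertex component, which is where the higher cost comes from compared with moving a handful of bones.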

Clothing (换装) and Mesh Interaction

Clothing meshes share the same skeleton as the body, enabling synchronized movement. To avoid “clipping” where body geometry penetrates clothing, hidden body parts are masked during rendering.
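
The article does not say how the masking is implemented. One common approach, assumed here, is to split the body mesh into named regions and let each garment declare which regions it covers; the region names and the `Garment` shape below are hypothetical.

```typescript
type BodyRegion = "head" | "torso" | "arms" | "hands" | "legs" | "feet";

interface Garment {
  id: string;
  covers: BodyRegion[]; // body regions fully hidden by this garment
}

// Return the body regions that should still be drawn for a given outfit.
function visibleBodyRegions(all: BodyRegion[], outfit: Garment[]): BodyRegion[] {
  const hidden = new Set<BodyRegion>();
  for (const garment of outfit) {
    for (const region of garment.covers) hidden.add(region);
  }
  return all.filter((region) => !hidden.has(region));
}
```

With a jacket covering the torso and arms and trousers covering the legs, only the head, hands, and feet of the body mesh would be rendered, so there is no skin geometry left underneath to poke through the cloth.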

Rendering Styles

PBR (Physically Based Rendering)

PBR simulates real‑world light interaction, producing photorealistic results. It supports subsurface scattering for realistic skin tones.

NPR (Non‑Photorealistic Rendering)

NPR focuses on artistic styles such as cartoon outlines, cel shading, and sketch effects, commonly used for anime‑style avatars.
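
The difference between the two styles shows up even in the diffuse term. The sketch below computes a standard Lambert intensity and then quantizes it into flat bands, which is the core of cel shading. It runs on the CPU for clarity; in practice this logic would live in a fragment shader, and the band count is an arbitrary illustrative choice.

```typescript
type Vec3 = [number, number, number];

function normalize(v: Vec3): Vec3 {
  const length = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / length, v[1] / length, v[2] / length];
}

// Lambert diffuse intensity in [0, 1]: a smooth falloff with the angle
// between the surface normal and the direction toward the light.
function lambert(normal: Vec3, lightDir: Vec3): number {
  const n = normalize(normal);
  const l = normalize(lightDir);
  return Math.max(0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2]);
}

// Cel shading: snap the smooth intensity to `bands` flat steps (bands >= 2).
function celShade(intensity: number, bands: number): number {
  const clamped = Math.min(1, Math.max(0, intensity));
  const step = Math.min(bands - 1, Math.floor(clamped * bands));
  return step / (bands - 1);
}
```

With three bands, every intensity collapses to 0, 0.5, or 1, producing the hard shadow edges typical of anime-style avatars instead of a smooth gradient.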

Motion (动) Pipeline

Expressions and Actions

Facial animation combines bone skinning for coarse movements and morph targets for detailed expressions (≈50 blend shapes). Body motion relies on skeletal animation where mesh vertices are weighted to bones.

Motion Capture

To reduce manual keyframing, motion capture records the movements of real performers. Capture approaches can be grouped into four quadrants (for example, optical recognition with mobile cameras), and the captured data is then integrated into the animation pipeline.

Director System

Pre-produced motion clips are orchestrated like a script, allowing flexible composition of virtual performances. Transitions between clips use animation blending (linear or Bézier interpolation).
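
A rough sketch of how such a script could be evaluated: given a list of cues and the current time, compute the blend weight of each clip, cross-fading from the previous clip at the start of every cue. The `Cue` shape, the fade parameter, and the function names are assumptions for illustration, not the director system's actual interface.

```typescript
type Easing = (t: number) => number;

const linear: Easing = (t) => t;

// 1D cubic Bézier from 0 to 1 with inner control values p1 and p2.
// cubicBezier1D(0, 1) gives a slow start and a slow end.
function cubicBezier1D(p1: number, p2: number): Easing {
  return (t) => {
    const u = 1 - t;
    return 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t;
  };
}

interface Cue {
  clip: string;  // name of a pre-produced motion clip
  start: number; // seconds from the start of the performance
}

// Weights of the clips active at `time`. The script must be sorted by start.
// During the first `fade` seconds of a cue, the previous clip fades out
// while the new one fades in; the two weights always sum to 1.
function clipWeights(
  script: Cue[],
  time: number,
  fade: number,
  ease: Easing,
): Record<string, number> {
  const weights: Record<string, number> = {};
  let current = -1;
  for (let i = 0; i < script.length; i++) {
    if (script[i].start <= time) current = i;
  }
  if (current < 0) return weights; // before the first cue

  const elapsed = time - script[current].start;
  const fading = current > 0 && fade > 0 && elapsed < fade;
  const w = fading ? ease(elapsed / fade) : 1;
  weights[script[current].clip] = w;
  if (w < 1) {
    const previous = script[current - 1].clip;
    weights[previous] = (weights[previous] ?? 0) + (1 - w);
  }
  return weights;
}
```

The resulting weights would then be used to mix the sampled poses of the two clips bone by bone. Swapping `linear` for `cubicBezier1D(0, 1)` changes only how quickly the new clip takes over, not the structure of the script.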

Intelligence (神) and Speech Synthesis

Natural language capabilities rely on TTS engines (e.g., Alibaba DAMO’s TTS). To convey emotion, style‑aware synthesis combines large‑scale data and deep learning, producing expressive speech beyond generic TTS.

Web Challenges and Future Directions

WebGL lags behind native APIs (Vulkan, DirectX, Metal) in performance, and device fragmentation creates compatibility issues. To bridge the gap, Alibaba explores serverless cloud rendering, EVA Figure combined with Puppeteer, and emerging WebGPU/WASM technologies for higher‑fidelity rendering and offline image generation.
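
One way to cope with device fragmentation is to probe what the browser offers and fall back step by step. The sketch below is an assumption about how such a selection could look, not Alibaba's actual strategy: the `Backend` values and the "cloud" fallback label are hypothetical, while `navigator.gpu` and the `"webgl2"` / `"webgl"` context names are the real browser entry points.

```typescript
type Backend = "webgpu" | "webgl2" | "webgl" | "cloud";

interface Capabilities {
  webgpu: boolean;
  webgl2: boolean;
  webgl: boolean;
}

// Structural stand-in for HTMLCanvasElement so the sketch also runs outside a browser.
interface CanvasLike {
  getContext(kind: string): unknown;
}

// Probe each API on a fresh canvas: once a canvas has handed out one
// context type, asking the same canvas for a different type returns null.
// Note that "gpu" in navigator only shows the API is exposed; requesting
// an adapter can still fail on unsupported hardware.
function detectCapabilities(
  createCanvas: () => CanvasLike,
  nav: object,
): Capabilities {
  return {
    webgpu: "gpu" in nav,
    webgl2: createCanvas().getContext("webgl2") != null,
    webgl: createCanvas().getContext("webgl") != null,
  };
}

// Prefer the most capable local API; with none available, hand off to
// server-side (cloud) rendering.
function pickBackend(caps: Capabilities): Backend {
  if (caps.webgpu) return "webgpu";
  if (caps.webgl2) return "webgl2";
  if (caps.webgl) return "webgl";
  return "cloud";
}
```

In a browser this would be called as `pickBackend(detectCapabilities(() => document.createElement("canvas"), navigator))`.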
