Li Mu Returns to Bilibili with a Real-Time AI Avatar
Li Mu (沐神) returns to Bilibili after a year away to showcase Higgs Avatar v1, a fully AI-generated, real-time digital human that can listen, speak, lip-sync, and display facial expressions. The reported performance is about 16 ms per frame on a single H100 GPU. Potential applications range from customer service to training, and the technology also raises ethical questions about identity and trust.
After a year away from Bilibili, Li Mu (known to fans as “沐神”, roughly “Mu God”) released Higgs Avatar v1, a real-time AI avatar that can listen, speak, lip-sync, and show expressive facial movements, all generated directly from a static image.
Traditional digital humans fall into two categories: pre-rendered video clips and virtual anchors whose motions are driven by preset templates. Higgs Avatar v1 departs from both by attaching a “face” to a speech-enabled AI agent: every frame is generated on the fly by the model, unscripted, with no pre-rendered animation pipeline.
The system’s pipeline runs on a single NVIDIA H100 GPU, producing a frame in roughly 16 ms. This meets the real‑time dialogue threshold of 62.5 ms and allows up to eight concurrent conversations on one H100, indicating a focus on deployable real‑time services rather than demo‑only showcases.
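The figures quoted above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation using the two numbers from the article (16 ms per frame, 62.5 ms threshold); the function names are hypothetical, and nothing here reflects how Higgs Avatar v1 actually schedules its work.

```python
# Back-of-the-envelope latency arithmetic using the article's two figures.
# Function names are illustrative only; they are not part of any real API.

FRAME_MS = 16.0    # reported time to generate one frame on a single H100
BUDGET_MS = 62.5   # real-time dialogue threshold quoted in the article


def frames_per_second(frame_ms: float) -> float:
    """Frame rate implied by a per-frame generation time."""
    return 1000.0 / frame_ms


def sequential_frames_within(budget_ms: float, frame_ms: float) -> int:
    """Whole frames that fit in the budget if generated strictly one after another."""
    return int(budget_ms // frame_ms)


fps = frames_per_second(FRAME_MS)                       # 62.5 frames per second
headroom = BUDGET_MS / FRAME_MS                         # 3.90625x under the threshold
slots = sequential_frames_within(BUDGET_MS, FRAME_MS)   # 3 whole frames

print(f"implied frame rate: {fps} fps")
print(f"headroom under threshold: {headroom:.2f}x")
print(f"sequential frames per budget window: {slots}")
```

Under these assumptions, strictly sequential generation fits only three frames into one 62.5 ms window. The eight-conversation figure therefore presumably depends on batching or on a different way of accounting for the budget, which the article does not describe.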
A responsive face matters because human conversation relies on non‑verbal cues—head nods, eye contact, smiles—that convey attention and build trust. Adding a real‑time facial interface transforms an AI voice assistant from a simple button into a service window that can engage users more naturally.
Early target scenarios include insurance consultation, interview coaching, sales assistance, corporate training, and customer support—situations that require repeated interactions but not deep emotional trust. More advanced use cases such as remote medical advice or emotional companionship are also envisioned, though they raise additional ethical and legal questions.
Ethical challenges grow as avatars become more human-like. Identity rights, users' awareness that they are talking to an AI, and the handling of emotional dependence cannot be solved by model metrics alone; they require product-level rules and guidelines.
In summary, Higgs Avatar v1 marks the next step in a progression from text-based interfaces to voice and now to a real-time, expressive face, moving AI agents closer to genuine “face-to-face” service interactions.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Machine Learning Algorithms & Natural Language Processing
Focused on frontier AI technologies, empowering AI researchers' progress.
