
Design and Implementation of a Multimodal Real-Time Voice AI Teammate for Naraka: Bladepoint

This article explains the design, implementation, and underlying Agent-Oriented-Programming (AOP) framework of NetEase Fuxi's multimodal real-time voice AI teammate for the mobile game Naraka: Bladepoint, highlighting capabilities such as autonomous navigation, combat assistance, natural dialogue, and teaching, as well as broader applications of voice technology in games.

DataFunSummit

The article introduces NetEase Fuxi, China’s first game AI research institute, which has published over 270 academic papers and holds more than 600 patents across AI, metaverse, digital twin, and intelligent decision‑making fields.

It describes the launch of a novel game Copilot: a multimodal real-time voice AI teammate for the mobile game Naraka: Bladepoint. The AI teammate can autonomously navigate maps, execute combat actions, respond to voice commands, report battle status, engage in free conversation, and provide emotional support, especially benefiting socially anxious players.

Key technical capabilities are detailed: (1) robust speech recognition without wake‑word activation, handling noise, accents, and domain‑specific terms; (2) a data‑closed‑loop training pipeline built on an Agent‑Oriented‑Programming (AOP) framework that enables autonomous model evolution; (3) integration of large language models with text‑to‑speech for natural dialogue; (4) diverse persona options (e.g., cute girl, gentle lady, warm male) to enrich player interaction; (5) a knowledge‑base powered Q&A system using embedding retrieval, RAG, and advanced LLMs; (6) reinforcement‑learning‑based combat decision making for executing player commands.
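The knowledge-base Q&A capability in point (5) follows the standard embedding-retrieval pattern: embed the player's question, rank knowledge-base entries by similarity, and assemble the top hits into an LLM prompt. The sketch below is a minimal illustration of that flow, not Fuxi's implementation; the trigram-hash `embed` function is a toy stand-in for a real trained encoder, and the knowledge-base entries are invented examples.

```python
import math

def embed(text: str) -> list[float]:
    # Toy embedding stand-in: hash character trigrams into a small
    # fixed-size vector, then L2-normalize. A real system would call
    # a trained text encoder here.
    vec = [0.0] * 16
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % 16] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

# Invented example entries; a real knowledge base would hold game docs.
KNOWLEDGE_BASE = [
    "Bosses drop rare gear in the final circle.",
    "The grappling hook can interrupt enemy charged attacks.",
]
INDEX = [(doc, embed(doc)) for doc in KNOWLEDGE_BASE]

def retrieve(question: str, top_k: int = 1) -> list[str]:
    # Rank all indexed documents by similarity to the question.
    q = embed(question)
    ranked = sorted(INDEX, key=lambda d: cosine(q, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def build_prompt(question: str) -> str:
    # Assemble retrieved context plus the question for the LLM (RAG).
    context = "\n".join(retrieve(question))
    return f"Answer using only this game knowledge:\n{context}\n\nQ: {question}\nA:"
```

In production the prompt would be sent to the LLM and the answer passed to text-to-speech, closing the voice loop described above.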

The AOP framework is explained as a programming paradigm that models tasks as Markov Decision Processes, creating a closed data loop between agents and the environment so that policies improve continuously. An example IDL definition is compiled into runtime code with both synchronous and asynchronous interfaces, enabling rapid deployment of agents such as ASR services.
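The article does not reproduce the IDL or the generated code, but the pattern it describes, one service definition yielding both a blocking and an awaitable call surface, can be sketched as follows. The class name, method names, and placeholder transcription logic are illustrative assumptions, not Fuxi's actual SDK.

```python
import asyncio

# Hypothetical sketch of what an AOP runtime might generate from an
# IDL entry like `service ASR { rpc Transcribe(Audio) returns (Text) }`.
class ASRService:
    def transcribe(self, audio: bytes) -> str:
        # Synchronous interface. Placeholder logic; a deployed agent
        # would invoke the speech-recognition model here.
        return f"<transcript of {len(audio)} bytes>"

    async def transcribe_async(self, audio: bytes) -> str:
        # Asynchronous interface: wraps the sync call in a worker
        # thread so callers can await it without blocking the loop.
        return await asyncio.to_thread(self.transcribe, audio)

svc = ASRService()
sync_result = svc.transcribe(b"\x00" * 1600)               # blocking call
async_result = asyncio.run(svc.transcribe_async(b"\x00" * 1600))  # awaitable call
```

Generating both surfaces from one definition lets the same agent serve low-latency in-game requests and batched offline training traffic, which is what enables the closed data loop.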

Beyond the AI teammate, the article surveys other game-scene voice applications: real-time synthesis of NPC dialogue in Justice Online, customizable skill voice shout-outs, singing voice synthesis, and voice conversion technologies (the DualVC series). It also mentions NetEase Fuxi's research on speech codecs and Speech LLMs that achieve high-fidelity, controllable voice generation.

Finally, the article thanks the audience and provides links for further reading and upcoming AOP SDK beta testing.

Tags: multimodal, real-time interaction, game AI, voice AI, agent-oriented programming, Naraka: Bladepoint
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
