Applying Large Language Models to NPC Role‑Playing and Game Localization at Tencent
This article details Tencent's practical exploration of large language model deployment in overseas game scenarios, covering the design of customized NPC role‑playing models, multilingual localization pipelines, data construction, training, evaluation frameworks, multi‑agent improvement loops, and insights from a comprehensive Q&A session.
Tencent shares its experience deploying large language models (LLMs) in overseas game contexts, focusing on two main scenarios: NPC role‑playing and game localization translation.
For NPC role‑playing, generic LLMs often sound overly formal and lack personality, so Tencent builds a specialized model with a million‑scale dataset sourced from novels, scripts, and games, using a "5+3" schema (name, gender, age, personality, background, plus actions, dialogue style, knowledge). Training involves targeted fine‑tuning, safety‑question datasets, and DPO optimization to enforce knowledge boundaries.
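The "5+3" schema can be pictured as a structured character card that is rendered into a role‑play prompt. A minimal sketch follows; the field names, the `to_system_prompt` helper, and the Socrates example values are all illustrative assumptions, not the schema Tencent published.

```python
from dataclasses import dataclass, field

@dataclass
class CharacterCard:
    # Core "5": basic identity attributes
    name: str
    gender: str
    age: str
    personality: str
    background: str
    # Extended "3": behavioral attributes
    actions: list = field(default_factory=list)    # typical mannerisms
    dialogue_style: str = ""                       # speech register
    knowledge: list = field(default_factory=list)  # in-world facts the NPC may know

    def to_system_prompt(self) -> str:
        """Render the card as a role-play system prompt with a knowledge boundary."""
        return (
            f"You are {self.name}, a {self.age}-year-old {self.gender}. "
            f"Personality: {self.personality}. Background: {self.background}. "
            f"Speak in this style: {self.dialogue_style}. "
            f"You only know about: {', '.join(self.knowledge)}. "
            "Refuse questions outside this knowledge, staying in character."
        )

socrates = CharacterCard(
    name="Socrates", gender="man", age="70",
    personality="ironic, probing", background="Athenian philosopher",
    actions=["answers questions with questions"],
    dialogue_style="Socratic questioning",
    knowledge=["classical Athens", "the dialectic method"],
)
print(socrates.to_system_prompt())
```

The closing "refuse" instruction is where the knowledge boundary lives; in training, DPO preference pairs can then reward in-character refusals over out-of-character answers.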
The evaluation framework consists of three tiers: basic language ability, identity‑consistent style and skills, and advanced subjective traits. Benchmark results show that even strong models like GPT‑4o can appear stiff, illustrated by a Socrates dialogue case.
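A three-tier rubric like this is easy to operationalize as per-dimension scores rolled up per tier. The sketch below is an assumed structure: the dimension names, the 0–5 scale, and the plain averaging are illustrative stand-ins for the actual framework.

```python
# Hypothetical three-tier rubric: tier 1 = basic language ability,
# tier 2 = identity-consistent style and skills, tier 3 = advanced
# subjective traits. Dimension names are illustrative.
RUBRIC = {
    "basic_language": ["fluency", "coherence"],
    "identity_style": ["persona_consistency", "style_fit", "skill_use"],
    "subjective": ["emotional_depth", "engagement"],
}

def aggregate(scores: dict) -> dict:
    """Average per-dimension scores (0-5) into one score per tier."""
    out = {}
    for tier, dims in RUBRIC.items():
        vals = [scores[d] for d in dims if d in scores]
        out[tier] = round(sum(vals) / len(vals), 2) if vals else None
    return out

example = {"fluency": 5, "coherence": 4, "persona_consistency": 3,
           "style_fit": 3, "skill_use": 4, "emotional_depth": 2, "engagement": 3}
print(aggregate(example))
# A model that scores well on tier 1 but poorly on tier 3 is exactly
# the "fluent but stiff" failure mode seen in the Socrates case.
```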
In the localization translation track, three translation categories are identified (in‑game UI, storyline, user‑generated content and operational events). Challenges include missing game‑specific terminology, evolving slang, and contextual nuances. Tencent enhances LLMs with retrieval‑augmented generation (RAG), term‑embedding, and negative‑sample training to filter noisy retrieval results.
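The glossary-retrieval step can be sketched as embedding-based term lookup with a similarity filter. This is a toy illustration under stated assumptions: a bag-of-characters vector stands in for a trained term-embedding encoder, and a simple similarity threshold stands in for the negative-sample-trained filter that rejects noisy retrievals; the glossary entries are invented examples.

```python
import math

# Toy game glossary (source term -> approved translation).
GLOSSARY = {
    "大招": "ultimate skill",
    "副本": "dungeon instance",
    "氪金": "pay-to-win spending",
}

def embed(text: str) -> dict:
    """Bag-of-characters vector; a stand-in for a learned term encoder."""
    vec = {}
    for ch in text:
        vec[ch] = vec.get(ch, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_terms(source: str, threshold: float = 0.3) -> list:
    """Return glossary entries similar enough to the source sentence.
    The threshold plays the role of the noise filter."""
    src = embed(source)
    return [(term, tgt) for term, tgt in GLOSSARY.items()
            if cosine(embed(term), src) >= threshold]

print(retrieve_terms("他在副本里放了大招"))
```

Retrieved pairs would then be injected into the translation prompt so the LLM uses approved terminology instead of guessing.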
A continuous improvement loop—translation → evaluation → correction—is implemented using a multi‑agent chain and MQM scoring, complemented by offline model iteration and online A/B testing to monitor user feedback, usage frequency, and quality metrics, especially for low‑resource languages.
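The translate → evaluate → correct chain can be sketched as three cooperating agents with an MQM-style penalty score as the stopping criterion. In this sketch the three agents are plain functions standing in for LLM calls, and the error categories, penalty weights, and per-100-words normalization are illustrative assumptions in the spirit of MQM, not Tencent's exact scoring.

```python
def translate_agent(source: str) -> str:
    # Placeholder for an LLM translation call.
    return source.upper()

def evaluate_agent(translation: str) -> list:
    """Return MQM-style issues as (category, severity_penalty) pairs."""
    issues = []
    if "TODO" in translation:
        issues.append(("omission", 5.0))  # major-severity penalty
    return issues

def correct_agent(translation: str, issues: list) -> str:
    # Placeholder for an LLM correction call guided by the issue list.
    return translation.replace("TODO", "").strip()

def mqm_score(issues: list, word_count: int) -> float:
    """Normalized penalty per 100 words; 0.0 means no detected issues."""
    return 100.0 * sum(p for _, p in issues) / max(word_count, 1)

def pipeline(source: str, max_rounds: int = 3) -> tuple:
    draft = translate_agent(source)
    for _ in range(max_rounds):
        issues = evaluate_agent(draft)
        if not issues:
            break
        draft = correct_agent(draft, issues)
    return draft, mqm_score(evaluate_agent(draft), len(source.split()))

print(pipeline("attack the boss TODO"))
```

Online, the same score can feed the A/B dashboards alongside user feedback and usage frequency, which matters most for low-resource languages where offline test sets are thin.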
The Q&A section addresses online vs. offline correction, expert evaluation dimensions, benefits of multi‑agent designs, NPC memory mechanisms, strategies for mitigating the curse of multilinguality, and practical data collection methods for low‑resource languages.
DataFunSummit
Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.