
How Tencent Transformed Game Operations: From Manual Tasks to Microservice Automation

This article chronicles Tencent's game operations service evolution—from early manual processes and script‑driven releases to automated tooling, microservice‑based design, and data‑driven metrics—highlighting key service definitions, eight product‑focused service categories, real‑world case studies, and the roadmap toward intelligent, 4.0‑level operations.

Efficient Ops

Operation Service Definition

Before discussing the service system, it is essential to define what operation service means.

Release, change, and incident handling, together with an SLA, make up the conventional notion of an operation service, but this is only the basic layer.

We redefine an operation service as a value-added, billable service that the product or business team actively pays attention to. It has three characteristics:

User focus

Value creation

Billable

Tencent Game Operations Service Evolution

The service system has continuously evolved alongside technological advances.

1.0 Manual Era and 2.0 Script Era

Before 2008, server migrations and releases were performed manually, making operations labor‑intensive.

From 2008 to 2012, extensive use of shell scripts ushered in the script era; Tencent Games accumulated more than 50,000 scripts. These two eras focused on supporting basic release, change, incident, and migration services.

3.0 Automation Tool Era

With standardization, roles became more specialized. The emergence of the Blue Whale platform in 2013 marked the start of the automation tool era, enabling large‑scale business growth, automated operation tools, and rapid team development.

Designing operation services that earn the product team's attention and demonstrate the operations team's technical expertise and benefits involves two steps:

Starting from product operation scenarios.

Integrating technical solutions to deliver measurable value through a unified service window.

Based on eight product scenarios, we defined corresponding services:

Business architecture planning → Architecture Design Service

Version release/change → Version Service

Marketing activities → Marketing Service

Availability assurance → High‑Availability Service

User experience → UX Optimization Service

Cost control → Cost Optimization Service

Security assurance → Security Service

Operational decision support → Data View Service

A set of quality metrics is used to evaluate the value of these services.

Case Study 1: Version Service

Version release is a daily operation task. Operations focus on reducing release duration through standardization and automation, while product teams care about online recovery time, which directly impacts DAU and session length.

Key factors affecting online recovery time include player activity timing and the duration required to distribute update packages, which depends on distribution volume, user growth, deployment automation, and push automation.

Deep Analysis of Factors Influencing Online Recovery Time:

Player gaming time affects version release timing and update package deployment.

Package delivery time is determined by distribution scale, user increment, update cost, deployment automation, and push automation.

Measures taken (see diagram) reduced online recovery time by 90% while halving bandwidth usage, despite a 200% increase in users and package volume since 2013.
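To make these percentages concrete, here is a small worked calculation. The baseline figures (a 60-minute recovery time, 100 units of bandwidth, one million users) are invented placeholders, not Tencent data; only the ratios come from the text. It also assumes that "halving" refers to total bandwidth. Note that a 200% increase means the new value is three times the baseline.

```python
# Hypothetical baseline values, chosen only to illustrate the ratios above.
baseline_recovery_min = 60.0   # assumed, not from the article
baseline_bandwidth = 100.0     # arbitrary units, assumed
baseline_users = 1_000_000     # assumed

recovery_after = baseline_recovery_min * (1 - 0.90)  # 90% reduction
bandwidth_after = baseline_bandwidth * 0.5           # halved (assumed: total bandwidth)
users_after = baseline_users * (1 + 2.00)            # 200% increase = 3x baseline

# Bandwidth consumed per user, before and after.
per_user_before = baseline_bandwidth / baseline_users
per_user_after = bandwidth_after / users_after
per_user_ratio = per_user_after / per_user_before    # 0.5 / 3

print(f"recovery time: {recovery_after:.0f} min")
print(f"users: {users_after:,.0f}")
print(f"per-user bandwidth ratio: {per_user_ratio:.3f}")
```

Under these assumptions, each user ends up consuming roughly one sixth of the bandwidth they did before, which is a stronger result than the headline "halved" suggests.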

The version service logic framework includes deployment, gray‑release control, unpacking, user data analysis, push timing, and automated push, all supported by the Blue Whale PaaS platform and its data channel and scheduled task system.
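The article does not describe how gray-release control is implemented on Blue Whale. As an illustration of the general idea only, the following sketch uses a common technique: hashing a user ID into a stable bucket and admitting users whose bucket falls below the rollout percentage. The function name, the version string, and the user IDs are all hypothetical.

```python
import hashlib

def in_gray_release(user_id: str, version: str, percent: int) -> bool:
    """Return True if this user falls inside the rollout percentage.

    Hashing user_id together with the version gives each user a stable
    bucket in [0, 100) that is reshuffled for every new version.
    This is a generic technique, not the Blue Whale implementation.
    """
    digest = hashlib.sha256(f"{version}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# Hypothetical player IDs and version label.
users = [f"player-{i}" for i in range(10_000)]
selected = [u for u in users if in_gray_release(u, "v1.2.0", 10)]
print(f"{len(selected)} of {len(users)} users selected for the 10% rollout")
```

Because the bucket is deterministic, raising the percentage from 10 to 50 keeps every user who was already in the rollout and only adds new ones, so nobody flips back to the old version mid-release.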

While the three service attributes (user focus, value creation, billability) are achieved, several challenges remain:

Inter‑service dependencies make it hard to reuse capabilities.

Reliance on individual operation engineers for data acquisition.

Product teams make complex demands driven by fine-grained user needs.

Cost control trade‑offs: reduced operational cost but increased manpower for diverse game titles.

Case Study 2: Download Service

Applying microservice principles, we decomposed the download service into seven independent sub‑services, later expanding to eight, enabling flexible composition, independent evolution, and precise pricing.
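A minimal sketch of what "flexible composition" and "precise pricing" could look like in code. The article does not name the sub-services or give prices, so the catalog entries, unit prices, and function names below are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubService:
    name: str
    unit_price: float  # price per billing unit; all values here are invented

# Hypothetical catalog: the article does not name the sub-services.
CATALOG = {
    s.name: s
    for s in [
        SubService("source_distribution", 3.0),
        SubService("cdn_scheduling", 5.0),
        SubService("download_analytics", 2.0),
    ]
}

def compose(names: list[str]) -> list[SubService]:
    """Pick sub-services by name; unknown names raise KeyError."""
    return [CATALOG[n] for n in names]

def quote(selection: list[SubService], units: int) -> float:
    """Price a composed offering as the sum of its parts."""
    return sum(s.unit_price for s in selection) * units

offering = compose(["cdn_scheduling", "download_analytics"])
print(quote(offering, units=10))
```

Because each sub-service carries its own price, a product team that needs only two of them is billed for exactly those two, and a new sub-service can be added to the catalog without touching the others.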

After adopting microservices, download completion rates improved by over 20%, download time decreased by more than 60%, and overall conversion rates rose by 10% without additional CDN costs, while enabling per‑user, regional, and IP‑level tracking.
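As a rough illustration of per-user, regional, and IP-level tracking, the sketch below computes a download completion rate grouped by any one of those fields. The event layout and the sample records are hypothetical; the article does not describe the actual data model.

```python
from collections import defaultdict

# Hypothetical download events: (user_id, region, client_ip, completed).
events = [
    ("u1", "south", "10.0.0.1", True),
    ("u2", "south", "10.0.0.2", False),
    ("u3", "north", "10.0.1.1", True),
    ("u4", "north", "10.0.1.2", True),
]

def completion_rate_by(events, key_index):
    """Group events by the chosen field and return completed / started."""
    started = defaultdict(int)
    completed = defaultdict(int)
    for event in events:
        key = event[key_index]
        started[key] += 1
        if event[3]:
            completed[key] += 1
    return {k: completed[k] / started[k] for k in started}

by_region = completion_rate_by(events, key_index=1)
print(by_region)
```

Passing `key_index=0` or `key_index=2` gives the same metric per user or per IP address, which is the kind of drill-down the text refers to.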

Advanced Game Operations Service System

The service system is reorganized into six major categories—Version, Operational Activity, Cost, User Experience, Operations Consulting, and Security—each further divided into sub‑services that can be combined on demand.

Service‑oriented and product‑oriented approaches reflect the professional value of the operations team, continuously supporting business growth through the internal “cloud ladder” framework.

Cloud adoption, Blue Whale’s rapid growth, and integration of big data, Docker, and other technologies are driving the industry toward a 4.0, intelligent operations era.

Intelligent operations require six capabilities—perception, analysis, decision, execution, presentation, and protection—supported by cultural and talent transformation, paving the way for the next generation of operation services.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
