
How Tencent’s Multi‑Dimensional Monitoring Turns Big Data Into Real‑Time Business Insights

This article explains how Tencent's ZhiYun multi-dimensional monitoring system evolved from the Mobile Monitor platform. It outlines the system's design principles, data-factory capabilities, storage choices, and intelligent features, and shows how it enables real-time, multi-dimensional analysis and alerting for large-scale business operations.

Efficient Ops

Background

In recent years, big data technologies have matured to the point where data can be collected, processed, and stored reliably. Common big-data applications include recommendation, BI reporting, profiling, log search, and machine learning. Real-time monitoring is another critical scenario, one that depends on fast statistics and timely anomaly alerts.

From Mobile Monitor to Multi‑Dimensional Monitoring

The original Mobile Monitor (MM) system collected dimensions such as region, carrier, version, command, SET, and APN, along with metrics such as request count, success rate, and latency, and used Storm for real-time analysis and alerting. MM had several limitations (a single data source, fixed processing logic, an outdated stack, and poor scalability), which prompted its reconstruction as the Hubble platform, now called ZhiYun Multi-Dimensional Monitoring.

Design Principles

Backward compatibility with existing functions.

Component‑based, reusable real‑time processing.

Low‑code configuration: users can build Storm topologies via UI.

Optimized architecture for accuracy and low latency.

Improved user experience with unified UI and helpful error messages.

Data Factory

Common big‑data operations are classified and exposed as UI configuration items:

Filtering: ensure data completeness, remove redundant records.

Formatting: time conversion, type conversion, URL encode/decode.

Translation: IP lookup, dictionary mapping, delimiter split, arithmetic, UDF.

Forwarding: send to SNG DC, CDB, Kafka.

Grouping: define window, time field, group fields.

Aggregation (with optional filtering): count, distinct count, min, max, first, last, sum, average.

These functions generate a Storm topology that performs initial aggregation (e.g., a 1‑minute sliding window) and stores results in an OLAP engine.
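The chain of operations above can be illustrated with a small, framework-free sketch. This is not the Storm topology that ZhiYun generates. It is plain Python that mimics the filter, format, translate, group, and aggregate steps on a handful of records. All field names, the sample data, and the `REGION_BY_IP` table are hypothetical, and for simplicity the sketch uses fixed one-minute buckets rather than the sliding window mentioned in the text.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical raw records; every field name here is an illustrative assumption.
RAW = [
    {"ts": "2024-01-01 10:00:05", "ip": "10.0.0.1", "cmd": "login", "code": "0", "latency_ms": "120"},
    {"ts": "2024-01-01 10:00:40", "ip": "10.0.0.2", "cmd": "login", "code": "500", "latency_ms": "900"},
    {"ts": "2024-01-01 10:01:10", "ip": "10.0.0.1", "cmd": "pay", "code": "0", "latency_ms": "80"},
    {"ts": "2024-01-01 10:01:15", "ip": "10.0.0.1", "cmd": "", "code": "0", "latency_ms": "70"},  # incomplete
]

# Stand-in for an IP lookup service (assumption).
REGION_BY_IP = {"10.0.0.1": "south", "10.0.0.2": "north"}

REQUIRED = ("ts", "ip", "cmd", "code", "latency_ms")


def is_complete(rec):
    """Filtering: keep only records where every required field is non-empty."""
    return all(rec.get(k) for k in REQUIRED)


def fmt(rec):
    """Formatting: convert the time string to epoch seconds and cast numeric fields."""
    t = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return {**rec, "epoch": int(t.timestamp()),
            "code": int(rec["code"]), "latency_ms": int(rec["latency_ms"])}


def translate(rec):
    """Translation: derive a region dimension from the client IP."""
    return {**rec, "region": REGION_BY_IP.get(rec["ip"], "unknown")}


def aggregate(records, window_s=60, group_fields=("region", "cmd")):
    """Grouping and aggregation: count, success count, latency sum and max per bucket."""
    buckets = defaultdict(lambda: {"count": 0, "ok": 0, "latency_sum": 0, "latency_max": 0})
    for r in records:
        window_start = r["epoch"] - r["epoch"] % window_s
        key = (window_start,) + tuple(r[f] for f in group_fields)
        b = buckets[key]
        b["count"] += 1
        b["ok"] += 1 if r["code"] == 0 else 0
        b["latency_sum"] += r["latency_ms"]
        b["latency_max"] = max(b["latency_max"], r["latency_ms"])
    return dict(buckets)


def run_pipeline(raw):
    cleaned = (translate(fmt(r)) for r in raw if is_complete(r))
    return aggregate(cleaned)


result = run_pipeline(RAW)
```

Each key in `result` is a `(window_start, region, cmd)` tuple, which is roughly the shape of pre-aggregated row that would then be written to the OLAP engine.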

Storage Engine

ZhiYun primarily uses Druid, a column-oriented real-time analytics (OLAP) store built for fast multi-dimensional aggregation over time-stamped data. Smaller datasets may be stored in PostgreSQL or MySQL, and data that needs full-text search goes to Elasticsearch.
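To give a feel for what a multi-dimensional query against Druid looks like, the sketch below builds a native `groupBy` query as a Python dictionary. The query structure (`queryType`, `aggregations`, `postAggregations`, `selector` and `and` filters) follows Druid's native query format, but the datasource name `mm_metrics` and the column names `count`, `ok`, `region`, and `cmd` are assumptions made for illustration. The article does not describe ZhiYun's actual schema or queries.

```python
import json


def build_groupby_query(datasource, interval, dimensions, filters=None):
    """Build a Druid native groupBy query that returns per-minute request
    counts and success rate, split by the given dimensions."""
    query = {
        "queryType": "groupBy",
        "dataSource": datasource,
        "intervals": [interval],
        "granularity": "minute",
        "dimensions": list(dimensions),
        "aggregations": [
            # Assumes the pre-aggregated rows carry "count" and "ok" columns.
            {"type": "longSum", "name": "requests", "fieldName": "count"},
            {"type": "longSum", "name": "ok", "fieldName": "ok"},
        ],
        "postAggregations": [{
            "type": "arithmetic", "name": "success_rate", "fn": "/",
            "fields": [
                {"type": "fieldAccess", "fieldName": "ok"},
                {"type": "fieldAccess", "fieldName": "requests"},
            ],
        }],
    }
    selectors = [{"type": "selector", "dimension": d, "value": v}
                 for d, v in (filters or {}).items()]
    if len(selectors) == 1:
        query["filter"] = selectors[0]
    elif selectors:
        query["filter"] = {"type": "and", "fields": selectors}
    return query


query = build_groupby_query(
    "mm_metrics",                                   # hypothetical datasource
    "2024-01-01T10:00:00Z/2024-01-01T11:00:00Z",
    dimensions=["region"],
    filters={"cmd": "login"},
)
payload = json.dumps(query, indent=2)  # body that would be POSTed to a Druid broker
```

Because the rows were already aggregated per minute upstream, the query only needs to sum the stored counters, which is part of what keeps such queries fast.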

Application Ecosystem

Processed data flows from the Data Factory into the monitoring system and downstream applications, enabling multi‑dimensional drill‑down analysis and alerting.

Multi‑Dimensional Drill‑Down Analysis

The analysis UI consists of a business tree, dimension filters, metric trend chart, and data panel. Users can drill down by selecting abnormal time points, sorting by request count, and isolating problematic dimension combinations (e.g., specific AppID, command, return code) to pinpoint faults.
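The drill-down workflow can be sketched as a repeated "fix some dimensions, split by the next one" operation. The example below is a simplified illustration, not ZhiYun's implementation. The rows, the `appid`/`cmd`/`code` field names, and the convention that return code `0` means success are all assumptions.

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows for one abnormal time point.
ROWS = [
    {"appid": "A", "cmd": "login", "code": 0, "requests": 900},
    {"appid": "A", "cmd": "login", "code": 500, "requests": 100},
    {"appid": "A", "cmd": "pay", "code": 0, "requests": 500},
    {"appid": "B", "cmd": "login", "code": 0, "requests": 400},
    {"appid": "B", "cmd": "pay", "code": 0, "requests": 300},
]


def drill_down(rows, dimension, fixed=None, ok_code=0):
    """Split the rows matching `fixed` by `dimension`, sorted by request count."""
    fixed = fixed or {}
    totals = defaultdict(lambda: {"requests": 0, "failures": 0})
    for row in rows:
        if any(row[k] != v for k, v in fixed.items()):
            continue
        t = totals[row[dimension]]
        t["requests"] += row["requests"]
        if row["code"] != ok_code:
            t["failures"] += row["requests"]
    result = [
        {"value": v, **t, "success_rate": 1 - t["failures"] / t["requests"]}
        for v, t in totals.items() if t["requests"] > 0
    ]
    return sorted(result, key=lambda r: r["requests"], reverse=True)


by_app = drill_down(ROWS, "appid")
by_cmd = drill_down(ROWS, "cmd", fixed={"appid": "A"})
by_code = drill_down(ROWS, "code", fixed={"appid": "A", "cmd": "login"})
```

Starting from `by_app`, an operator would pick the AppID with failures, fix it, split by command, and then by return code, narrowing the fault to one dimension combination.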

Multi‑Dimensional Alerting

Beyond visual analysis, the system supports configurable alert rules that trigger notifications when complex multi‑dimensional conditions are met. Users can also set subscription, convergence, and suppression rules to streamline incident response.
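A minimal sketch of how a multi-dimensional rule with convergence and suppression might be evaluated is shown below. The class names, fields, and the specific policy (alert when a metric falls below a threshold, at most one notification per rule within a convergence window) are assumptions chosen for illustration. The article does not specify ZhiYun's rule format, and subscription routing is left out.

```python
from dataclasses import dataclass, field


@dataclass
class AlertRule:
    name: str
    match: dict             # dimension values the rule applies to
    metric: str
    threshold: float        # fire when the metric drops below this value
    min_requests: int = 0   # ignore low-traffic points to reduce noise

    def fires(self, point):
        in_scope = all(point.get(k) == v for k, v in self.match.items())
        return (in_scope
                and point["requests"] >= self.min_requests
                and point[self.metric] < self.threshold)


@dataclass
class Notifier:
    converge_s: int = 300                            # one notification per rule per window
    suppressed: list = field(default_factory=list)   # dimension dicts to mute
    _last_sent: dict = field(default_factory=dict)

    def notify(self, rule, point):
        # Suppression: drop alerts for points matching any muted dimension set.
        if any(all(point.get(k) == v for k, v in s.items()) for s in self.suppressed):
            return "suppressed"
        # Convergence: collapse repeats of the same rule within the window.
        last = self._last_sent.get(rule.name)
        if last is not None and point["ts"] - last < self.converge_s:
            return "converged"
        self._last_sent[rule.name] = point["ts"]
        return "sent"


def evaluate(rules, notifier, point):
    """Return (rule name, outcome) for every rule that fires on this data point."""
    return [(r.name, notifier.notify(r, point)) for r in rules if r.fires(point)]
```

For example, a rule scoped to `{"cmd": "login", "region": "south"}` on `success_rate` with a threshold of `0.95` would notify once when the rate first drops, stay quiet for repeats inside the convergence window, and notify again if the problem persists past it.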

Intelligent Features

Machine‑learning models provide root‑cause recommendation and threshold‑free anomaly detection by learning from historical data.
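The article does not say which models ZhiYun uses. As a simple stand-in for the idea of learning a baseline from historical data instead of asking users for a per-metric threshold, the sketch below scores a new value against past values with a robust z-score based on the median and the median absolute deviation (MAD). This is a basic statistical technique rather than a trained machine-learning model, and the function names and the sensitivity constant `k` are assumptions.

```python
from statistics import median


def robust_score(history, value):
    """Distance of `value` from the historical median, in MAD-based units."""
    med = median(history)
    mad = median([abs(x - med) for x in history])
    if mad == 0:
        # History is (almost) constant: any deviation is infinitely unusual.
        return 0.0 if value == med else float("inf")
    # 1.4826 rescales MAD so it is comparable to a standard deviation
    # when the data is roughly normal.
    return (value - med) / (1.4826 * mad)


def is_anomaly(history, value, k=3.5):
    """Flag values that sit more than k robust deviations from the baseline."""
    return abs(robust_score(history, value)) > k


# Hypothetical latency samples (ms) for the same minute on previous days.
latency_history = [100, 102, 98, 101, 99, 103, 97]
```

With this history, a new reading of 101 ms stays within the learned baseline, while 150 ms or 60 ms would be flagged, without anyone having configured a latency threshold for this metric.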

Current Status

More than 200 internal Tencent services and over a thousand servers are already using ZhiYun Multi‑Dimensional Monitoring.
