Design and Implementation of Qunar Flight Ticket Intelligent Alert (Radar) System
This article describes the design and implementation of Qunar's intelligent flight-ticket alerting (Radar) system. It covers the business need, value analysis, architectural redesign, feature extraction, metric classification, accuracy quantification, multi-algorithm anomaly detection, automatic parameter tuning, observed effects, and future plans to incorporate large-model techniques.
1. Introduction
The article outlines the motivation for building an intelligent pre‑warning system for Qunar's flight‑ticket business, describing the challenges of massive metric volumes, the low detection rate of manual alerts, and the need for a sustainable, high‑accuracy monitoring solution.
2. Value Analysis
2.1 Background
The ticketing platform tracks hundreds of thousands of business metrics and millions of system metrics, far too many for manual detection to be feasible.
2.1.1 Fault Detection Methods
Before the Radar system, alerting relied on manually configured thresholds. The reported alarm loss rate was 50%, and only 38% of incidents were caught by alerts.
2.2 Goals of the Radar System
The new system aims to cover >50,000 core ticket metrics and all core app‑code error metrics, achieve >75% detection accuracy, reduce human effort, and enable sustainable operations.
3. Radar System Analysis
3.1 Difficulty Analysis
Existing models suffered from limited feature extraction, coarse metric classification, and reliance on a single 3‑sigma rule, leading to poor accuracy.
3.2 Architecture
The redesigned Radar model consists of five core modules (data collection, feature extraction, metric classification, anomaly detection, and alert triggering) plus an accuracy-validation module.
3.2.1 Feature Extraction Module
Extensive statistical features (max, min, mean, variance, period, etc.) are computed over a 7‑day window to better characterize metric behavior.
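As a rough illustration of the features named above, the following sketch computes max, min, mean, variance, and an estimated period for one metric window. The function names, the autocorrelation-based period estimate, and the representation of the 7-day window as a plain list of samples are assumptions for illustration; the article does not say how Radar computes these values.

```python
from statistics import mean, pvariance


def estimate_period(values, min_lag=2, max_lag=None):
    """Return the lag with the highest positive autocorrelation, or None.

    Hypothetical helper: the source does not describe its period detection.
    """
    n = len(values)
    if max_lag is None:
        max_lag = n // 2
    mu = mean(values)
    centered = [v - mu for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0 or max_lag < min_lag:
        return None  # flat or too-short series has no measurable period
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / denom
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def extract_features(values):
    """Summarize one metric window (for example, 7 days of samples)."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        "variance": pvariance(values),
        "period": estimate_period(values),
    }


if __name__ == "__main__":
    window = [0, 1, 0, -1] * 6  # toy series that repeats every 4 samples
    print(extract_features(window))
```

In practice the window would hold the persisted samples for a single metric, and the resulting dictionary would feed the classification step.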
3.2.2 Metric Classification
Metrics are classified by waveform type (stable, periodic, discrete, jitter) and business type (error, rate_fail, success, rate_success, count), enabling targeted detection strategies.
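A minimal sketch of how such a two-axis classification might look is shown here. The waveform and business type names come from the text, but the decision rules, the cutoff values, and the alert-direction mapping are assumptions chosen for illustration, not Radar's actual logic.

```python
from statistics import mean, pstdev

# Assumed mapping: which direction of change matters for each business type.
ALERT_DIRECTION = {
    "error": "rise",
    "rate_fail": "rise",
    "success": "fall",
    "rate_success": "fall",
    "count": "both",
}


def classify_waveform(values, period=None, zero_ratio_cutoff=0.5, cv_cutoff=0.1):
    """Assign one of: discrete, periodic, stable, jitter.

    `period` is a precomputed feature (None when no period was found).
    The cutoffs are hypothetical defaults.
    """
    zero_ratio = sum(1 for v in values if v == 0) / len(values)
    if zero_ratio >= zero_ratio_cutoff:
        return "discrete"  # mostly empty series with occasional points
    if period is not None:
        return "periodic"
    mu = mean(values)
    # Coefficient of variation: spread relative to the average level.
    cv = pstdev(values) / abs(mu) if mu != 0 else float("inf")
    return "stable" if cv <= cv_cutoff else "jitter"


if __name__ == "__main__":
    print(classify_waveform([0, 0, 0, 5, 0, 0, 0, 3]))
    print(classify_waveform([100, 101, 99, 100, 100, 101]))
    print(ALERT_DIRECTION["rate_fail"])
```

Separating the waveform decision from the business type keeps each rule small: the waveform picks the detection algorithm, while the business type decides which direction of deviation is worth alerting on.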
3.2.3 Accuracy Quantification Model
A two‑part validation framework selects representative fault and IVR‑alert test sets and verifies model performance against persisted data.
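One way to picture this validation step is a replay harness that runs a detector over persisted, labeled series and counts hits and misses. The sketch below assumes each test case carries a boolean fault label and reports accuracy, precision, and recall; the `ReplayCase` type and these metric definitions are illustrative assumptions, since the article does not define how its accuracy figure is calculated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class ReplayCase:
    """One persisted series plus its label (hypothetical structure)."""
    metric: str
    values: Sequence[float]
    is_fault: bool


def evaluate(detector: Callable[[Sequence[float]], bool],
             cases: Sequence[ReplayCase]) -> Dict[str, float]:
    """Replay every case through the detector and score the outcome."""
    tp = fp = fn = tn = 0
    for case in cases:
        fired = detector(case.values)
        if fired and case.is_fault:
            tp += 1
        elif fired:
            fp += 1
        elif case.is_fault:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


if __name__ == "__main__":
    cases = [
        ReplayCase("orders.error", [10, 500], True),
        ReplayCase("orders.count", [10, 12], False),
    ]
    print(evaluate(lambda v: max(v) > 100, cases))
```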
3.2.4 Anomaly Detection Algorithms
Multiple algorithms are applied based on waveform type:
Continuous waveforms: BoxPlot, KDE, 3‑Sigma, Z‑score, LOF, Isolation Forest, etc.
Discrete waveforms: Density‑based anomaly detection using densityStd and densityAvg thresholds.
Sharp rise/fall detection: Trend‑based rules with configurable duration and threshold parameters.
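Two of the continuous-waveform methods and the trend rule listed above can be sketched with the standard library alone. The function names, default multipliers, and the choice to compare the latest points against the mean of the preceding points are assumptions; the density-based detector is not sketched because the text does not define how densityStd and densityAvg are computed.

```python
from statistics import mean, pstdev, quantiles


def three_sigma_outlier(history, value, k=3.0):
    """Flag `value` if it lies more than k standard deviations from the mean."""
    mu, sigma = mean(history), pstdev(history)
    return abs(value - mu) > k * sigma


def boxplot_outlier(history, value, whisker=1.5):
    """Flag `value` if it falls outside the box-plot whiskers (Tukey fences)."""
    q1, _, q3 = quantiles(history, n=4)
    iqr = q3 - q1
    return value < q1 - whisker * iqr or value > q3 + whisker * iqr


def sharp_change(values, duration, threshold):
    """Return "rise", "fall", or None.

    Assumed rule: each of the last `duration` points must deviate from the
    mean of the earlier points by more than `threshold` (a relative ratio),
    all in the same direction.
    """
    if duration <= 0 or len(values) <= duration:
        return None
    baseline = mean(values[:-duration])
    if baseline == 0:
        return None  # relative change is undefined against a zero baseline
    ratios = [(v - baseline) / abs(baseline) for v in values[-duration:]]
    if all(r > threshold for r in ratios):
        return "rise"
    if all(r < -threshold for r in ratios):
        return "fall"
    return None


if __name__ == "__main__":
    history = [10, 11, 9, 10, 10, 11, 9, 10]
    print(three_sigma_outlier(history, 20), boxplot_outlier(history, 20))
    print(sharp_change([100, 100, 100, 100, 40, 35, 30], duration=3, threshold=0.5))
```

Requiring every point in the `duration` window to exceed the threshold is one way to avoid alerting on a single noisy sample; a shorter duration reacts faster at the cost of more false alarms.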
3.2.5 Automatic Parameter Tuning
Candidate parameter ranges are generated automatically and tested against the fault test set. The best-performing configurations are then deployed, which improves detection accuracy for each metric category.
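The tuning loop can be pictured as a small grid search: generate candidate values, score each combination against labeled test cases, and keep the best. The sketch below assumes a k-sigma detector as the tunable algorithm and F1 as the selection score; both choices, along with all function names, are illustrative assumptions rather than details from the article.

```python
from itertools import product
from statistics import mean, pstdev


def generate_range(start, stop, step):
    """Evenly spaced candidate values from start to stop inclusive."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 6) for i in range(count)]


def k_sigma_detector(values, k):
    """Fire when the last point is more than k sigma from the earlier mean."""
    history, latest = values[:-1], values[-1]
    return abs(latest - mean(history)) > k * pstdev(history)


def f1_score(detector, cases):
    """F1 of a detector over (values, is_fault) pairs."""
    tp = fp = fn = 0
    for values, is_fault in cases:
        fired = detector(values)
        if fired and is_fault:
            tp += 1
        elif fired:
            fp += 1
        elif is_fault:
            fn += 1
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune(detector, param_grid, cases):
    """Try every parameter combination; keep the first one with the top score."""
    names = list(param_grid)
    best_params, best_score = None, -1.0
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = f1_score(lambda v: detector(v, **params), cases)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    base = [10, 11, 9, 10, 10, 11, 9, 10]
    cases = [(base + [13], True), (base + [11.5], False), (base + [30], True)]
    grid = {"k": generate_range(1.0, 5.0, 1.0)}
    print(tune(k_sigma_detector, grid, cases))
```

Running the search separately for each metric category would give each category its own configuration, which matches the per-category tuning described above.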
4. Effects and Future Plans
4.1 Effects
During the first quarter of 2024, the Radar system achieved an average accuracy of 87%, discovered ~10 online issues per week, and eliminated new small‑traffic faults.
4.2 Future Plans
The next step is to integrate large‑model AI techniques for deeper metric classification, noise reduction, and alert suppression, further enhancing precision and scalability.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.