
Design and Evolution of Ctrip Flight Search System: High‑Throughput Caching, Real‑Time Computing, Load Balancing and AI

Ctrip’s flight search service processes two billion daily queries by employing a multi‑level Redis cache, machine‑learning‑driven TTLs, distributed pooling and overload protection, AI‑based anti‑scraping, and robust load‑balancing across three data centers, delivering sub‑second latency, up to three‑fold throughput gains and significant cost reductions.

In this talk, Song Tao, Technical Director of Ctrip's flight business, shares the architecture and performance‑optimization techniques of Ctrip's flight search service, which handles 2 billion queries per day with strict low‑latency and high‑throughput requirements.

Business characteristics: The service must support massive traffic, sub-second response times, a high success rate, multi-engine aggregation (Ctrip's own pricing engine plus external GDS/SLA sources), compute- and I/O-intensive workloads, and diverse user scenarios (e.g., student discounts, regional preferences). Approximately 9% of traffic comes from crawlers, and 28% comes from international users.

Infrastructure: Ctrip operates three independent data centers with disaster-recovery capability. The stack is built on Spring Cloud, Kubernetes, and public cloud services, complemented by an open-source DevOps toolchain. Storage includes MySQL, Redis, and MongoDB. Reliability is reinforced through extensive SRE practices such as circuit breaking and rate limiting.

System architecture: A gateway routes requests to an aggregation service, which fans out to multiple engine services. Distributed caching (Redis) is used heavily, and aggregation results are streamed via Kafka to an AI data platform for analytics and traffic replay. A data-filtering layer in the cloud cuts inbound traffic by about 90% before it reaches downstream services.
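The fan-out step can be sketched as below. This is a minimal illustration, not Ctrip's implementation: the engine callables, the 0.8 s budget, and the merge-by-price step are all assumptions standing in for the real aggregation service.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical engine callables standing in for the real engine services.
def own_engine(query):
    return [{"engine": "own", "price": 100, "query": query}]

def gds_engine(query):
    return [{"engine": "gds", "price": 105, "query": query}]

def aggregate(query, engines, timeout=0.8):
    """Fan a search query out to all engines concurrently and merge
    whatever answers arrive within the timeout budget (sub-second SLA)."""
    results = []
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, query) for engine in engines]
        for fut in as_completed(futures, timeout=timeout):
            results.extend(fut.result())
    # Cheapest fares first, as a pricing aggregator typically returns them.
    return sorted(results, key=lambda fare: fare["price"])

fares = aggregate("SHA-PEK", [own_engine, gds_engine])
```

In a production system the timeout would be enforced per engine and slow responders dropped rather than awaited, but the shape of the fan-out is the same.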

Cache evolution: The system evolved from local caches to a multi-level distributed cache. L1 (Redis) stores final results; L2 (originally MongoDB, later migrated to Redis) stores intermediate engine results. Multi-level caching reduces database load, shields external partners from excess traffic, and improves latency (read latency < 3 ms). TTLs are adjusted dynamically with machine-learning models to balance freshness against hit rate (typically < 5 min, sometimes only seconds).
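A read-through sketch of the two-level scheme, assuming plain dicts in place of the two Redis tiers: an L1 hit skips all work, while an L2 hit skips only the expensive engine calls and re-aggregates from the cached intermediate results.

```python
import time

class TwoLevelCache:
    """Sketch of the L1/L2 scheme: L1 holds final aggregated results,
    L2 holds intermediate per-engine results. Dicts stand in for Redis;
    each value is a (expiry_timestamp, payload) pair."""

    def __init__(self):
        self.l1 = {}  # final results, short TTL
        self.l2 = {}  # intermediate engine results, longer TTL

    def _get(self, store, key):
        entry = store.get(key)
        if entry and entry[0] > time.time():
            return entry[1]
        store.pop(key, None)  # drop expired entries lazily
        return None

    def get_result(self, key, compute_final, compute_intermediate,
                   l1_ttl=60, l2_ttl=300):
        final = self._get(self.l1, key)
        if final is not None:
            return final  # L1 hit: no recomputation at all
        inter = self._get(self.l2, key)
        if inter is None:
            inter = compute_intermediate()  # miss on both levels
            self.l2[key] = (time.time() + l2_ttl, inter)
        final = compute_final(inter)  # re-aggregate from intermediates
        self.l1[key] = (time.time() + l1_ttl, final)
        return final
```

The TTL arguments here are fixed for illustration; per the talk, the real system feeds them from an ML model instead.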

Pooling and overload protection: A custom pooling mechanism uses Redis as a distributed queue to schedule long-running sub-tasks, preventing thread blockage and reducing tail latency. Overload protection discards requests whose queue wait exceeds a configurable threshold, avoiding avalanche effects.
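The discard rule can be shown with an in-process queue (the talk uses Redis for the distributed version; the `deque` and the 0.5 s threshold here are illustrative assumptions): each task carries its enqueue time, and workers drop anything that has already waited too long instead of processing stale work.

```python
import time
from collections import deque

class OverloadGuard:
    """Sketch of wait-time-based load shedding: tasks that sat in the
    queue longer than max_wait are discarded, not executed, so a burst
    cannot snowball into an avalanche of stale work."""

    def __init__(self, max_wait=0.5):
        self.max_wait = max_wait
        self.queue = deque()  # entries are (enqueue_time, task)

    def submit(self, task):
        self.queue.append((time.time(), task))

    def drain(self):
        done, dropped = [], 0
        while self.queue:
            enqueued_at, task = self.queue.popleft()
            if time.time() - enqueued_at > self.max_wait:
                dropped += 1  # too stale: shed the load
                continue
            done.append(task())
        return done, dropped
```

Dropping by wait time rather than by queue length is what makes this an avalanche guard: once the backlog grows past the latency budget, the excess is rejected instead of served late.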

AI applications: Three AI-driven use cases are highlighted: intelligent anti-scraping (blocking the ~9% of crawler traffic), query filtering that routes only high-value requests to expensive engines, and ML-based TTL prediction. Together these techniques filter > 80% of requests during peak traffic, saving roughly 80% of resource cost while keeping order conversion stable.
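To make the TTL-prediction idea concrete, here is a toy heuristic in place of the ML model; the features (search rate, fare-change rate) and every threshold are illustrative assumptions, not details from the talk. Volatile fares get TTLs down toward seconds; stable ones get up to the five-minute ceiling.

```python
def predict_ttl(searches_per_min, price_changes_per_hour,
                min_ttl=5, max_ttl=300):
    """Toy stand-in for the ML-based TTL predictor: the more often a
    fare changes, the shorter its cache lifetime, clamped to
    [min_ttl, max_ttl] seconds."""
    if price_changes_per_hour == 0:
        return max_ttl  # fare looks stable: cache as long as allowed
    # Aim for roughly one refresh per observed price change.
    ttl = 3600 / price_changes_per_hour
    if searches_per_min > 100:
        ttl /= 2  # hot routes are refreshed more aggressively
    return int(min(max_ttl, max(min_ttl, ttl)))
```

A real model would be trained on observed staleness versus hit rate, but the interface is the same: features in, per-key TTL out.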

Performance impact: The multi-level cache raised overall throughput by up to 3×, improved hit rate by 27%, and reduced average engine latency by 20%. Migrating the secondary cache from MongoDB to Redis cut its cost by 90% and improved read/write performance by 30%.

Q&A highlights: The audience asked about cache usage scenarios, the iteration from L1 to L2, distributed-cache key design, Redis latency, queue implementation, cache consistency, hot-key handling, pooling details, monitoring (ClickHouse, Grafana, Prometheus), and how caching interacts with pricing and user-specific results.

Conclusion: By combining layered caching, robust load balancing, dynamic pooling, and targeted AI models, Ctrip's flight search service achieves high availability, low tail latency, and cost-effective scaling under extreme traffic conditions.

Tags: distributed systems, AI, load balancing, caching, real-time computing, flight search, high-throughput
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
