Practical Experience of Data Storage in Ctrip Flight Big Data Platform: From Redis/MySQL to CrateDB

This article shares the Ctrip flight big‑data platform’s journey of evaluating and migrating data storage from Hive, MySQL and Redis to CrateDB, covering performance requirements, query patterns, maintenance challenges, containerization, and production results that reduced interface latency and resource consumption.

Ctrip Technology

The author, a data analysis manager at Ctrip, introduces the challenges faced by the flight big‑data platform in storing, querying, and maintaining massive datasets for various customer‑facing services.

Initially, the platform relied on Hive, MySQL, and Redis. As QPS and latency requirements grew, this stack showed its limits (response times above 1 s and throughput below 1 QPS), and the memory‑intensive Redis clusters had grown to tens of terabytes.

A detailed comparison table shows the performance and feature gaps of Hive, MySQL, and Redis against the platform’s needs, highlighting that none could satisfy the 100 ms–500 ms latency window for multi‑keyword and time‑range queries.

After evaluating several NoSQL options, the team selected CrateDB, a distributed SQL database built on Elasticsearch, because it offers sub‑10 ms query speed, SQL support, structured data handling, and a hybrid disk‑plus‑memory storage model.

CrateDB’s SQL layer enables multi‑dimensional operations on a single data source, reducing data duplication and simplifying time‑range queries. Its sharding and partitioning improve resource utilization, and its support for replicas enhances reliability compared to pure in‑memory Redis.
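As a rough illustration of the sharding, partitioning, and replica settings mentioned above, the sketch below builds a CrateDB table definition and a time‑range query as plain SQL strings. The table name flight_prices, its columns, and the shard and replica counts are hypothetical and are not taken from the Ctrip schema; only the general CLUSTERED / PARTITIONED BY / number_of_replicas syntax is CrateDB's own.

```python
def build_flight_price_ddl(shards: int = 6, replicas: int = 1) -> str:
    """Return a CREATE TABLE statement for a hypothetical flight_prices table.

    The schema is an assumption made for illustration only.
    """
    if shards < 1 or replicas < 0:
        raise ValueError("shards must be >= 1 and replicas must be >= 0")
    return (
        "CREATE TABLE IF NOT EXISTS flight_prices (\n"
        "  route TEXT,\n"
        "  airline TEXT,\n"
        "  price DOUBLE PRECISION,\n"
        "  ts TIMESTAMP WITH TIME ZONE,\n"
        # Generated column: one partition per calendar day of ts.
        "  day TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('day', ts)\n"
        f") CLUSTERED BY (route) INTO {shards} SHARDS\n"
        "PARTITIONED BY (day)\n"
        f"WITH (number_of_replicas = {replicas})"
    )


# One table serves a time-range lookup that would otherwise need
# pre-computed keys; the '?' placeholders are bound by the client at run time.
TIME_RANGE_QUERY = (
    "SELECT airline, min(price) AS lowest_price "
    "FROM flight_prices "
    "WHERE route = ? AND ts >= ? AND ts < ? "
    "GROUP BY airline "
    "ORDER BY lowest_price"
)
```

Partitioning by day keeps each partition small and lets old data be dropped partition by partition, while the replica setting provides the redundancy that a single in‑memory copy lacks. Suitable shard and replica counts depend on cluster size and are not given in the article.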

The platform built a generic API system that translates data‑access logic into SQL, allowing configuration‑driven interface development. Data synchronization pipelines were automated using the Zeus platform and Spark, cutting import times for tens of millions of rows to minutes.
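The article does not show how the generic API system is implemented. A minimal sketch of the configuration‑driven idea, assuming each interface is described by a small config object that is turned into a parameterized SQL statement, might look like the following. All names here (Filter, InterfaceConfig, build_query, the flight_prices table) are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

ALLOWED_OPS = {"=", ">", ">=", "<", "<="}


@dataclass(frozen=True)
class Filter:
    param: str      # name of the incoming request parameter
    column: str     # column the parameter is compared against
    op: str = "="   # comparison operator, checked against ALLOWED_OPS


@dataclass(frozen=True)
class InterfaceConfig:
    table: str
    columns: tuple[str, ...]
    filters: tuple[Filter, ...]
    limit: int = 100


def build_query(config: InterfaceConfig, params: dict[str, Any]) -> tuple[str, list[Any]]:
    """Translate an interface config plus request parameters into SQL and bind args.

    Table and column names come from trusted configuration; request values
    are never interpolated and are passed only as '?' bind parameters.
    """
    clauses: list[str] = []
    args: list[Any] = []
    for f in config.filters:
        if f.op not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {f.op}")
        if f.param not in params:
            raise KeyError(f"missing request parameter: {f.param}")
        clauses.append(f"{f.column} {f.op} ?")
        args.append(params[f.param])
    sql = f"SELECT {', '.join(config.columns)} FROM {config.table}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += f" LIMIT {int(config.limit)}"
    return sql, args


# A new interface is then a piece of configuration rather than new code.
LOW_PRICE_INTERFACE = InterfaceConfig(
    table="flight_prices",
    columns=("airline", "price", "ts"),
    filters=(
        Filter("route", "route"),
        Filter("start", "ts", ">="),
        Filter("end", "ts", "<"),
    ),
    limit=50,
)

sql, args = build_query(
    LOW_PRICE_INTERFACE,
    {"route": "SHA-PEK", "start": "2024-01-01", "end": "2024-01-08"},
)
# sql == "SELECT airline, price, ts FROM flight_prices "
#        "WHERE route = ? AND ts >= ? AND ts < ? LIMIT 50"
```

The resulting statement and argument list can be handed to any DB‑API style client that accepts question‑mark placeholders. Under this design, adding an interface means adding a config entry, which is one plausible reading of how rollout could shrink from days to hours.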

To improve operability, CrateDB was containerized on Kubernetes and managed with Rancher, providing automated scaling, monitoring, and efficient resource usage.

Production metrics from two CrateDB clusters (12 VMs each) show that queries on tables of over 1 billion rows achieve 10–200 ms latency at up to 1,500 QPS. Interface rollout time also fell dramatically, from 2–3 days to 2–3 hours.

The article concludes that there is no perfect storage solution; teams should choose the technology that best fits their specific performance, scalability, and maintenance requirements.
