
Rebuilding Ctrip Train Ticket Metrics Platform with StarRocks: Architecture, Data Synchronization, and Performance Gains

The article details how Ctrip's train ticket business revamped its multi‑engine OLAP metrics platform by consolidating to the StarRocks MPP database, describing the new architecture, query workflow, data synchronization strategies, practical lessons, and the resulting dramatic improvement in query latency and reliability.

Ctrip Technology

Author introduction: Kane is a senior data-warehouse manager at Ctrip focusing on data-warehouse construction, data applications, and analysis; Wn is a big-data platform development expert.

The Ctrip Train Ticket Business Group operates multiple ticketing services and needed an efficient metrics platform for quick KPI access.

1. Early OLAP Architecture and Pain Points

The original platform used three storage engines—ClickHouse, Apache Kylin, and Presto—resulting in mixed data layers, inconsistent metric definitions, high maintenance cost, steep learning curve, and poor performance due to a long Java‑based aggregation chain.

Figure 1: Legacy architecture using ClickHouse, Kylin, and Presto.

To address these issues, the team decided to rebuild the platform on a single, fast, and easy‑to‑use distributed database.

2. Overall Design of the Rebuilt Metrics Platform

After evaluating alternatives, StarRocks was chosen for its strong performance, MySQL-compatible protocol, high concurrency support, flexible data models, materialized view capability, and multiple ingestion methods (Stream Load, Spark Load, Routine Load).

Figure 2: StarRocks advantages.

The new architecture consolidates all metrics into a single StarRocks instance. Detailed data are first loaded into temporary tables; subsequent aggregations read from these temporaries, eliminating repeated scans of large tables.
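A minimal sketch of this pattern (table and column names here are illustrative, not taken from the article):

```sql
-- Load the needed detail rows once into a small temporary table
-- (dwd_train_order_detail and its columns are assumed names).
CREATE TABLE tmp_order_detail AS
SELECT order_id, order_date, channel, amount
FROM dwd_train_order_detail
WHERE order_date BETWEEN '2023-03-01' AND '2023-03-07';

-- All subsequent aggregations read the small temporary table,
-- not the large detail table, avoiding repeated full scans.
SELECT channel, SUM(amount) AS gmv, COUNT(*) AS orders
FROM tmp_order_detail
GROUP BY channel;
```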

Figure 3: Rebuilt platform architecture.

2.1 Metric Query Process

When a query is issued, the request is split into multiple sub‑queries based on metric type (detail or aggregate).

Detail‑type Sub‑queries

a. Accumulative metrics generate a temporary table containing current-period, same-period-last-year, and same-period-2019 detail rows; partitions are created one day ahead (T+1) so daily loads do not fail on a missing partition.

b. Non‑accumulative or distinct‑count metrics store only the required detail rows without creating reusable temporary data.

c. When multiple metrics share dimensions, their data are joined into a wide table for storage.

Aggregate‑type Sub‑queries

a. Metric card aggregation and period‑over‑period calculations.

b. Line‑chart generation and dimension drill‑down.
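As a sketch of the period-over-period aggregation, daily totals from the temporary detail table can be joined against themselves shifted by one day (tmp_order_detail and its columns are illustrative names, not from the article):

```sql
-- Day-over-day comparison over pre-aggregated daily totals.
SELECT cur.order_date,
       cur.gmv,
       prev.gmv AS gmv_prev,
       (cur.gmv - prev.gmv) / prev.gmv AS day_over_day
FROM (
    SELECT order_date, SUM(amount) AS gmv
    FROM tmp_order_detail
    GROUP BY order_date
) cur
LEFT JOIN (
    SELECT order_date, SUM(amount) AS gmv
    FROM tmp_order_detail
    GROUP BY order_date
) prev
  ON prev.order_date = DATE_SUB(cur.order_date, INTERVAL 1 DAY);
```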

2.1.1 Caching Mechanism

To avoid re‑scanning massive dimension‑heavy queries, the platform records the first query’s parameters, MD5 of the original SQL, and success flag. Subsequent identical queries reuse the previous result via a unique group key, effectively acting as a query‑result cache.
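A sketch of the cache bookkeeping, assuming a schema along these lines (the article names the fields only conceptually):

```sql
-- Bookkeeping table keyed by a hash of the original SQL plus parameters.
CREATE TABLE metric_query_cache (
    group_key     VARCHAR(64) NOT NULL,   -- MD5 of original SQL + parameters
    query_params  VARCHAR(1024),
    result_table  VARCHAR(128),           -- temporary table holding the result
    is_success    TINYINT,                -- success flag
    created_at    DATETIME
) ENGINE = OLAP
UNIQUE KEY(group_key)
DISTRIBUTED BY HASH(group_key) BUCKETS 8;

-- Before running a query, look up the key; a hit means the earlier
-- result can be reused instead of re-scanning the detail tables.
SELECT result_table
FROM metric_query_cache
WHERE group_key = MD5(CONCAT('<original sql>', '<params>'))
  AND is_success = 1;
```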

2.2 Data Synchronization

The legacy platform’s 300+ metric SQLs were analyzed to extract 51 common DWD and DIM tables, which are now synchronized to StarRocks. Different synchronization strategies are applied:

a. Full sync for small tables (e.g., shareout_trn.dim_ibu_alliance, 608 KB).

b. Incremental partition sync for daily T‑1 partitions.

c. Update sync for tables with frequent status changes, using hash comparison to detect differences.

d. Monthly partition sync for low‑traffic tables to reduce bucket overhead.

A DataX pre-SQL hook creates the next day's partition ahead of the load (StarRocks range partitions are half-open, so the upper bound is exclusive):

-tarpresqls "ALTER TABLE ${table} ADD PARTITION IF NOT EXISTS p${partition} VALUES [('${zdt.addDay(1).format("yyyy-MM-dd")}'),('${zdt.addDay(2).format("yyyy-MM-dd")}'));" \

Model key setting example:

UNIQUE KEY(`order_id`)
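In context, a Unique Key table for status-changing orders might look like the following sketch (all columns other than the key are illustrative assumptions):

```sql
CREATE TABLE dwd_order_status (
    order_id     BIGINT NOT NULL,
    order_status VARCHAR(32),
    hash_code    VARCHAR(64),   -- row hash used for change detection
    update_time  DATETIME
) ENGINE = OLAP
UNIQUE KEY(`order_id`)
DISTRIBUTED BY HASH(`order_id`) BUCKETS 16;
-- Re-importing a row with the same order_id overwrites the previous
-- version, so replaying only the changed rows keeps the table current.
```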

Finding rows with differing hashes:

select t1.*
from (select … where d = '${cur_day}') as t1
left join (select … where d = '${pre_day}') as t2
  on t1.business_pk_id = t2.business_pk_id
where t1.hash_code != t2.hash_code
   or t2.order_id is null

Partition configuration and DataX commands:

startdate='${zdt.format("yyyy-MM-01")}'
endDate='${zdt.add(2,1).format("yyyy-MM-01")}'

PARTITION BY RANGE(dt) (START ("2019-01-01") END ("2023-03-01") EVERY (INTERVAL 1 MONTH))
DISTRIBUTED BY HASH(partition_field) BUCKETS N
PROPERTIES (
    "dynamic_partition.enable" = "true",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.time_unit" = "month",
    "dynamic_partition.end" = "1"
);

For UBT‑type data (large‑scale PV/UV/duration), three summary tables are created to keep daily aggregates small (tens of KB).
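One of these summary tables could use StarRocks' Aggregate Key model so PV and duration are pre-summed at load time, a sketch assuming an illustrative schema (the article does not list the real ones):

```sql
CREATE TABLE ads_ubt_page_daily (
    dt       DATE NOT NULL,
    page_id  VARCHAR(64),
    pv       BIGINT SUM DEFAULT "0",   -- rows with the same key are summed
    duration BIGINT SUM DEFAULT "0"
) ENGINE = OLAP
AGGREGATE KEY(dt, page_id)
DISTRIBUTED BY HASH(page_id) BUCKETS 8;
```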

Monitoring jobs alert when daily data volume deviates beyond 3σ, ensuring sync completes within 15 minutes.
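A sketch of the 3σ check, comparing today's row count against the mean and standard deviation of the trailing 30 days (table name is an assumption):

```sql
SELECT t.dt, t.cnt
FROM (SELECT dt, COUNT(*) AS cnt
      FROM dwd_train_order_detail
      WHERE dt = CURRENT_DATE()
      GROUP BY dt) t
JOIN (SELECT AVG(cnt) AS mu, STDDEV(cnt) AS sigma
      FROM (SELECT dt, COUNT(*) AS cnt
            FROM dwd_train_order_detail
            WHERE dt >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
              AND dt < CURRENT_DATE()
            GROUP BY dt) d) s
WHERE t.cnt > s.mu + 3 * s.sigma
   OR t.cnt < s.mu - 3 * s.sigma;   -- any row returned triggers an alert
```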

3. StarRocks Usage Experience

3.1 Table Design – Avoid excessive bucket numbers; aim for 500 MB–1 GB per bucket. Use monthly partitions for cold data and daily partitions for hot data.

3.2 Query Optimization – Leverage StarRocks indexes; avoid functions on filter columns (e.g., replace upper(col) with direct equality or IN ).
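A before/after sketch of that rewrite (table and values are illustrative; the IN list assumes the stored variants are known in advance):

```sql
-- A function on the filter column defeats StarRocks' prefix/zone-map indexes:
SELECT * FROM dwd_order WHERE upper(channel) = 'APP';   -- forces a full scan

-- Filtering on literal values lets the indexes prune data:
SELECT * FROM dwd_order WHERE channel IN ('app', 'App', 'APP');
```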

3.3 Function Differences – Array subscripts on split results are 1-based in StarRocks, unlike SparkSQL's 0-based indexing, which can cause silent errors when porting SQL.

select split('a,b,c',',')[0]  -- NULL in StarRocks; in SparkSQL [0] returns 'a'
select split('a,b,c',',')[1]  -- 'a' in StarRocks; equivalent to SparkSQL's [0]

4. Query Performance Improvements

Before reconstruction, complex queries took several minutes, and some ticket-related metrics took up to 30 minutes. After migration to StarRocks, most queries complete in under 2 seconds, even the heaviest finish in roughly 10 seconds, and the platform now sustains double the query volume.

New features such as period‑over‑period calculations and dimension drill‑down now run almost instantly thanks to StarRocks’ concurrency and materialized view support.

5. Future Optimization Directions

a. Replace large detail storage for UV distinct‑count metrics with Bitmap indexes to reduce space and speed up queries.

b. Apply aggregate models to full‑/incremental tables for pre‑aggregation, decreasing scanned rows.

c. Transition temporary‑table based aggregations to materialized views to simplify the pipeline.
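Direction (a) could be sketched with a BITMAP_UNION column, which keeps an exact distinct count without retaining detail rows (schema is illustrative):

```sql
CREATE TABLE ads_uv_daily (
    dt        DATE NOT NULL,
    page_id   VARCHAR(64),
    uv_bitmap BITMAP BITMAP_UNION   -- visitor ids merged into one bitmap per key
) ENGINE = OLAP
AGGREGATE KEY(dt, page_id)
DISTRIBUTED BY HASH(page_id) BUCKETS 8;

-- Query-time exact distinct count:
SELECT dt, BITMAP_UNION_COUNT(uv_bitmap) AS uv
FROM ads_uv_daily
GROUP BY dt;
```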


Tags: performance optimization, SQL, StarRocks, data warehouse, ETL, metrics platform