Practical Experience of Using ClickHouse for Ctrip Hotel Data Warehouse
This article details Ctrip's practical implementation of ClickHouse for its hotel data warehouse, covering background, technical selection, integration tooling, code optimization techniques, exception and server fault handling, and the resulting performance gains.
Author Introduction
Xiaoqin, senior data manager at Ctrip, is responsible for hotel BI and data warehouse work, focusing on big data applications for many years.
1. Background
With rapid business growth, Ctrip hotel data has accumulated to about 3 TB of daily traffic, plus a massive amount of order, price, volume, and status data. Although the existing Hive (Spark engine) execution speed is relatively fast, overseas business requires data to be ready several hours earlier than domestic, making performance improvement urgent. In early 2020, we began researching ClickHouse for data‑warehouse applications.
This article shares ClickHouse practice from five aspects: technical selection, integration development environment packaging, ClickHouse code optimization techniques, exception handling, and server fault handling.
2. Technical Research and Solution Selection
1) No internal ClickHouse cluster was available; the original verification cluster was being decommissioned.
2) A ClickHouse cluster was built on VMware for practice, with some colleagues focusing on single‑node syntax verification and others on cluster deployment, configuration, and IDE packaging.
3) By March 2020, a VMware‑based ClickHouse cluster was completed and four physical servers (256 GB RAM, 40 cores, 3.5 TB disk) were ready. To ensure a smooth production transition without adding pressure to the production DB, we synchronized data from the Hive ODS layer to the ClickHouse ODS layer. The architecture is shown in the diagram below.
3. Integration Development Environment Packaging
3.1 Data Synchronization Tool Packaging
We found that data synchronization consumed far more time than data computation, so we investigated ClickHouse import methods. One approach is:
cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT ORC"Based on this, we created a new orc2ck.sh tool with caching and batch processing, improving synchronization speed by more than 500%.
3.2 Execution Tool Packaging
To improve development efficiency and reduce code redundancy, we packaged a ClickHouse execution tool ck.sh . The execution environment is illustrated below.
4. ClickHouse Code Optimization Techniques
1) Place small tables on the right side of a JOIN to reduce memory consumption.
2) Use IN instead of JOIN to improve execution speed.
3) Reduce data scanning by adding filter logic, which lowers memory usage and speeds up execution.
5. Exception Handling
Code: 252, e.displayText() = DB::Exception: Too many parts (301). Merges are processing significantly slower than inserts.The issue requires analysis of the merge process (see diagram below). Insert speed should be limited to about 1 M rows/s.
Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceededThis error occurs when the query exceeds allocated memory. Solutions:
When server memory is abundant, increase the memory quota via max_memory_usage .
When memory is limited, offload excess data to disk using max_bytes_before_external_group_by and max_bytes_before_external_sort , though this reduces speed.
If these methods fail, review and optimize the code as described in the optimization section.
6. Server Fault Handling
Fault Background: A fault drill forced a ClickHouse server restart, causing the service to fail to start.
Solution: Examine the ClickHouse error log ( clickhouse-server.err.log ) to locate the issue. Two approaches were used:
Delete or move the table’s data files (illustrated in the error‑log screenshot).
Rebuild the table metadata (the more proper method).
7. Summary
By the first half of 2020, Ctrip hotel order themes and P1 system reports were fully implemented, achieving performance improvements of over 200% (average around 400%), effectively solving most application scenarios. Future work includes integrating more themes, further leveraging ClickHouse’s performance, and exploring Flink + ClickHouse for real‑time data warehousing.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.