Migrating a Game Data Platform to StarRocks: Architecture, Performance Gains, and Operational Benefits
This article describes how the gaming company Boke City rebuilt its comprehensive data service platform by replacing a CDH‑based Impala solution with StarRocks, detailing the architectural changes, performance benchmark results, and the resulting improvements in query speed, real‑time data updates, and operational simplicity.
Boke City (波克城市) is a gaming company that for years ran a CDH‑based data platform built on Apache Impala and Hive. As data volume and business complexity grew, that platform ran into scalability, performance, and operational problems.
To address these issues, the company evaluated several OLAP engines (Impala, Druid, ClickHouse, StarRocks) and selected StarRocks for its high‑performance read/write capabilities, strong community support, and MySQL‑compatible protocol.
Key architectural changes include:
Replacing the CDH stack with StarRocks as the core analytical engine.
Consolidating data ingestion to three pipelines: DataX → StarRocks for batch loads, Canal + Routine Load for real‑time MySQL binlog, and Flink → StarRocks for Kafka streams.
Implementing a layered data model (ODS, DWD, DIM, DWS, ADS) directly in StarRocks, while retaining HDFS and Hive only for cold backup.
Simplifying permission management by using StarRocks’ built‑in user authentication, eliminating the need for Sentry + LDAP.
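To make the Canal + Routine Load pipeline in the list above more concrete, here is a minimal sketch that renders a StarRocks `CREATE ROUTINE LOAD` statement from Python. It assumes Canal publishes binlog changes as JSON messages to a Kafka topic. The database, job, table, column, broker, and topic names are all hypothetical; the article does not show the company's actual job definitions.

```python
def routine_load_sql(db, job, table, columns, brokers, topic, concurrency=3):
    """Render a CREATE ROUTINE LOAD statement for a JSON Kafka topic.

    Every identifier passed in is illustrative; none comes from the article.
    """
    column_list = ", ".join(columns)
    return (
        f"CREATE ROUTINE LOAD {db}.{job} ON {table}\n"
        f"COLUMNS ({column_list})\n"
        "PROPERTIES (\n"
        '    "format" = "json",\n'
        f'    "desired_concurrent_number" = "{concurrency}"\n'
        ")\n"
        "FROM KAFKA (\n"
        f'    "kafka_broker_list" = "{brokers}",\n'
        f'    "kafka_topic" = "{topic}"\n'
        ");"
    )


# Hypothetical example: load player login events into an ODS-layer table.
example_sql = routine_load_sql(
    db="ods",
    job="player_login_job",
    table="player_login",
    columns=["player_id", "login_time", "server_id"],
    brokers="kafka-1:9092",
    topic="canal_player_login",
)
print(example_sql)
```

Because StarRocks speaks the MySQL protocol, a statement like this could be submitted from any MySQL client or driver. A real job would need more properties (for example, JSON path mappings and offsets), which are omitted here.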
Performance benchmarks compared StarRocks with Impala, a generic cloud DB, and ClickHouse on six representative SQL queries (including point lookups, date filters, multi‑dimensional aggregations, and window functions). StarRocks was the fastest engine on four of the six queries and within a few hundredths of a second of ClickHouse on the other two:
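| Query | Impala | CloudDB | ClickHouse | StarRocks |
|-------|--------|---------|------------|-----------|
| SQL1  | 5.2s   | 0.3s    | 0.16s      | 0.12s     |
| SQL2  | 5.2s   | 138s    | 0.02s      | 0.04s     |
| SQL3  | 0.18s  | 6.9s    | 0.04s      | 0.05s     |
| SQL4  | 41.8s  | 46s     | 18.8s      | 7.5s      |
| SQL5  | 95s    | 103s    | 58.5s      | 9.5s      |
| SQL6  | 47.2s  | 62s     | 33.3s      | 7.2s      |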
After migration, queries ran more than three times faster, real‑time ingestion latency dropped to the order of seconds, and operational overhead fell sharply thanks to StarRocks’ simple FE/BE architecture, automatic load balancing, and thorough documentation.
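The "more than three times" figure can be checked against the benchmark numbers. The sketch below transcribes the timings and computes per-query speedups. It assumes the unlabeled rows for SQL2 through SQL6 follow the same engine order as SQL1 (Impala, CloudDB, ClickHouse, StarRocks); the function and variable names are illustrative.

```python
# Benchmark timings in seconds, transcribed from the benchmark results.
# Assumption: SQL2-SQL6 list engines in the same order as SQL1.
ENGINES = ("Impala", "CloudDB", "ClickHouse", "StarRocks")
TIMINGS = {
    "SQL1": (5.2, 0.3, 0.16, 0.12),
    "SQL2": (5.2, 138, 0.02, 0.04),
    "SQL3": (0.18, 6.9, 0.04, 0.05),
    "SQL4": (41.8, 46, 18.8, 7.5),
    "SQL5": (95, 103, 58.5, 9.5),
    "SQL6": (47.2, 62, 33.3, 7.2),
}


def speedup_over(baseline, target="StarRocks"):
    """Return {query: baseline_time / target_time} for every query."""
    b = ENGINES.index(baseline)
    t = ENGINES.index(target)
    return {query: row[b] / row[t] for query, row in TIMINGS.items()}


def fastest_engine():
    """Return {query: name of the engine with the lowest time}."""
    return {query: ENGINES[row.index(min(row))] for query, row in TIMINGS.items()}


impala_speedups = speedup_over("Impala")
for query, ratio in impala_speedups.items():
    print(f"{query}: StarRocks is {ratio:.1f}x faster than Impala")
print("Fastest engine per query:", fastest_engine())
```

Under that column-order assumption, the smallest StarRocks-over-Impala ratio is on SQL3 (0.18s versus 0.05s, about 3.6x), which is consistent with the "more than three times" claim. ClickHouse comes out slightly ahead of StarRocks on SQL2 and SQL3.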
Additional benefits include strong MySQL compatibility (which shortens the learning curve), flexible data modeling (wide tables, star/snowflake schemas, materialized views), and robust DML support (a delete‑and‑insert model that handles more than 95% of update/delete scenarios).
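The delete-and-insert idea mentioned above can be illustrated with a toy in-memory table: an update never rewrites a row in place, it marks the old version as deleted and appends a new one, while a primary-key index tracks the live version. This is a simplified sketch of the general technique, not StarRocks' storage engine; the class and field names are hypothetical.

```python
class PrimaryKeyTable:
    """Toy delete-and-insert store keyed by primary key."""

    def __init__(self):
        self.rows = []        # append-only list of (key, row) versions
        self.deleted = set()  # positions of superseded or deleted versions
        self.index = {}       # primary key -> position of the live version

    def upsert(self, key, row):
        """Insert a row, or replace the live version if the key exists."""
        old_pos = self.index.get(key)
        if old_pos is not None:
            self.deleted.add(old_pos)  # "delete" the old version
        self.rows.append((key, row))   # "insert" the new version
        self.index[key] = len(self.rows) - 1

    def delete(self, key):
        """Mark the live version of a key as deleted, if there is one."""
        pos = self.index.pop(key, None)
        if pos is not None:
            self.deleted.add(pos)

    def scan(self):
        """Return only live rows, skipping positions marked deleted."""
        return [kv for i, kv in enumerate(self.rows) if i not in self.deleted]


players = PrimaryKeyTable()
players.upsert(1001, {"level": 10})
players.upsert(1002, {"level": 3})
players.upsert(1001, {"level": 11})  # update: old version marked deleted
players.delete(1002)
print(players.scan())  # [(1001, {'level': 11})]
```

Reads stay cheap in this scheme because a scan only has to skip marked positions instead of merging multiple versions of the same key at query time.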
The company plans to extend StarRocks to other business areas such as real‑time advertising analytics, user profiling, and unified interactive query services, further consolidating a "large middle platform, small front end" data strategy across its global gaming portfolio.
DataFunTalk