
iQIYI’s Adoption of Apache Kylin for OLAP: Architecture, Optimizations, and Future Plans

The article details iQIYI’s migration from a Hive + MySQL OLAP stack to Apache Kylin, describing the system’s architecture, typical use cases, performance gains, independent HBase deployment, service platform for monitoring, and future plans such as automated cube building and clustering.

Big Data Technology Architecture

At the "718 Apache Kylin Meetup" livestream, iQIYI senior R&D engineer Lin Hao described how iQIYI replaced its traditional Hive + MySQL model with Apache Kylin in more than 20 business scenarios, including BI and recommendation. The talk covered concrete applications, measured results, pitfalls, optimization experience, and future plans such as making cubes easier to adjust.

iQIYI OLAP Service Evolution

The evolution of iQIYI's big‑data OLAP service is summarized in an architecture diagram that divides data processing into several layers. A collection platform gathers business event logs; the data lands in HDFS for offline processing and in Kafka for real‑time streams. Several analysis engines sit on top: Hive for PB‑scale offline analysis, Kylin for daily reports on relatively fixed dimensions, Impala for ad‑hoc queries, Kudu for real‑time updates and analysis, and Druid for event‑stream processing. Above these engines, a unified SQL router (Pilot) directs each query to the appropriate engine and serves the BI, workflow, custom SQL, and real‑time analysis platforms.
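The routing idea can be illustrated with a small rule table. This is only a sketch: the talk does not describe how Pilot actually decides, so the QueryProfile fields and the rule order below are hypothetical and simply mirror the engine roles listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryProfile:
    """Hypothetical traits a router might inspect; not Pilot's real model."""
    is_event_stream: bool = False  # aggregation over a live event stream
    needs_realtime: bool = False   # must see rows updated moments ago
    matches_cube: bool = False     # dimensions/measures covered by a Kylin cube
    interactive: bool = False      # a person is waiting for an ad-hoc answer


def route(profile: QueryProfile) -> str:
    """Pick an engine name following the roles described in the text."""
    if profile.is_event_stream:
        return "druid"
    if profile.needs_realtime:
        return "kudu"
    if profile.matches_cube:
        return "kylin"
    if profile.interactive:
        return "impala"
    return "hive"  # default: large offline analysis


print(route(QueryProfile(matches_cube=True, interactive=True)))
```

In this sketch a query that matches a pre-built cube goes to Kylin even when it is interactive, reflecting the assumption that a cube lookup is the cheapest way to answer it.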

Kylin Typical Requirements

A typical scenario is user‑behavior analysis, e.g., counting homepage displays for a given day. The following SQL expresses the query:

SELECT COUNT(1) as cnt
FROM hive_table_user_act
WHERE act_type = 'display'
AND page = 'home'
AND dt = '2020-07-18';

Requirements of this kind share four traits: the dimensions are fixed, freshness demands are modest (T+1 is usually acceptable), interactivity demands are high (answers are expected within seconds), and the data volume is massive (hundreds of billions of rows per day).

Limitations of the Classic Hive + MySQL Architecture

Pre‑processing is slow: the total run time of the Hive/Spark jobs grows with the number of pre‑compute SQL statements and can exceed a day.

Poor scalability: large result sets cannot fit into a single MySQL instance, so the approach breaks down as the pre‑computed results grow.

Difficult to change: each new analyst requirement forces engineers to modify the pre‑processing SQL, so iteration is slow and development effort is heavy.

Resource waste: hand‑written pre‑compute SQL statements often duplicate work and are not well tuned.

Kylin Introduction and Benefits

To overcome these drawbacks, iQIYI introduced Kylin, a Hadoop‑based SQL analysis engine for fixed‑report scenarios that pre‑computes cubes and stores them in HBase, enabling sub‑second query response. Using Kylin for user‑behavior analysis yielded:
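The pre‑computation idea can be shown with a toy example. The sketch below is not Kylin's implementation: it keeps one in‑memory counter per dimension subset (a "cuboid") instead of HBase tables, and the sample rows are invented. It only illustrates why a fixed‑dimension COUNT query becomes a lookup once the cube exists.

```python
from collections import Counter
from itertools import combinations

# Toy fact rows standing in for hive_table_user_act (values are made up).
ROWS = [
    {"act_type": "display", "page": "home", "dt": "2020-07-18"},
    {"act_type": "display", "page": "home", "dt": "2020-07-18"},
    {"act_type": "click", "page": "home", "dt": "2020-07-18"},
    {"act_type": "display", "page": "detail", "dt": "2020-07-18"},
    {"act_type": "display", "page": "home", "dt": "2020-07-17"},
]
DIMENSIONS = ("act_type", "page", "dt")


def build_cube(rows, dimensions):
    """Pre-compute COUNT(1) for every subset of dimensions (one cuboid each)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cube[dims] = Counter(tuple(row[d] for d in dims) for row in rows)
    return cube


def query_count(cube, dimensions, filters):
    """Answer an equality-filtered COUNT(1) with a single cuboid lookup."""
    dims = tuple(d for d in dimensions if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims][key]  # Counter returns 0 for unseen combinations


cube = build_cube(ROWS, DIMENSIONS)
cnt = query_count(
    cube,
    DIMENSIONS,
    {"act_type": "display", "page": "home", "dt": "2020-07-18"},
)
print(len(cube), cnt)
```

With n dimensions the full cube has 2^n cuboids (8 here), which is why real deployments prune dimension combinations rather than materialize all of them.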

Faster pre‑processing: build time dropped to roughly one‑tenth of the original (from a full day to 2.5 hours).

Good scalability: building a 9‑dimensional cube on a table with billions of rows per day is straightforward.

Easy adjustment: dimensions and measures can be changed via a drag‑and‑drop UI without code changes.

Cost reduction: about 50 % of compute resources saved.

Independent HBase Cluster Deployment

Initially Kylin shared HBase clusters with other workloads, which led to stability issues and wasted resources. iQIYI switched to a dedicated HBase cluster for Kylin cubes and configured build jobs to run across clusters. This isolation improved stability (downtime fell by 75%) and query speed (30% faster on average), and it allowed read‑oriented HBase tuning such as larger read caches and fixed region splits.

Kylin Service Platform

The platform aggregates metadata, tasks, and query information from all clusters, providing a unified API and web UI for diagnosis and optimization. It offers features like cube size monitoring, failure analysis, and intelligent diagnostics (e.g., detecting excessive cube growth, frequent job failures, or query slow‑downs).

Cube Lifecycle Management

Without management, Kylin can put pressure on HBase through large numbers of tables and regions and through oversized cubes. The platform automatically suggests TTL settings and merge strategies and flags oversized cubes, helping to keep HBase healthy.
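A lifecycle check of this kind can be sketched as a few threshold rules. Everything below is an assumption for illustration: the CubeStats fields, the threshold values, and the wording of the suggestions are hypothetical and are not taken from iQIYI's platform.

```python
from dataclasses import dataclass


@dataclass
class CubeStats:
    """Hypothetical per-cube metrics a platform might collect."""
    name: str
    size_gb: float
    segment_count: int
    days_since_last_query: int


# Illustrative thresholds only; real values would depend on the cluster.
MAX_SIZE_GB = 1024
MAX_SEGMENTS = 30
IDLE_DAYS = 90


def suggest(cube: CubeStats) -> list:
    """Return human-readable lifecycle suggestions for one cube."""
    suggestions = []
    if cube.size_gb > MAX_SIZE_GB:
        suggestions.append("oversized: review dimensions or shorten retention")
    if cube.segment_count > MAX_SEGMENTS:
        suggestions.append("many segments: enable or tighten auto-merge")
    if cube.days_since_last_query > IDLE_DAYS:
        suggestions.append("idle: set a TTL or retire the cube")
    return suggestions


big_idle = CubeStats("user_act_cube", size_gb=2048, segment_count=12,
                     days_since_last_query=120)
for line in suggest(big_idle):
    print(f"{big_idle.name}: {line}")
```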

Intelligent Task Diagnosis

Previously, task failures were shown only as raw error messages in the Kylin UI, requiring operators to open tickets. The new diagnosis module matches failures against 18 known error patterns, presenting clear causes, remediation steps, and links to documentation, enabling self‑service troubleshooting.
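Matching failures against known patterns can be sketched with a list of regular expressions. The three patterns and their advice below are invented examples; the talk mentions 18 patterns but does not list them, so none of these should be read as the platform's actual rules.

```python
import re
from typing import NamedTuple, Optional


class Diagnosis(NamedTuple):
    cause: str
    remedy: str


# Illustrative patterns only; the real rule set is not given in the talk.
PATTERNS = [
    (
        re.compile(r"OutOfMemoryError|Java heap space"),
        Diagnosis("A build step ran out of memory.",
                  "Raise the memory for that step or reduce the cube's dimensions."),
    ),
    (
        re.compile(r"Permission denied", re.IGNORECASE),
        Diagnosis("The job account cannot access a file or table.",
                  "Grant the job account access to the source data."),
    ),
    (
        re.compile(r"Table .* not found", re.IGNORECASE),
        Diagnosis("The source table is missing.",
                  "Check that the upstream table or partition exists."),
    ),
]


def diagnose(log_text: str) -> Optional[Diagnosis]:
    """Return the first matching diagnosis, or None for unknown errors."""
    for pattern, diagnosis in PATTERNS:
        if pattern.search(log_text):
            return diagnosis
    return None


result = diagnose("java.lang.OutOfMemoryError: Java heap space")
print(result.cause if result else "Unknown error: escalate to operators")
```

Returning None for unmatched logs keeps the fallback explicit: only failures outside the known patterns still need a ticket.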

Parameter Optimizations

Global dictionary building is parallelized in Kylin 3.0, which cuts build time to about one‑third of what it was in the previous version.

For high‑cardinality dimensions, setting kylin.engine.mr.uhc-reducer-count=5 lets dictionary construction run on five reducers concurrently, which substantially shortens build time for large dimensions.

Future Outlook

Automatic cube construction: analyze existing Hive queries, discover high‑affinity tables, and build cubes automatically.

Clustered deployment: consolidate instances to improve resource utilization and stability.

Platformization: further lower the cost of cube building by offering a managed service for business teams.
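The first item above, automatic cube construction, could begin by mining a query log for columns that are grouped on repeatedly. The sketch below is a deliberately simplified stand‑in: the sample queries, the regular expression, and the min_hits threshold are all assumptions, and it looks only at GROUP BY columns on single‑table queries rather than at table affinity.

```python
import re
from collections import Counter, defaultdict

# Invented sample of historical Hive queries.
QUERY_LOG = [
    "SELECT dt, page, COUNT(1) FROM user_act GROUP BY dt, page",
    "SELECT dt, COUNT(1) FROM user_act GROUP BY dt",
    "SELECT dt, page, act_type, COUNT(1) FROM user_act GROUP BY dt, page, act_type",
    "SELECT city, COUNT(1) FROM orders GROUP BY city",
]

# Captures the table name and the GROUP BY column list of simple queries.
PATTERN = re.compile(r"FROM\s+(\w+).*?GROUP BY\s+(.+)$", re.IGNORECASE)


def suggest_dimensions(queries, min_hits=2):
    """Map each table to the columns grouped on at least min_hits times."""
    hits = defaultdict(Counter)
    for sql in queries:
        match = PATTERN.search(sql.strip())
        if not match:
            continue  # skip queries this simple parser cannot handle
        table, group_by = match.groups()
        for column in group_by.split(","):
            hits[table][column.strip()] += 1
    return {
        table: sorted(col for col, n in counter.items() if n >= min_hits)
        for table, counter in hits.items()
    }


candidates = suggest_dimensions(QUERY_LOG)
print(candidates)
```

A production version would need a real SQL parser and would also have to weigh measures, joins, and query frequency before proposing a cube.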

Author: Lin Hao, senior R&D engineer at iQIYI, leading the big‑data OLAP team since 2016.
