iQIYI’s Database Selection, Optimization, and Management Practices

This article describes how iQIYI selects databases: the evaluation criteria, the databases in use (MySQL, TiDB, Redis, Couchbase, and the in‑house HiKV), and the optimization, high‑availability, auditing, and operational management techniques applied to each system.

Architecture Digest
1. Dimensions of Database Technology Selection

When choosing a database, iQIYI first asks who is making the decision—procurement, DBA, or application developers—and what their primary concerns are. Procurement focuses on cost (storage, network). DBAs prioritize operational cost, stability, performance, scalability, security, and audit requirements. Developers care about stability, performance, scalability, and ease of integration.

2. iQIYI’s Database Portfolio

The company uses a wide range of databases:

MySQL – core relational database.

TiDB – HTAP database, discussed in a separate talk.

Redis – KV store, standard in internet companies.

Couchbase – less common in China, used extensively at iQIYI.

Other systems such as MongoDB, graph databases, and the self‑developed KV store HiKV.

Big‑data analysis platforms like Hive and Impala.

These databases are classified along two axes: interface (SQL or NoSQL) and workload type (OLTP or OLAP). MySQL and TiDB fall into the OLTP SQL category, Redis into NoSQL, and Hive and Impala into OLAP. TiDB is also an HTAP system, so it covers OLAP workloads in addition to OLTP.

3. iQIYI’s Database Optimization and Enhancements

MySQL

Basic Architecture: master‑slave with semi‑synchronous replication, with a weekly full backup plus daily incremental backups.
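A weekly full plus daily incremental schedule determines which backups a restore has to replay. The sketch below works that out for a target date. It assumes the full backup runs on Sundays and that each incremental is labeled by the day it was taken; neither detail comes from the article, and `restore_chain` is a hypothetical helper name.

```python
from datetime import date, timedelta


def restore_chain(target: date, full_weekday: int = 6):
    """Return (full_backup_date, [incremental_dates]) needed to restore `target`.

    full_weekday uses Python's convention (Monday=0 ... Sunday=6).
    Sunday is an assumption made for illustration only.
    """
    # Days elapsed since the most recent full backup (0 on the full-backup day).
    days_since_full = (target.weekday() - full_weekday) % 7
    full = target - timedelta(days=days_since_full)
    # Every daily incremental taken after the full backup, up to the target day.
    incrementals = [full + timedelta(days=i) for i in range(1, days_since_full + 1)]
    return full, incrementals
```

The longest chain is one full backup plus six incrementals, which is one reason restore time is worth optimizing.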

Backup Optimization: improved XtraBackup to cut full‑restore time from 5 hours to about 100 minutes, and added single‑table recovery.

DDL/DML Tools: integrated gh‑ost and oak‑online‑alter‑table with latency monitoring, so that operations pause when replication lag is high.
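The lag-aware pausing described here can be sketched as a loop that checks replica lag before applying each batch of work. This is an illustration of the idea, not iQIYI's tooling or gh-ost's internals: `run_throttled`, its parameters, and the 5-second threshold are all hypothetical.

```python
import time


def run_throttled(chunks, apply_chunk, get_lag_seconds,
                  max_lag_seconds=5.0, poll_interval=0.5, sleep=time.sleep):
    """Apply each chunk in order, waiting while replica lag is above the threshold.

    get_lag_seconds is any callable that reports current replication lag;
    how the lag is measured is left to the caller. Returns the number of
    times the loop had to pause.
    """
    pauses = 0
    for chunk in chunks:
        # Hold the next batch until the replicas have caught up.
        while get_lag_seconds() > max_lag_seconds:
            pauses += 1
            sleep(poll_interval)
        apply_chunk(chunk)
    return pauses
```

Checking lag once per chunk bounds how far replicas can fall behind to roughly one chunk of work, so chunk size and threshold should be tuned together.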

High Availability: replaced the default MHA with a master‑agent model. Agents send heartbeats to the master, and a failover triggers binlog compensation. The master runs as Raft groups, so failover does not depend on DNS and supports intra‑region, inter‑region, and cross‑region scenarios.
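The heartbeat part of this design can be illustrated with a small timeout-based detector. It is a sketch only: `HeartbeatMonitor` and the timeout value are hypothetical, and the Raft consensus and binlog compensation steps are not modeled.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time per agent and flags silent agents."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # agent id -> timestamp of latest heartbeat

    def record(self, agent_id, now):
        """Called whenever a heartbeat from agent_id arrives."""
        self.last_seen[agent_id] = now

    def suspected_down(self, now):
        """Agents whose latest heartbeat is older than the timeout."""
        return sorted(
            agent for agent, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

In a real system a suspected agent would normally be confirmed by more than one observer before a failover starts, which is the role a consensus group can play.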

Scalability: adopted the ShardingSphere SDK and proxy solutions; some workloads were migrated to TiDB because scaling the proxy layer was complex.

Auditing: developed a plugin that streams full SQL statements to Kafka and then to ClickHouse for analysis; ring‑buffer based buffering keeps the performance impact below 2%.
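A ring buffer keeps the audit path cheap because the query thread only writes into a preallocated slot, while a separate consumer drains batches toward Kafka. The sketch below is a single-threaded illustration with hypothetical names. Dropping statements when the buffer is full is an assumption made here to keep the producer from blocking; the article does not say how the plugin handles overflow.

```python
class AuditRingBuffer:
    """Fixed-capacity FIFO buffer backed by a preallocated list."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0      # index of the oldest buffered statement
        self._size = 0      # number of buffered statements
        self.dropped = 0    # statements rejected because the buffer was full

    def push(self, statement):
        """Buffer one statement; return False (and count a drop) if full."""
        if self._size == self._capacity:
            self.dropped += 1
            return False
        tail = (self._head + self._size) % self._capacity
        self._slots[tail] = statement
        self._size += 1
        return True

    def drain(self, max_items):
        """Remove and return up to max_items statements, oldest first."""
        batch = []
        while self._size and len(batch) < max_items:
            batch.append(self._slots[self._head])
            self._slots[self._head] = None
            self._head = (self._head + 1) % self._capacity
            self._size -= 1
        return batch
```

A production version would need synchronization or a lock-free design between the producer and the consumer; this sketch omits that.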

Tiered Storage: combined MySQL with TiDB or TokuDB for hot‑cold data separation, using SDK + proxy as a unified access layer.

Redis

Deployed master‑slave with Sentinel across multiple data centers, with custom Sentinel configurations to avoid split‑brain. Implemented a real‑time backup that mirrors Redis data to ScyllaDB for recovery. Optimized failover handling by shortening the DNS TTL and introducing a Redis Name Service (RNS) that obtains the topology from Sentinel and hands direct IP addresses to clients. Enhanced the Jedis client to rebuild connections only for failed shards, which reduces the impact of a failover on overall QPS. Added automatic scaling and health‑check mechanisms, and built a proxy that buffers writes to Kafka for cross‑cluster replication.
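The per-shard reconnect idea can be sketched independently of Jedis. In the illustration below, `ShardedPool` and `on_failover` are hypothetical names, and `connect` stands in for whatever creates a real client connection. The point is that a failover on one shard replaces only that shard's connection.

```python
class ShardedPool:
    """Holds one connection per shard and rebuilds them individually."""

    def __init__(self, shard_addrs, connect):
        self._addrs = dict(shard_addrs)   # shard name -> "host:port"
        self._connect = connect           # callable: address -> connection
        self._conns = {
            shard: connect(addr) for shard, addr in self._addrs.items()
        }

    def connection(self, shard):
        return self._conns[shard]

    def on_failover(self, shard, new_addr):
        """Point one shard at its new master; other shards are untouched."""
        self._addrs[shard] = new_addr
        self._conns[shard] = self._connect(new_addr)
```

With a name service that reports the new master address for a shard, the client can call `on_failover` for that shard alone instead of tearing down every connection.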

Couchbase

Used primarily as a high‑performance KV store with two bucket types: Memcached (no persistence) and Couchbase (persistent, with JSON documents and replicas). Client‑side vBucket mapping enables transparent failover. Clusters are managed with Erlang‑based tools and support several replication topologies (unidirectional, bidirectional, star, ring, and chain). Implemented dual‑cluster XDCR for active‑active failover, and developed a Java SDK to switch writes between clusters during failures.
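Client-side vBucket mapping means the client hashes each key to a vBucket and looks up the owning node in a cluster map, so a failover only requires a new map rather than a change to the keys. The sketch below assumes a CRC32-based hash and 1024 vBuckets, which is how Couchbase clients are commonly described; treat the exact hash as an assumption and check the SDK documentation before relying on it. `ClusterMap` and `fail_over` are hypothetical names.

```python
import zlib

NUM_VBUCKETS = 1024  # assumed default vBucket count


def vbucket_for_key(key: bytes, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Map a key to a vBucket id using a CRC32-derived hash (assumed scheme)."""
    crc = zlib.crc32(key) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


class ClusterMap:
    """vBucket id -> node name, as a client would cache it."""

    def __init__(self, vbucket_to_node):
        self.vbucket_to_node = list(vbucket_to_node)

    def node_for(self, key: bytes) -> str:
        vb = vbucket_for_key(key, len(self.vbucket_to_node))
        return self.vbucket_to_node[vb]

    def fail_over(self, failed: str, replacement: str) -> None:
        """Simulate receiving a new map where `replacement` owns the failed node's vBuckets."""
        self.vbucket_to_node = [
            replacement if node == failed else node
            for node in self.vbucket_to_node
        ]
```

Because the key-to-vBucket step never changes, only the vBucket-to-node table is updated on failover, and application code keeps using the same keys.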

HiKV (Self‑Developed KV Store)

Built on ScyllaDB’s architecture, but replaces the LSM‑Tree engine with a custom SSD‑backed KV engine: keys reside in memory and values on disk, with a fixed‑length index entry (64 bytes) and red‑black‑tree lookup. Supports multiple replicas, multiple data centers, and multi‑master writes. HiKV now serves about 30% of the workload that previously ran on Couchbase, which reduces storage costs.
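The keys-in-memory, values-on-disk layout can be illustrated with a toy store that appends values to a file and keeps only their location in memory. This is not HiKV's engine: `TinyKV` is a hypothetical name, a Python dict stands in for the red-black tree, and the fixed 64-byte index entry, replication, and space reclamation are not modeled.

```python
import os


class TinyKV:
    """Toy KV store: in-memory index of (offset, length), values in a file."""

    def __init__(self, path):
        # "w+b" creates or truncates the file and allows both reads and writes.
        self._file = open(path, "w+b")
        self._index = {}  # key -> (offset, length); stand-in for a tree index

    def put(self, key, value: bytes):
        # Always append; an overwrite just repoints the index at the new copy.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._file.flush()
        self._index[key] = (offset, len(value))

    def get(self, key):
        entry = self._index.get(key)
        if entry is None:
            return None
        offset, length = entry
        # One seek and one read per lookup, since the index is in memory.
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._file.close()
```

A lookup costs one in-memory index probe and a single disk read, which suits SSDs; the trade-off is that memory use grows with the number of keys.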

4. Database Operations and Management Platform

iQIYI evolved its database operations from manual DBA scripts to a self‑service private cloud portal, and then to a fully web‑based UI that automates 90% of routine tasks. Experienced DBAs encapsulated their troubleshooting procedures into one‑click diagnostic tools. Additional capabilities include proactive alerting, intelligent chatbot assistance, instance tagging for load balancing, and automated resource scheduling.

5. Practical Database Selection Guidance

A decision tree (provided in the original article) helps teams choose between relational and NoSQL options based on data volume, QPS, latency, backup needs, storage engine preferences, and proxy requirements.

Key considerations include validating true requirements, avoiding over‑engineering, being willing to abandon unsuitable solutions, evaluating the need for custom development, and embracing open‑source technologies.