iQIYI Technical Product Team
Dec 21, 2018 · Artificial Intelligence
CPU-Based Optimization of Deep Learning Inference Services
To ease GPU scarcity, iQIYI’s cloud platform migrated deep‑learning inference services to CPUs and applied optimizations at the system level (MKL‑DNN, OpenVINO), the application level (thread count, batch size, NUMA binding), and the algorithm level (pruning, quantization), delivering 1‑9× speedups across thousands of cores while preserving latency and accuracy.
CPU · MKL-DNN · OpenVINO
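As a taste of the application-level tuning the article describes, the sketch below shows how CPU inference services are commonly configured before a framework loads: capping OpenMP threads to the physical core count and pinning them via Intel OpenMP environment variables. This is an illustrative example of the general technique, not iQIYI's exact configuration; the specific values are assumptions.

```python
import os

# Assume 2-way hyper-threading: use one worker thread per physical core,
# since MKL-DNN kernels typically gain little from logical cores.
physical_cores = (os.cpu_count() or 2) // 2 or 1

# Standard Intel OpenMP controls; they must be set before the
# deep-learning framework (and its MKL runtime) is imported.
os.environ["OMP_NUM_THREADS"] = str(physical_cores)
os.environ["KMP_AFFINITY"] = "granularity=fine,compact,1,0"  # pin threads to cores
os.environ["KMP_BLOCKTIME"] = "1"  # ms a thread spins before sleeping

print(f"inference configured for {physical_cores} threads")
```

On multi-socket machines, services are additionally bound to one NUMA node (e.g. with `numactl --cpunodebind`) so each worker's threads and memory stay local, which is one of the NUMA measures the abstract mentions.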