Tag: Machine Learning Inference

Xiaohongshu Tech REDtech
May 15, 2023 · Artificial Intelligence

GPU-Accelerated Inference Optimization for Large-Scale Machine Learning at Xiaohongshu

Xiaohongshu migrated its recommendation, advertising, and search inference pipeline to GPU‑centric hardware and deployed a custom TensorFlow‑Core Lambda service. It also applied system‑level, virtualization, and compute‑level optimizations, including NUMA binding, kernel fusion, dynamic scaling, and FP16 quantization. Together, these changes achieved roughly 30× compute capacity growth, over 10% user‑metric gains, and more than 50% cluster‑resource savings.

Deep Learning · GPU optimization · Machine Learning Inference
20 min read