Evolution and Practice of the 58.com AI Algorithm Platform (WPAI)
The article details the development, architecture, and optimization of 58.com’s AI algorithm platform (WPAI), covering its background, overall design, large‑scale distributed machine learning, deep‑learning platform features, inference performance enhancements, GPU resource scheduling improvements, and future directions.
Background
As the AI wave drives industry transformation, 58.com's AI Lab has been building the WPAI platform since late 2017 to improve AI R&D efficiency across its product lines. The platform initially supported large-scale distributed models such as XGBoost, FM, and LR.
Overall Architecture
WPAI comprises three major functions: deep learning, traditional machine learning, and vector retrieval. It supports TensorFlow, PyTorch, Caffe, PaddlePaddle, and Faiss; runs on Kubernetes and Docker for unified GPU/CPU resource management; and integrates the WubaNLP NLP platform and the Phoenix image algorithm platform.
Large-Scale Distributed Machine Learning
The platform provides feature engineering, model training, and online prediction services. It supports distributed XGBoost (via RABIT), distributed FM (via a parameter server), and hybrid models, enabling training on massive datasets with high-dimensional features.
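To make the FM piece concrete, the sketch below scores one sparse sample with the standard factorization machine formula, including the O(k·n) trick that makes the pairwise-interaction term tractable at high dimensionality. All names and data layouts here are hypothetical illustrations, not 58.com's parameter-server implementation.

```python
# Illustrative FM scoring (hypothetical names, not WPAI's code).
# Pairwise term uses the identity:
#   sum_{i<j} <v_i, v_j> x_i x_j
#     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
# which costs O(k*n) instead of O(k*n^2).

def fm_score(x, w0, w, V):
    """Score one sample.

    x  : list of (feature_index, value) pairs (sparse input)
    w0 : global bias
    w  : dict feature_index -> linear weight
    V  : dict feature_index -> latent vector (list of k floats)
    """
    linear = w0 + sum(w.get(i, 0.0) * v for i, v in x)

    k = len(next(iter(V.values())))
    interaction = 0.0
    for f in range(k):
        s = sum(V[i][f] * v for i, v in x if i in V)        # sum_i v_if x_i
        s_sq = sum((V[i][f] * v) ** 2 for i, v in x if i in V)
        interaction += 0.5 * (s * s - s_sq)
    return linear + interaction
```

In a parameter-server setup, workers would pull only the `w` and `V` entries for the features present in their mini-batch, which is what keeps training feasible at high dimensionality.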
Deep Learning Platform
Built on Kubernetes and Docker, the platform runs on GPU (P40, T4, 2080 Ti) and CPU machines, offering experiment environments, multi-node training, and inference services for TensorFlow, PyTorch, Caffe, and PaddlePaddle. It also hosts the WubaNLP platform for text classification, matching, and sequence labeling, with models such as TextCNN, RNN, Transformer, BERT, RoBERTa, ALBERT, and a custom lightweight SPTM model.
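Of the listed models, TextCNN is the simplest to show end to end. The dependency-free sketch below runs one forward pass: embed tokens, convolve over the token dimension with ReLU, max-pool over time, then apply a linear classifier. Shapes and names are illustrative assumptions, not WubaNLP's implementation.

```python
# Minimal TextCNN forward pass (illustrative only; hypothetical names).

def text_cnn_forward(tokens, emb, filters, w_out, b_out):
    """
    tokens  : list of token ids
    emb     : embedding table, vocab_size x d
    filters : list of (width, weight) where weight is a flat list of
              length width * d; one feature map per filter
    w_out, b_out : linear classifier over the pooled feature vector
    """
    seq = [emb[t] for t in tokens]
    pooled = []
    for width, weight in filters:
        # 1-D convolution over token positions, ReLU activation
        acts = []
        for start in range(len(seq) - width + 1):
            window = [v for tok in seq[start:start + width] for v in tok]
            acts.append(max(0.0, sum(a * b for a, b in zip(window, weight))))
        pooled.append(max(acts))  # max-over-time pooling
    # Linear classifier: one logit per class
    return [sum(p * w for p, w in zip(pooled, row)) + b
            for row, b in zip(w_out, b_out)]
```

A real implementation would use multiple filter widths (e.g. 2, 3, 4) and batch the computation in TensorFlow or PyTorch; the structure is the same.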
Inference Performance Optimization
GPU inference is accelerated with TensorRT (via TF-TRT, ONNX conversion, or custom C++/Python APIs), achieving up to a 3.2× speedup on ResNet-50 and a 60% QPS increase on OCR models. CPU inference is optimized with Intel MKL-DNN and OpenVINO, reducing latency and enabling migration of some models from GPU to CPU.
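A back-of-envelope helper clarifies how a kernel-level speedup like the ones above translates into serving latency and QPS. The function and its numbers are illustrative assumptions (the source does not give absolute latencies), and it assumes a compute-bound service.

```python
# Hypothetical back-of-envelope helper, not part of WPAI.

def speedup_effect(base_latency_ms, speedup, concurrency=1):
    """Estimate new per-request latency and throughput after an
    inference speedup (e.g. from a TensorRT conversion).
    Assumes the service is compute-bound, so latency scales 1/speedup.
    """
    new_latency = base_latency_ms / speedup
    # Throughput = in-flight requests / per-request latency (Little's law)
    qps = concurrency / (new_latency / 1000.0)
    return new_latency, qps
```

For example, if a model took 16 ms per request, a 3.2× speedup would bring it to 5 ms, i.e. 200 single-stream QPS; higher concurrency scales throughput proportionally until the GPU saturates.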
GPU Resource Scheduling Optimization
To improve GPU utilization, the platform mixes low-traffic models onto shared machines and integrates GPU virtualization (GPU Manager) to slice physical GPUs into virtual units, achieving a 40% reduction in GPU usage and a 150% increase in utilization.
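The utilization gain comes from packing fractional GPU requests onto shared cards instead of dedicating one card per model. The sketch below shows a first-fit packing of virtual-GPU requests; it is a simplified illustration with hypothetical names, not GPU Manager's actual scheduler.

```python
# Illustrative first-fit packing of fractional GPU requests
# (hypothetical; GPU Manager's real scheduler is more involved).

def pack_vgpu(requests, gpu_capacity=1.0):
    """Pack fractional GPU requests onto physical cards, first fit.

    requests : list of (model_name, fraction), 0 < fraction <= 1
    Returns a list of cards, each a list of (model_name, fraction).
    """
    gpus = []  # each entry: [remaining_capacity, assignments]
    for name, frac in requests:
        for gpu in gpus:
            if gpu[0] + 1e-9 >= frac:   # fits on an existing card
                gpu[0] -= frac
                gpu[1].append((name, frac))
                break
        else:                           # no card fits: allocate a new one
            gpus.append([gpu_capacity - frac, [(name, frac)]])
    return [g[1] for g in gpus]
```

Four models requesting half or a quarter of a card each would occupy two physical GPUs instead of four, which is the kind of consolidation behind the reported 40% reduction in GPU usage.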
Summary
WPAI now supports over 4,000 offline training models and 600+ online models, handling more than 41 billion daily inference requests. Ongoing work focuses on incorporating cutting-edge technologies to further improve efficiency for 58.com's business lines.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.