Tag: offline inference


ByteDance Cloud Native
Jun 13, 2023 · Artificial Intelligence

How Ray and Cloud‑Native Tech Supercharge Large‑Model Offline Inference

This article explains the challenges of large‑model offline (batch) inference, such as GPU memory limits and distributed scheduling, and shows how Ray’s cloud‑native architecture, model partitioning, and Ray Datasets can be used to build efficient, elastic inference frameworks deployed with KubeRay.

Cloud Native · GPU memory · Ray
0 likes · 18 min read
Ctrip Technology
Jan 5, 2017 · Artificial Intelligence

Practical Approaches to Deploying Machine Learning Models: PMML, Rserve, and Spark in Production

This article shares practical engineering experience with deploying machine learning models in production. It covers three typical scenarios—real‑time prediction on small data, real‑time prediction on large data, and offline batch prediction—and details how PMML, Rserve, Spark, shell scripts, and related tools can be combined to meet performance and operational requirements in each.

Model Deployment · PMML · Rserve
0 likes · 12 min read