Tag: mobile inference


JD Retail Technology
Feb 28, 2024 · Artificial Intelligence

Edge AI at JD Retail: Architecture, Challenges, and Business Practices

This article details JD Retail's edge AI (on-device intelligence) platform: its definition, performance and security challenges, three-layer cloud-edge-device architecture, key components such as the high-performance inference engine, data pipeline, and Python VM container, and real-world applications in traffic distribution and image recognition.

AI architecture · JD Retail · On-device Intelligence
15 min read
Kuaishou Tech
Oct 21, 2022 · Artificial Intelligence

Real-time Short Video Recommendation on Mobile Devices: System Design, Model Architecture, and Experimental Evaluation

The paper presents a lightweight on‑device re‑ranking system for short‑video recommendation that leverages real‑time user feedback and context‑aware generative ranking, detailing its architecture, feature engineering, beam‑search optimization, and both offline and online experimental results showing significant performance gains.

Feature Engineering · beam search · context-aware
12 min read
DaTaobao Tech
Jul 15, 2022 · Artificial Intelligence

Edge AI Model Evaluation and Optimization with TensorFlow, JAX, and TVM

The article demonstrates how to evaluate, compress, and convert deep-learning models for edge devices using TensorFlow, JAX, and TVM, covering an iPhone-based MNIST training benchmark, FLOPs-measurement scripts, TFLite/ONNX/Core ML conversion, and TVM compilation with auto-tuning, with speed improvements of up to 50% on mobile NPU hardware.

JAX · TVM · TensorFlow
29 min read
Liulishuo Tech Team
Sep 3, 2016 · Artificial Intelligence

Optimizing Deep Neural Network Inference for Offline Speech Evaluation on Mobile Devices

This article describes how Liulishuo's English-fluency app runs deep neural network (DNN) models for real-time speech scoring on smartphones, detailing offline inference challenges, BLAS-based matrix-vector optimizations, sparsity exploitation, cache-friendly implementations, fixed-point arithmetic and NEON acceleration, and model-compression techniques that balance accuracy and latency.

BLAS · DNN optimization · deep learning
11 min read