Tag

Inference Engine

DataFunSummit
Dec 24, 2024 · Artificial Intelligence

Considerations and Practices for Domesticating Large‑Model Inference Engines

This article examines the importance of domestic large‑model inference engines, compares Chinese and international chips, evaluates four architectural approaches, discusses practical challenges such as performance loss and model support, and outlines future expectations for high‑performance, heterogeneous‑chip inference solutions.

AI infrastructure · Domestic Chip · Inference Engine
9 min read

DataFunSummit
Sep 11, 2023 · Artificial Intelligence

Challenges and Insights for Deploying Large Models on Edge with MNN

The talk presents an overview of the MNN inference engine, outlines the end‑to‑end workflow for deploying large language models on mobile devices, discusses technical challenges and practical solutions, and concludes with future directions for edge AI deployment.

AI · Inference Engine · Large Models
2 min read

OPPO Kernel Craftsman
Oct 28, 2022 · Artificial Intelligence

ShaderNN: A GPU Shader‑Based Lightweight Inference Engine for Mobile AI Applications

ShaderNN is an open‑source GPU‑shader inference engine under 2 MB in size. It runs TensorFlow, PyTorch and ONNX models directly on mobile graphics textures via OpenGL fragment and compute shaders, delivering real‑time, low‑power AI for image‑heavy tasks without third‑party dependencies, with reported speed gains of up to 90 %.

GPU · Inference Engine · Performance
11 min read

ByteDance Terminal Technology
Jul 29, 2022 · Artificial Intelligence

Pitaya: ByteDance’s End‑Side AI Engineering Platform Overview

Pitaya, built by ByteDance’s Client AI and MLX teams, is a comprehensive end‑side AI engineering platform that provides a full workflow from model development and data preparation to deployment, monitoring, and federated learning, supporting large‑scale commercial scenarios across multiple apps.

AI Platform · Federated Learning · Inference Engine
14 min read

DataFunTalk
Apr 14, 2022 · Artificial Intelligence

PaddlePaddle Deep Learning Platform: Architecture, Core Technologies, and Real‑World Applications

The article gives an overview of Baidu's open‑source deep learning platform PaddlePaddle. It details the full‑stack architecture and the core technologies (unified dynamic‑static graph, large‑scale distributed training, multi‑platform inference, an extensive model zoo, and hardware adaptation), and presents a real‑world deployment case in power‑grid monitoring.

AI Framework · Inference Engine · PaddlePaddle
15 min read