Tagged articles

14 articles

May 25, 2026 · Artificial Intelligence

EdgeRazor Delivers 15× Faster Decoding on PC & Mobile, Solving Low-Bit Collapse

EdgeRazor, an open‑source framework from Nanjing University and Microsoft AI, uses mixed‑precision quantization‑aware distillation to compress large language models to as low as 1.58 bits, achieving up to 15× faster decoding on PC and mobile, 10× fewer training tokens, and a 7× reduction in model size while preserving benchmark performance.

LLM Quantization · Model Compression · edge deployment

0 likes · 12 min read

DataFunTalk

May 13, 2026 · Artificial Intelligence

How a 0.5 MB AI Model Tackles Global Supply‑Chain Challenges: Li‑Net in Action

Li‑Net, a 0.5 MB multi‑channel time‑series model co‑developed by SF Technology and Chinese universities, achieves state‑of‑the‑art accuracy with linear‑complexity attention, runs on edge devices, and has been deployed across SF's global supply chain for demand forecasting, inventory optimization, and capacity planning, delivering measurable cost reductions.

AI · Li-Net · Supply Chain

0 likes · 4 min read

DataFunSummit

May 12, 2026 · Artificial Intelligence

How a 0.5 MB AI Model Tackles Global Supply‑Chain Challenges: Li‑Net Technology and Applications

The article presents Li‑Net, a lightweight 0.5 MB time‑series model co‑developed by SF Technology and universities and accepted at ICDE 2026. It addresses multi‑channel, non‑stationary, multimodal forecasting, achieves state‑of‑the‑art accuracy with low latency, and is deployed across SF’s global logistics operations to improve demand, inventory, and capacity planning while cutting costs.

Li-Net · Supply Chain · edge deployment

0 likes · 4 min read

DaTaobao Tech

Apr 22, 2026 · Artificial Intelligence

How MNN‑Sana‑Edit‑V2 Brings Comic‑Style Image Editing to Your Phone in 15 Seconds

MNN‑Sana‑Edit‑V2, a collaborative effort between Taobao’s Meta team and Hangzhou University, combines a frozen Qwen3‑0.6B LLM, Learnable Query, Connector, Linear DiT and Deep Compression Autoencoder with 4/8‑bit quantization to run fully on mobile devices, delivering 512×512 comic‑style conversions in about 15 seconds—2.5× faster than cloud alternatives—while providing open‑source code, detailed training stages, and extensive performance benchmarks.

Model Quantization · diffusion · edge deployment

0 likes · 13 min read

ZhiKe AI

Apr 15, 2026 · Artificial Intelligence

From Sci‑Fi to Reality: How AI Large Models Are Reshaping Our World

The article explains what AI is, traces its three historical waves—from rule‑based expert systems to statistical learning and deep learning—focuses on the current large‑language‑model era, surveys leading domestic and overseas models, and highlights key trends such as open‑source competition, reasoning capabilities, multimodality, and edge deployment.

AI · Multimodal · Open Source

0 likes · 4 min read

Machine Heart

Apr 3, 2026 · Artificial Intelligence

How Foundation Models Are Transforming Embodied Navigation from Task‑Specific to General Intelligence

This survey systematically reviews how foundation models reshape embodied navigation, covering problem definition, taxonomy of tasks and robot forms, system architecture from perception to control, data sources and training strategies, edge deployment techniques, benchmark metrics, and future research directions.

benchmark · data collection · edge deployment

0 likes · 11 min read

Sohu Tech Products

Mar 19, 2026 · Frontend Development

How Void Turns Vite Projects into One‑Click Cloud Deployments

Void is a Vite‑native deployment platform that lets developers enable a plugin, run a single command, and automatically provision Cloudflare Edge, databases, storage, authentication, queues, and AI services, bridging the gap between local development and production with end‑to‑end type safety.

CLI · Cloudflare · Full‑stack

0 likes · 9 min read

AI Engineering

Jan 14, 2026 · Artificial Intelligence

NovaSR: A 52KB Audio Super-Resolution Model that Upscales 16kHz Audio to Clear 48kHz

NovaSR is a 52KB open-source audio super-resolution model that can convert blurry 16kHz recordings into clearer 48kHz output, processing up to 3600 seconds of audio per second on a single GPU, and offering fast, lightweight enhancement for TTS, dataset cleaning, and edge devices.

NovaSR · TTS enhancement · audio super-resolution

0 likes · 3 min read

AIWalker

Sep 23, 2025 · Artificial Intelligence

DIDB‑ViT Achieves SOTA Binary ViT Results, Outperforms Full‑Precision ResNet‑34 on ADE20K

The paper introduces DIDB‑ViT, a high‑fidelity differential‑information‑driven binary Vision Transformer that closes the performance gap with full‑precision models while keeping the original ViT architecture, and demonstrates state‑of‑the‑art results on image classification and ADE20K segmentation, even surpassing full‑precision ResNet‑34.

Model Compression · binary neural networks · edge deployment

0 likes · 28 min read

Java Captain

Feb 7, 2025 · Artificial Intelligence

DeepSeek: Disruptive Innovations in Large Language Model Architecture, Efficiency, and Ecosystem

DeepSeek reshapes the AI landscape by replacing brute‑force compute scaling with algorithmic breakthroughs such as a novel MoE architecture, memory compression, active‑learning data pipelines, and open‑source tooling, delivering dramatically lower training and inference costs while enabling edge deployment and a vibrant developer ecosystem.

Algorithmic Efficiency · DeepSeek · MoE

0 likes · 11 min read

Sohu Tech Products

May 21, 2024 · Artificial Intelligence

OPPO Multimodal Pretrained Model Deployment in Cloud-Edge Scenarios: Practices and Optimizations

OPPO details how it deploys multimodal pretrained models on resource‑constrained edge devices by compressing CLIP‑based image‑text retrieval, adapting Chinese text‑to‑image generation with LoRA and adapters, and lightweighting diffusion models through layer pruning and progressive distillation, achieving sub‑3‑second generation while preserving cloud‑level quality.

CLIP · LoRA · Model Compression

0 likes · 18 min read

DaTaobao Tech

Jan 5, 2024 · Mobile Development

Edge Deployment and Performance Optimization of Large Language Models with MNN

The upgraded mnn‑llm framework adds a unified llm‑export pipeline, cross‑platform inference with tokenizers and disk‑embedding, and ARM‑focused linear‑layer optimizations—including SIMD, hand‑written assembly and 4‑bit quantization—that dramatically speed up prefilling and achieve real‑time LLM conversation on mobile devices within a 2 GB memory budget, outperforming llama.cpp, fastllm and mlc‑llm.

ARM CPU · LLM · MNN

0 likes · 17 min read

DataFunSummit

Sep 11, 2023 · Artificial Intelligence

Challenges and Insights for Deploying Large Models on Edge with MNN

The talk presents an overview of the MNN inference engine, outlines the end‑to‑end workflow for deploying large language models on mobile devices, discusses technical challenges and practical solutions, and concludes with future directions for edge AI deployment.

AI · Inference Engine · Large Models

0 likes · 2 min read

Baidu Geek Talk

Mar 9, 2022 · Artificial Intelligence

Communication Tower Recognition Using PaddlePaddle: An Industrial AI Practice

The article describes an industrial AI system that uses PaddlePaddle’s PP‑PicoDet model, enhanced with COCO pre‑training and quantization, to accurately recognize communication towers in diverse outdoor conditions, achieving 94.5% mAP with a 78 ms inference time and supporting edge deployment via PaddleLite and ONNX.

PP-PicoDet · PaddlePaddle · communication tower

0 likes · 6 min read
