Tagged articles

76 articles

May 17, 2026 · Mobile Development

How Gemini Intelligence Turns Android Phones into Personal Assistants

Google's Gemini Intelligence upgrades Android from an operating system to an AI‑driven platform, enabling cross‑app automation, Chrome‑based browsing tasks, intelligent autofill, speech‑to‑text messaging, and natural‑language widget creation, while reshaping hardware strategy and developer interfaces.

AI · Android · Cross-app automation

0 likes · 6 min read

Machine Heart

May 12, 2026 · Artificial Intelligence

How DreamLite Enables Real-Time Text-to-Image Generation and Editing on Mobile Devices

DreamLite, a 0.39 B‑parameter diffusion model from ByteDance, unifies text‑to‑image generation and text‑guided editing in a single on‑device network, delivering 1024×1024 results in about three seconds on an iPhone 17 Pro while surpassing existing mobile and even many server‑side baselines.

DreamLite · Model Compression · RLHF

0 likes · 9 min read

java1234

May 12, 2026 · Artificial Intelligence

Google AI Edge Gallery Goes Open‑Source and Racks Up 22K Stars

Google AI Edge Gallery is an on‑device generative‑AI platform that lets users run and compare open‑source large language models on phones, offering features like chat, image Q&A, audio transcription, benchmark testing, and modular skills, while its open‑source nature and easy installation have quickly earned it over 22,000 GitHub stars.

AI Edge Gallery · Google AI Edge · LiteRT

0 likes · 10 min read

AI Explorer

Apr 29, 2026 · Artificial Intelligence

Tencent Open‑Sources Hy‑MT: Offline Translation for 33 Languages Beats Google Translate

Tencent’s Hy‑MT1.5‑1.8B‑1.25bit model, now open‑source, runs entirely offline on smartphones, supports 33 languages, and—according to internal tests—delivers translation quality that surpasses Google Translate’s online service, highlighting the impact of 1.25‑bit quantization on model size and performance.

1.25bit quantization · Hy-MT · Language Models

0 likes · 6 min read

DaTaobao Tech

Apr 22, 2026 · Artificial Intelligence

How MNN‑Sana‑Edit‑V2 Brings Comic‑Style Image Editing to Your Phone in 15 Seconds

MNN‑Sana‑Edit‑V2, a collaborative effort between Taobao’s Meta team and Hangzhou University, combines a frozen Qwen3‑0.6B LLM, Learnable Query, Connector, Linear DiT and Deep Compression Autoencoder with 4/8‑bit quantization to run fully on mobile devices, delivering 512×512 comic‑style conversions in about 15 seconds—2.5× faster than cloud alternatives—while providing open‑source code, detailed training stages, and extensive performance benchmarks.

Model Quantization · diffusion · edge deployment

0 likes · 13 min read

Geek Labs

Apr 11, 2026 · Mobile Development

PhoneClaw and PokeClaw: Turning Your Phone into a Private AI Agent

PhoneClaw and PokeClaw are open‑source, on‑device AI agents for iOS and Android that run Gemma 4 locally, offering offline privacy, zero‑cost operation, and native tool calling through iOS APIs or Android Accessibility services.

Android · Gemma · PhoneClaw

0 likes · 11 min read

AI Explorer

Apr 10, 2026 · Artificial Intelligence

Google AI Edge Gallery: Offline Mobile AI Model Playground

Google’s open‑source AI Edge Gallery lets Android and iOS devices run large language models such as Gemma 4 entirely offline, eliminating network latency and privacy concerns; the app showcases six modular AI features, offers a simple install path, and signals Google’s push toward a standardized edge‑AI ecosystem.

Edge AI · Gemma 4 · Google AI Edge Gallery

0 likes · 8 min read

Xiaomi Tech

Apr 10, 2026 · Artificial Intelligence

Xiaomi AI’s 8× Faster Mobile Inference and OCR‑Free 80‑Page Document Understanding at ACL 2026

Xiaomi’s AI team announced seven ACL 2026 papers that span low‑bit KV‑cache quantization for 8.3× faster LLM inference, OCR‑free multi‑page document VQA, a new attention‑basin analysis, non‑autoregressive spoken dialogue generation, a comprehensive mobile‑agent benchmark, a success‑rate‑aware training policy, and a progressive universal information‑extraction framework.

Inference Optimization · benchmark · dialogue generation

0 likes · 12 min read

AI Explorer

Apr 8, 2026 · Artificial Intelligence

Exploring Google AI Edge Gallery: Running Large Models Locally on Your Phone

Google’s AI Edge Gallery lets you run cutting‑edge large language models such as Gemma 4 entirely offline on Android or iOS devices, offering absolute privacy, zero‑latency responses, and a modular platform with agent skills, thinking mode, multimodal input, and a prompt‑lab for on‑device AI experimentation.

Gemma · Google AI Edge Gallery · Kotlin

0 likes · 6 min read

vivo Internet Technology

Mar 18, 2026 · Artificial Intelligence

How Ada-RefSR Eliminates Hallucinations in Single‑Step Diffusion Super‑Resolution

This article presents Ada-RefSR, a novel single‑step diffusion‑based reference super‑resolution framework that introduces a "Trust but Verify" paradigm, adaptive implicit correlation gating, and a lightweight architecture to robustly suppress hallucinations and achieve state‑of‑the‑art performance on multiple benchmarks, while remaining suitable for mobile deployment.

Ada-RefSR · ICLR2026 · diffusion

0 likes · 10 min read

AI Engineering

Jan 21, 2026 · Artificial Intelligence

Running Large Language Models on Phones: Liquid AI’s LFM2.5‑1.2B‑Thinking Fits in 900 MB

Liquid AI’s LFM2.5‑1.2B‑Thinking model runs entirely on a smartphone with only 900 MB of memory, scores 88 on MATH‑500, 69 on Multi‑IF, and 57 on BFCLv3 benchmarks, outperforms larger rivals, and achieves real‑time speeds on Snapdragon 8 Elite and AMD Ryzen 9 3950X, signaling a shift toward edge AI.

LFM2.5 · Large Language Model · Ryzen

0 likes · 4 min read

AI Engineering

Jan 10, 2026 · Artificial Intelligence

Building Agent-Native Apps After Claude Code: Principles and Practices

The article outlines the Agent‑native architecture introduced by Claude Code, detailing five core principles, file‑as‑interface design, mobile considerations, dynamic capability discovery, implementation patterns, product implications, and anti‑patterns, providing a concrete roadmap for building self‑improving AI agents.

AI Agents · Agent-native architecture · Claude Code

0 likes · 18 min read

HyperAI Super Neural

Jan 9, 2026 · Artificial Intelligence

How HY-MT1.5 Achieves 1 GB Mobile Translation with a 1.8B Model

The article explains how Tencent's open‑source HY‑MT1.5 tackles the high‑cost, large‑parameter barrier of neural machine translation by offering a 1.8 B‑parameter model that runs on roughly 1 GB of RAM, processes 50 tokens in 0.18 s, supports 33 languages, and uses on‑policy distillation to retain top‑tier accuracy, while providing a step‑by‑step online demo and free compute credits for new users.

HY-MT1.5 · Tencent · large language models

0 likes · 5 min read

DataFunSummit

Dec 23, 2025 · Artificial Intelligence

What Core Capabilities Do Mature GUI Agents Need? Expert Insights from the Agentic AI Summit

In a live discussion hosted by Prof. Yang Jian with experts Zhang Xi and Cui Chen, the panel explores the essential abilities of mature GUI agents, the role of multimodal models in visual understanding, the transfer of code‑agent techniques to GUI tasks, edge‑device performance trade‑offs, complex planning, tool ecosystems, deployment challenges, and future breakthrough scenarios.

Code Agent · GUI Agent · Model Compression

0 likes · 22 min read

PaperAgent

Dec 10, 2025 · Artificial Intelligence

How AI Agents Like UFO, Mobile-Agent, and UI-TARS Are Shaping 2025 Smartphones

The article examines the underlying GUI‑Agent technologies behind the 2025 “Doubao” smartphone, comparing Microsoft’s UFO series, Alibaba’s Mobile‑Agent v2/v3, and ByteDance’s UI‑TARS, detailing their model foundations, input modalities, action spaces, planning mechanisms, learning strategies, open‑source status, and multi‑agent frameworks.

AI Agents · GUI automation · Open Source

0 likes · 8 min read

PMTalk Product Manager Community

Dec 4, 2025 · Artificial Intelligence

Is Doubao’s AI Phone the Future iPhone?

The article evaluates Doubao’s AI phone by testing everyday scenarios, highlighting its screen‑recognition‑driven automation, high latency, privacy risks, and comparing its performance and usability against Honor’s YOYO assistant and other AI‑enabled smartphones.

AI assistant · Performance · Product Comparison

0 likes · 9 min read

DataFunSummit

Oct 31, 2025 · Artificial Intelligence

How OPPO’s AndesVL Is Revolutionizing On‑Device Multimodal AI

OPPO AI Center introduces AndesVL, an open‑source, fully‑adapted multimodal large model ranging from 0.6B to 4B parameters, designed for high‑performance, privacy‑preserving, low‑latency AI on mobile devices, with advanced architecture, training pipelines, on‑device optimizations, and state‑of‑the‑art benchmark results.

Large Language Model · Model Compression · mobile AI

0 likes · 21 min read

Sohu Smart Platform Tech Team

Aug 9, 2025 · Artificial Intelligence

Deploying Large Language Models Offline on Mobile Devices: A Practical Guide

This article explains the challenges of running large language models on mobile devices, reviews recent industry efforts, and provides a step‑by‑step guide—including code snippets—for integrating a distilled GPT‑2 model with Sohu's Hybrid AI Engine using TensorFlow Lite and Keras‑NLP for on‑device inference.

Hybrid AI · Keras · LLM

0 likes · 10 min read

DaTaobao Tech

Jul 7, 2025 · Artificial Intelligence

How Alibaba’s TaoAvatar Brings Real‑Time 3D Digital Humans to Your Phone

TaoAvatar, Alibaba’s new 3D digital‑human platform, enables lifelike, real‑time avatars on mobile and XR devices by combining 3D Gaussian splatting, on‑device AI dialogue, and a lightweight MNN inference engine, and the full source code is now open‑sourced as MNN‑TaoAvatar.

3D digital human · MNN inference · Open Source

0 likes · 15 min read

DataFunTalk

Jul 3, 2025 · Artificial Intelligence

How Vivo’s Blue Heart XiaoV Leverages LLMs to Transform Conversational Recommendations

In an interview with Vivo AI engineer Liang Tianan, the article explores the challenges of post‑Q&A recommendation, the integration of large language models into recall, ranking and evaluation pipelines, and the engineering trade‑offs required to deliver high‑quality, diverse suggestions on mobile devices.

LLM · Model Compression · Multimodal

0 likes · 15 min read

AIWalker

Jan 18, 2025 · Artificial Intelligence

SnapGen Generates 1024px Images in 1.4 s with Lightweight On‑Device Architecture

SnapGen is a 379 M‑parameter text‑to‑image diffusion model that produces 1024 px images on mobile devices in about 1.4 seconds, using a compact U‑Net design, multi‑stage knowledge distillation, step distillation, and optimized training tricks to outperform much larger models on standard benchmarks.

Knowledge Distillation · Model Compression · SnapGen

0 likes · 22 min read

AIWalker

Jan 12, 2025 · Artificial Intelligence

SnapGen Generates 1024px Images in 1.4 s with Lightweight On‑Device Architecture

SnapGen introduces a compact 379M‑parameter diffusion model that produces 1024‑pixel text‑to‑image results in about 1.4 seconds on a mobile device, achieving competitive FID scores and outperforming much larger models through a series of architecture refinements, advanced training tricks, and multi‑level knowledge distillation.

Knowledge Distillation · Model Compression · SnapGen

0 likes · 23 min read

AIWalker

Jan 11, 2025 · Artificial Intelligence

CAS-ViT: The Fastest, Strongest Vision Transformer for Mobile Image Classification & Detection

CAS‑ViT introduces a convolutional additive self‑attention mechanism that dramatically reduces the computational cost of Vision Transformers, achieving state‑of‑the‑art accuracy on image classification, object detection, and segmentation while being deployable on mobile devices.

Efficient Models · Self-Attention · Vision Transformer

0 likes · 19 min read

iKang Technology Team

Dec 12, 2024 · Mobile Development

How to Build AI-Powered iOS Apps with Core ML, Create ML, and Vision

This article explains how to integrate artificial‑intelligence capabilities such as image classification, speech‑to‑text, and facial‑expression analysis into iOS applications using Apple’s Core ML, Create ML, and Vision frameworks, providing step‑by‑step guidance, code samples, and future‑direction insights.

Core ML · Create ML · Machine Learning

0 likes · 16 min read

DaTaobao Tech

Nov 20, 2024 · Mobile Development

MNN-Transformer: Efficient On‑Device Large Language and Diffusion Model Deployment

MNN‑Transformer provides an end‑to‑end framework that enables large language and diffusion models to run efficiently on modern smartphones by exporting, quantizing (including dynamic int4/int8 and KV cache compression) and executing via a plugin‑engine runtime, achieving up to 35 tokens/s decoding and 2‑3× faster image generation compared with existing on‑device solutions.

LLM · MNN · diffusion

0 likes · 15 min read

21CTO

May 21, 2024 · Artificial Intelligence

How Google’s Edge AI Makes On‑Device Large Language Models a Reality

Google I/O highlighted the rise of on‑device AI, showing how new neural processors, Edge TPU, and tools like the Edge AI SDK and TensorFlow Lite enable developers to run large language models locally, reducing latency, cost, and privacy concerns while integrating with cloud resources.

AI · Edge AI · Google I/O

0 likes · 9 min read

Sohu Tech Products

Mar 6, 2024 · Mobile Development

On‑Device Deployment of Large Language Models Using Sohu’s Hybrid AI Engine and GPT‑2

The article outlines how Sohu’s Hybrid AI Engine enables on‑device deployment of a distilled GPT‑2 model by converting it to TensorFlow Lite, detailing the setup, customization with Keras, inference workflow, and core SDK calls, and argues that this approach offers fast, private, and cost‑effective AI for mobile devices despite typical LLM constraints.

GPT-2 · Hybrid AI · Keras

0 likes · 9 min read

Huolala Tech

Sep 28, 2023 · Artificial Intelligence

How Mobile AI Transforms Logistics: Real‑World Image Algorithms at Huolala

This article explores Huolala's deployment of mobile AI image algorithms for driver document verification and vehicle sticker inspection, detailing model design, lightweighting, hybrid processing, data stream handling, and on‑device deployment that boost efficiency, privacy, and real‑time performance in logistics operations.

Logistics · Model Compression · edge computing

0 likes · 13 min read

DataFunSummit

Sep 11, 2023 · Artificial Intelligence

Challenges and Insights for Deploying Large Models on Edge with MNN

The talk presents an overview of the MNN inference engine, outlines the end‑to‑end workflow for deploying large language models on mobile devices, discusses technical challenges and practical solutions, and concludes with future directions for edge AI deployment.

AI · Inference Engine · Large Models

0 likes · 2 min read

HelloTech

Aug 9, 2023 · Artificial Intelligence

Device Intelligence: Concepts, Architecture, and Applications

Device intelligence brings on-device reasoning and real-time inference to smartphones and IoT gateways, delivering low-latency, privacy-preserving, personalized services such as AR/VR enhancements and recommendation re-ranking, while confronting challenges of hardware fragmentation and model size, and complementing cloud AI through architectures like Hala’s MNN-based pipeline.

Device Intelligence · Real-time Decision · edge computing

0 likes · 10 min read

DataFunSummit

Feb 4, 2023 · Artificial Intelligence

Walle: An End‑to‑End, General‑Purpose, Scalable Edge‑Cloud Collaborative Machine Learning System

The article introduces Walle, Alibaba's four‑year‑old edge‑cloud collaborative machine‑learning platform that unifies compute containers, data pipelines, and a deployment platform to enable low‑latency, privacy‑preserving, and high‑throughput AI services across billions of mobile devices, and presents its architecture, design challenges, and evaluation results.

Cloud Computing · Machine Learning · System Architecture

0 likes · 25 min read

Alipay Experience Technology

Dec 8, 2022 · Artificial Intelligence

How xNN Revolutionizes Edge AI with Scalable Modeling and Optimization

This article explains the evolution of Ant Group's xNN edge‑AI framework, detailing its four‑layer model‑optimization space, the lightweight modeling of version 1.0, and the transition to scalable modeling in version 2.0 to better exploit fragmented device compute resources.

Edge AI · deep learning · mobile AI

0 likes · 21 min read

Xiaohe Frontend Team

Nov 28, 2022 · Artificial Intelligence

Exploring Mobile-Friendly Machine Learning Frameworks: From ncnn to PaddlePaddle

This article reflects on remote‑work life while introducing machine learning fundamentals and reviewing several mobile‑optimized AI frameworks—including ncnn, Caffe, TensorFlow, PyTorch, and PaddlePaddle—to help developers choose suitable tools for on‑device intelligence.

Machine Learning · TensorFlow · frameworks

0 likes · 7 min read

Alipay Experience Technology

Nov 28, 2022 · Artificial Intelligence

Why Edge Intelligence Is Shaping the Future of Mobile Apps

This article explains the concept of edge intelligence, its advantages over cloud‑based AI, the technical challenges of deploying AI on mobile devices, Ant Group's development timeline, core technology stack, and future directions for edge‑cloud collaboration.

Edge AI · ai-optimization · mobile AI

0 likes · 10 min read

DaTaobao Tech

Nov 18, 2022 · Artificial Intelligence

ARMv8.6 Instruction Set Optimization for MNN: Accelerating Int8 and BF16 Matrix Multiplication

The article explains how ARMv8.6’s new SMMLA and BFMMLA matrix multiply‑accumulate instructions are integrated into MNN to accelerate INT8 and BF16 matrix multiplication, delivering up to 90% speedup over ARMv8.2’s SDOT and FP16‑FMLA kernels through optimized kernels, tiling, and compatibility handling.

ARMv86 · MNN · Matrix Multiplication

0 likes · 15 min read

OPPO Kernel Craftsman

Oct 28, 2022 · Artificial Intelligence

ShaderNN: A GPU Shader‑Based Lightweight Inference Engine for Mobile AI Applications

ShaderNN is an open‑source, sub‑2 MB GPU‑shader inference engine that runs TensorFlow, PyTorch and ONNX models directly on mobile graphics textures via OpenGL fragment and compute shaders, delivering real‑time, low‑power AI for image‑heavy tasks while eliminating third‑party dependencies and achieving up to 90 % speed gains.

GPU · Inference Engine · Performance

0 likes · 11 min read

ByteDance Terminal Technology

Jul 29, 2022 · Artificial Intelligence

Pitaya: ByteDance’s End‑Side AI Engineering Platform Overview

Pitaya, built by ByteDance’s Client AI and MLX teams, is a comprehensive end‑side AI engineering platform that provides a full workflow from model development and data preparation to deployment, monitoring, and federated learning, supporting large‑scale commercial scenarios across multiple apps.

AI Platform · Edge AI · Inference Engine

0 likes · 14 min read

DaTaobao Tech

Jul 13, 2022 · Artificial Intelligence

MNN 2.0: A Unified Edge‑Cloud Deep Learning Framework Overview

MNN 2.0 transforms Alibaba’s lightweight deep‑learning engine into a unified edge‑cloud framework, delivering ultra‑small binaries, broad model‑format support, and aggressive CPU/GPU/DSP/NPU optimizations—including SIMD, Winograd, quantization, and sparse computation—while providing Python‑style APIs for preprocessing, inference, and on‑device training.

MNN · deep learning · edge computing

0 likes · 18 min read

Kuaishou Large Model

May 27, 2022 · Mobile Development

How Kuaishou Optimizes Mobile AI Effects with Dynamic Device Grading

To ensure consistent user experience across the wide range of Android and iOS devices, Kuaishou’s Y‑tech team designed a dynamic model‑grading framework that evaluates CPU, GPU, NPU, memory and other hardware metrics, then dispatches appropriately sized AI effect models and configurations in real time.

Android · Kuaishou · device optimization

0 likes · 12 min read

Kuaishou Tech

Apr 11, 2022 · Artificial Intelligence

Kuaishou's Custom Video Matting Solution: Interactive Object Segmentation for Mobile Creators

Kuaishou's audio‑video technology team presents a self‑developed custom video matting system that combines foreground, interactive, and video object segmentation to let creators extract arbitrary subjects without green screens, featuring adaptive cropping, multi‑stage training, and deployment across Android and iOS devices.

Kuaishou · computer vision · deep learning

0 likes · 15 min read

Alibaba Terminal Technology

Mar 9, 2022 · Artificial Intelligence

How Edge AI Powers Alibaba’s Local Life Services: Architecture and Real‑World Wins

This article explains how Alibaba’s local‑life platforms leverage edge‑side AI to run machine‑learning inference on users’ devices, detailing the concept, advantages, technical architecture, and concrete implementations such as user feature extraction, intelligent recommendation, and smart push, while outlining future directions.

Alibaba · Edge AI · local services

0 likes · 12 min read

Kuaishou Tech

Mar 3, 2022 · Artificial Intelligence

Optimization Techniques for Image Cropping in Kuaishou YKit AI SDK

This article details the engineering optimizations applied to the image cropping stage of Kuaishou's YKit AI SDK, covering instruction-level fixes, SIMD acceleration, I/O cache improvements, algorithmic refinements, parallel processing, and device‑tier strategies to achieve up to 4.6× speedup on mobile devices.

AI SDK · NEON · Performance Optimization

0 likes · 12 min read

21CTO

Nov 27, 2021 · Artificial Intelligence

How Huawei’s “Genius Teen” Scaled AutoML to Millions of Phones

Huawei’s 2.01‑million‑yuan‑a‑year “genius teen” Zhong Zhao leveraged AutoML to deploy high‑precision image‑pixel processing algorithms across tens of millions of Mate and P series smartphones, pioneering large‑scale commercial use of AutoML and advancing mobile visual models with dynamic convolution kernels and adversarial data augmentation.

AutoML · Huawei · computer vision

0 likes · 9 min read

ITPUB

Nov 26, 2021 · Artificial Intelligence

How Huawei’s ‘Genius Teen’ Scaled AutoML to Millions of Smartphones

Huawei unveiled the work of young researcher Zhong Zhao, who within a year applied AutoML to pixel‑level image processing on millions of Mate and P series phones, detailing the technical challenges, novel pipeline, performance gains, and his broader contributions to mobile AI research.

AutoML · Huawei · mobile AI

0 likes · 8 min read

Kuaishou Large Model

Nov 26, 2021 · Artificial Intelligence

How Kuaishou’s ‘All‑Things AR’ Turns Real Objects into Interactive 3D Characters

‘All‑Things AR’ (万物AR) is a Kuaishou Y‑tech solution that lets users capture any real‑world object with a phone, automatically segments it using a custom AI model, and renders an animated 3D avatar via a lightweight SLAM‑based pipeline, enabling low‑cost, high‑quality AR experiences.

AR · SLAM · computer vision

0 likes · 16 min read

Baidu App Technology

Nov 25, 2021 · Game Development

Building an AI-Powered Object Hunt Game with Paddle.js and PaddleClas

The article details how to create the AI‑driven “Object Hunt Battle” game by processing data, designing and training a PP‑LCNet model with PaddleClas, converting it for Paddle.js, and integrating real‑time WebGL inference on mobile devices, achieving sub‑50 ms latency and encouraging developers to explore further.

AI game development · Paddle.js · PaddleClas

0 likes · 9 min read

DeWu Technology

Sep 30, 2021 · Mobile Development

Inside DeWu’s iOS Tech Salon: Video Effects, Mobile AI, and Engineering Evolution

The DeWu iOS tech salon held on September 25, 2021 brought together internal and external experts from DeWu, Alibaba, and ByteDance to share deep technical insights on video effect rendering, the MBox mobile toolchain, mobile AI with MNN, and the evolution of DeWu's iOS engineering practices, followed by interactive Q&A and community networking.

Engineering · Mobile Development · iOS

0 likes · 8 min read

Alibaba Terminal Technology

Sep 23, 2021 · Artificial Intelligence

Real‑Time Document Corner Detection on Mobile: Heatmap‑Based Keypoint Algorithms Explained

This article reviews the end‑to‑end pipeline for real‑time document corner detection on mobile devices, breaks down the keypoint detection workflow into image processing, encoding, network modeling and decoding, compares heatmap‑based and fully‑connected approaches, introduces a differentiable DSNT decoding method with unbiased coordinate transformations, and presents experimental results and conclusions on its effectiveness and limitations.

DSNT · document-analysis · heatmap

0 likes · 15 min read

DataFunTalk

Jun 3, 2021 · Artificial Intelligence

Compression Techniques for BERT: Analysis, Quantization, Pruning, Distillation, and Structure-Preserving Methods

This article examines the internal structure of BERT and systematically presents various model‑compression strategies—including quantization, pruning, knowledge distillation, and structure‑preserving techniques—highlighting their impact on storage, computational cost, and inference speed for deployment on resource‑constrained mobile devices.

BERT · Knowledge Distillation · Model Compression

0 likes · 16 min read

Sohu Tech Products

Feb 24, 2021 · Artificial Intelligence

EdgeRec: Edge Computing in Recommendation Systems

EdgeRec explores how moving recommendation system components to the edge—leveraging real‑time user behavior, heterogeneous action modeling, on‑device reranking, mixed‑ranking, and personalized “thousand‑person‑one‑model” training—can reduce latency, improve relevance, and boost business metrics compared to traditional cloud‑centric pipelines.

Meta Learning · Recommendation Systems · edge computing

0 likes · 19 min read

DataFunTalk

Dec 9, 2020 · Artificial Intelligence

WeChat Identify: From Object Detection to Large‑Scale Image Search – Technical Overview

This article details the evolution of WeChat’s Identify product, explaining its end‑to‑end image recognition pipeline—including object detection, multi‑label classification, mobile‑side detection, large‑scale retrieval, unsupervised clustering, and system architecture—while showcasing various application scenarios such as product, plant, and landmark recognition.

computer vision · image recognition · large-scale retrieval

0 likes · 12 min read

Kuaishou Large Model

Dec 3, 2020 · Artificial Intelligence

Kuaishou Y‑Tech’s Real‑Time, High‑Precision Facial & Body Keypoint Detection Explained

Y‑Tech’s in‑house keypoint detection system powers Kuaishou’s beauty and effect filters across live streaming, video creation, and editing by leveraging lightweight deep‑learning models, extensive multi‑scenario data collection, and specialized handling of occlusion, enabling real‑time, robust facial and body landmark tracking on diverse mobile devices.

beauty filters · computer vision · deep learning

0 likes · 10 min read

JD Cloud Developers

Oct 19, 2020 · Artificial Intelligence

This Week's Top AI & Tech Innovations: Federated Learning, AI Processors, and More

This week’s tech roundup highlights JD’s new federated learning platform, Facebook’s AI-driven search for renewable-energy catalysts, ARM’s high‑performance AIPU, NVIDIA’s data‑center DPU, Chrome’s rollout of HTTP/3 with QUIC, Canonical’s take on Windows‑Linux migration, plus recent advances in stereo matching and mobile sensor action recognition.

AI hardware · Web Protocols · mobile AI

0 likes · 8 min read

Baidu App Technology

Sep 7, 2020 · Artificial Intelligence

Real-Time Mobile Super-Resolution Reconstruction in Baidu App

The article describes Baidu App's real-time mobile super-resolution, which uses a VDSR-based model with pruning and depthwise separable convolutions. Application-layer and inference-engine optimizations halve latency and memory use, enabling on-device high‑definition image and video enhancement, reducing server load, and supporting iOS and Android integration.

Real-time Processing · image enhancement · mobile AI

0 likes · 8 min read

AntTech

Jun 9, 2020 · Artificial Intelligence

Deep Learning Model Compression and Acceleration Techniques for Mobile AI

This article reviews the motivations, challenges, and a comprehensive set of algorithmic, framework, and hardware methods—including structural optimization, quantization, pruning, and knowledge distillation—to compress and accelerate deep learning models for deployment on mobile devices, highlighting benefits such as reduced server load, lower latency, improved reliability, and enhanced privacy.

Knowledge Distillation · Model Compression · mobile AI

0 likes · 17 min read

Laravel Tech Community

May 31, 2020 · Mobile Development

Deploying and Training Deep Learning Models on iOS and Android: Core ML, NNAPI, and TensorFlow Lite

This article explains how to train and deploy convolutional neural networks directly on iOS and Android devices using Core ML, NNAPI, and TensorFlow Lite, compares performance with desktop TensorFlow, and provides practical code snippets and build‑time tips for mobile AI development.

Android · Core ML · Model Deployment

0 likes · 7 min read

Baidu App Technology

May 29, 2020 · Mobile Development

How MML Simplifies Mobile AI Deployment: Architecture, Tools, and Code Walkthrough

This article explains the background of on‑device AI, introduces the Mobile Machine Learning (MML) framework and its layered architecture, details the core utilities such as model decryption and task scheduling, and provides a step‑by‑step code guide for initializing, preprocessing, inference, post‑processing, and releasing resources on mobile platforms.

Android · Edge AI · MML

0 likes · 9 min read

Tencent Music Tech Team

May 8, 2020 · Mobile Development

Mobile Machine Learning Frameworks Overview and Deployment Practices in Q Music

The article reviews four mobile‑focused machine‑learning frameworks—NCNN, TensorFlow Lite, PyTorch Mobile (Caffe2) and FeatherKit—detailing their size, speed, and resource trade‑offs, and explains Q Music’s edge‑inference pipeline, optimization strategies, and the challenges of performance variability on heterogeneous mobile devices.

FeatherKit · Machine Learning · PyTorch Mobile

0 likes · 25 min read

Programmer DD

Apr 19, 2020 · Artificial Intelligence

How Gesture Recognition Transforms Mobile Gaming with Real‑Time AI Control

This article presents a gesture‑based human‑computer interaction system that uses Paddle Lite and MobileNet to enable real‑time control of games on Android phones, tablets, and embedded boards, detailing its architecture, data preparation, model training, and on‑device inference.

Android · Human-Computer Interaction · MobileNet

0 likes · 11 min read

Tencent Cloud Developer

Mar 6, 2020 · Artificial Intelligence

WeChat "Scan" Object Detection: Mobile AI Model Design, Optimization, and Deployment

The paper presents a lightweight, anchor‑free CenterNet‑based object‑ness detector for WeChat’s Scan feature, built on a ShuffleNetV2 backbone with enlarged 5×5 depth‑wise convolutions, a streamlined detection head, and a Pyramid Interpolation Module, then quantized, ONNX‑converted and NCNN‑deployed to achieve a 436 KB model running in ~15 ms per frame on an iPhone 8 CPU.

CenterNet · ShuffleNetV2 · anchor-free

0 likes · 12 min read

58 Tech

Jan 15, 2020 · Artificial Intelligence

Mobile AI Vehicle and VIN Recognition: From TensorFlow to TensorFlow Lite Deployment on Android and iOS

This article details how the 58 Used‑Car mobile team built, trained, and optimized TensorFlow‑based object‑detection models for on‑device vehicle and VIN code recognition, covering data preparation, model conversion to TF‑Lite, performance improvements, engineering integration on Android/iOS, and real‑world deployment results.

Android · TensorFlow · TensorFlow Lite

0 likes · 14 min read

Amap Tech

Dec 20, 2019 · Artificial Intelligence

Advances in Network Positioning: Unsupervised Clustering and Supervised Hierarchical Ranking Algorithms

Gaode’s network positioning has evolved from unsupervised clustering of massive AP fingerprints and Bayesian grid ranking to a supervised two‑level hierarchical model that scores candidate grids with a neural‑network learning‑to‑rank (LTR) loss. Scenario‑specific CNN and spatio‑temporal modules improve accuracy indoors and on rail and subway lines, and the team is now looking toward image‑based, 5G, and IoT positioning.

fingerprint localization · geolocation · mobile AI

0 likes · 12 min read

Alibaba Cloud Developer

Dec 20, 2019 · Artificial Intelligence

How AI-Powered Hand Gesture Detection Drove a Double‑11 Celebrity Rock‑Paper‑Scissors Game

This article details how Alibaba leveraged AI-driven hand‑gesture detection and a lightweight SSD‑based object detection model to create an interactive rock‑paper‑scissors game for Double‑11, addressing challenges of undefined gestures, real‑time mobile performance, and data collection, and achieving over 16 million page views and high accuracy.

SSD · feature pyramid network · hand gesture recognition

0 likes · 22 min read

MaGe Linux Operations

Nov 20, 2019 · Artificial Intelligence

How North Korea Built a Homegrown AI Facial‑Recognition Smartphone

North Korea’s newly unveiled “Blue Sky” smartphone incorporates a homegrown AI facial‑recognition system built on CNNs, MTCNN, MobileFaceNets and TensorFlow, showcasing how the isolated nation is advancing edge AI despite operating solely on its internal CentOS‑based intranet.

AI · TensorFlow · deep learning

0 likes · 7 min read

Architecture Digest

Aug 23, 2019 · Artificial Intelligence

Intelligent Publishing Solution for Xianyu C2C Product Structuring

This article presents Xianyu's intelligent publishing solution that leverages real‑time mobile AI to automatically associate user‑uploaded items with existing catalog entries, balancing low user cost, high accuracy, and system performance through a multi‑layer architecture and flexible pipeline design.

AI · architecture · mobile AI

0 likes · 8 min read

Xianyu Technology

Aug 13, 2019 · Artificial Intelligence

Intelligent Publishing Solution for Xianyu C2C Platform

Intelligent publishing for Xianyu’s C2C platform uses on‑device AI to automatically match user‑posted items with the Taobao/Tmall catalog, guiding real‑time multi‑frame capture, reducing manual tagging, boosting matching accuracy by about 20%, and preparing a phased rollout for video, image, and activity posts.

AI · System Architecture · mobile AI

0 likes · 9 min read

Alibaba Cloud Developer

Jul 2, 2019 · Artificial Intelligence

How MNN Powers Mobile AI: Inside Alibaba’s Open‑Source Inference Engine

Alibaba’s MNN (Mobile Neural Network) engine, now open‑sourced on GitHub, showcases how a lightweight, end‑side deep‑learning inference framework tackles fragmentation, optimizes model conversion, scheduling, and execution across diverse devices, delivering significant performance gains for mobile and IoT AI applications.

Inference Engine · MNN · Operator fusion

0 likes · 15 min read

iQIYI Technical Product Team

May 30, 2019 · Mobile Development

SmileAR: iQIYI’s Mobile AR Solution Powered by TensorFlow Lite

SmileAR, iQIYI’s self‑developed mobile AR platform powered by TensorFlow Lite, delivers real‑time face, body and gesture recognition across iQIYI’s apps through MobileNet‑based models, quantization‑aware training, multi‑task learning and encrypted SDKs, achieving fast, lightweight, cross‑platform AR experiences for millions of users.

AR · TensorFlow Lite · computer vision

0 likes · 10 min read

Alibaba Cloud Developer

Feb 27, 2019 · Artificial Intelligence

Inside Alibaba’s AliPlayStudio: Real-Time AI Video Interaction Techniques

This article details how Alibaba’s AliPlayStudio combines advanced computer‑vision algorithms—such as human semantic segmentation, gesture and pose detection, controllable style transfer, and face‑fusion—optimised for low‑power mobile and embedded devices, to deliver engaging real‑time video interactions across online and offline marketing scenarios.

Style Transfer · face fusion · gesture recognition

0 likes · 17 min read

Alibaba Cloud Developer

Jan 16, 2019 · Artificial Intelligence

How Alibaba’s AliPlayStudio Powers Real‑Time AI Video Interactions on Mobile

This article details the research and engineering behind Alibaba's AliPlayStudio, a video‑interactive platform that combines computer‑vision algorithms such as human parsing, gesture and pose detection, and controllable style transfer, all optimized for real‑time deployment on low‑power mobile and embedded devices.

Real-Time Interaction · gesture recognition · mobile AI

0 likes · 17 min read

Alibaba Cloud Developer

Jun 15, 2018 · Mobile Development

How Alipay’s xNN Engine Brings Deep Learning to Mobile Apps

This article explains how Alipay’s xNN deep‑learning engine tackles the challenges of deploying AI on billions of mobile devices by using aggressive model compression, a lightweight SDK, and joint algorithm‑ and instruction‑level optimizations to achieve high accuracy, tiny package size, and real‑time performance.

Alipay · Model Compression · deep learning

0 likes · 10 min read

Xianyu Technology

May 24, 2018 · Artificial Intelligence

Custom TensorFlow Lite OP Pipeline: Architecture, Server and Client Implementation

The article provides an engineering‑focused guide to creating a custom TensorFlow Lite operation pipeline, covering its definition, server‑side registration and compilation, client‑side downloading, verification, decryption and dynamic loading, and discusses current limitations and possible extensions such as compression and new tensor types.

Custom OP · Model Encryption · Server-Client

0 likes · 9 min read

Meituan Technology Team

Feb 2, 2018 · Mobile Development

WhereAreYou: Mobile AR App for Vehicle Finding Using Core ML and Multi‑CNN Detection

The WhereAreYou hackathon project demonstrates an iOS AR app that visualizes nearby ride‑hailing cars by converting multiple CNN models to Core ML, using ARKit to map GPS bearings, then switching to vision‑based YOLO‑style detection and tracking for real‑time vehicle identification and distance labeling.

ARKit · CNN · Core ML

0 likes · 16 min read

Alibaba Cloud Developer

Sep 28, 2017 · Artificial Intelligence

How Alipay’s xNN Brings Deep Learning to Millions of Mobile Devices

This article explains how Alipay’s xNN engine overcomes mobile deep‑learning challenges through aggressive model compression, lightweight SDK design, algorithm‑ and instruction‑level optimizations, enabling high‑accuracy AI inference on a wide range of Android and iOS devices with minimal app‑size impact.

Alipay · Inference Optimization · Model Compression

0 likes · 13 min read

Qunar Tech Salon

Feb 20, 2016 · Artificial Intelligence

Mobile Image Search: Algorithm Framework and Implementation at Pailitao

Mobile image search has become a critical user need. Since its 2014 launch, Alibaba’s Pailitao has evolved through multiple iterations into a robust AI-driven pipeline comprising category prediction, object detection, deep and local image feature extraction, scalable retrieval indexing, and relevance-based ranking.

deep learning · image search · mobile AI

0 likes · 6 min read

21CTO

Jan 29, 2016 · Artificial Intelligence

How Mobile Image Search Powers Real-Time Shopping: Inside Pailitao’s AI Algorithm

Mobile visual search, a long‑standing dream, has evolved from early research to a production‑grade system at Pailitao, where a five‑module AI pipeline—category prediction, object detection, feature extraction, indexing, and ranking—enables billions of images to be searched instantly on mobile devices.

computer vision · deep learning · image search

0 likes · 8 min read