MiniCPM5-1B Sets New Benchmark for Sub‑2B Models: AI‑Written Training Framework Runs ~10% Faster Than Nvidia's Megatron

The 1‑billion‑parameter MiniCPM5-1B model leads the sub‑2B tier of the Artificial Analysis (AA) leaderboard with a score of 17.9, outperforms 2‑billion‑parameter rivals, uses an AI‑generated training framework that cuts training cost by about 10%, and runs on virtually any device thanks to aggressive quantisation and open‑source tooling.

SuanNi

May 26, 2026

On May 25, MiniCPM5-1B, a 1 B‑parameter model jointly released by Mianbi Intelligence, Tsinghua University and the OpenBMB community, scored 17.9 on the Artificial Analysis (AA) leaderboard. That is the highest score of any model under 2 B parameters, ahead of Qwen3.5‑2B (16.3).

Small Size, Strong Performance

The MiniCPM family has repeatedly delivered the capability of much larger models at small parameter counts. The first MiniCPM (2 B) beat Mistral‑7B and matched Llama2‑13B; MiniCPM 3.0 (4 B) exceeded GPT‑3.5‑Turbo‑0125; MiniCPM 4.0 (8 B/0.5 B) delivered up to 220× speed‑up with the self‑developed CPM.cu inference engine; MiniCPM‑V 4.5 (8 B multimodal) outperformed 72 B models; and MiniCPM‑o 4.5 (9 B) added full‑duplex multimodal streaming. MiniCPM5‑1B continues this trend, ranking first in its class with half the parameters of its 2 B competitors. The team presents this as consistent with the “intelligence density” law, under which model capability doubles roughly every 3.5 months, a finding originally reported in a Nature paper co‑authored with Tsinghua.
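Taken at face value, a 3.5‑month doubling period can be turned into a quick back‑of‑the‑envelope projection. The sketch below reads the law as a statement about capability per parameter, so that the parameter count needed for a fixed capability halves every 3.5 months. That reading, the function name and the example inputs are illustrative assumptions, not figures from the paper.

```python
def params_for_same_capability(base_params: float, months: float,
                               doubling_months: float = 3.5) -> float:
    """Parameters needed to match a baseline model after `months`,
    assuming capability per parameter doubles every `doubling_months`."""
    return base_params / 2 ** (months / doubling_months)

# Illustrative only: a 2 B baseline, projected 3.5 and 7 months ahead.
for months in (3.5, 7.0):
    needed = params_for_same_capability(2e9, months)
    print(f"after {months} months: {needed / 1e9:.1f} B parameters")
```

Under these assumptions a 2 B model's capability would be reachable with about 1 B parameters one doubling period later.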

Data‑Centric Governance

To maximise the utility of a 1 B‑parameter model, the team built a five‑level data‑quality hierarchy (L0‑L4). Each level applies stricter cleaning, filtering and quality‑control rules than the one before, so that every training token contributes meaningfully. High‑knowledge‑density corpora were harvested from Chinese and English web text, and a high‑quality synthetic math dataset was generated. The resulting Ultra‑FineWeb‑L3 dataset is released alongside the model for community use.
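The article does not publish the actual rules behind L0 to L4. As a rough illustration only, a tiered filter can be sketched as a chain of increasingly strict checks, where a document's level is the number of consecutive checks it passes. Every rule, threshold and function name below is hypothetical and is not taken from the MiniCPM pipeline.

```python
from typing import Callable, List

# Hypothetical quality checks, ordered from loosest to strictest.
# They only illustrate how five levels (0-4) could be assigned.
def not_empty(text: str) -> bool:
    return bool(text.strip())

def long_enough(text: str) -> bool:
    return len(text.split()) >= 5

def mostly_letters(text: str) -> bool:
    letters = sum(ch.isalpha() or ch.isspace() for ch in text)
    return letters / len(text) >= 0.8

def low_repetition(text: str) -> bool:
    words = text.lower().split()
    return len(set(words)) / len(words) >= 0.6

CHECKS: List[Callable[[str], bool]] = [
    not_empty, long_enough, mostly_letters, low_repetition,
]

def quality_level(text: str) -> int:
    """Return 0-4: the number of consecutive checks the text passes.

    Checks run in order and stop at the first failure, so the later
    checks never see empty input.
    """
    level = 0
    for check in CHECKS:
        if not check(text):
            break
        level += 1
    return level

def select(corpus: List[str], min_level: int) -> List[str]:
    """Keep only documents at or above the requested quality level."""
    return [doc for doc in corpus if quality_level(doc) >= min_level]
```

In this sketch, a call such as `select(corpus, 3)` keeps only documents that pass the first three checks, analogous to publishing a single tier of a larger hierarchy.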

AI‑Generated Training Framework – ForgeTrain

ForgeTrain is described as the world’s first production‑grade large‑model training framework written entirely by AI, with no human‑written code. On Nvidia H100 GPUs it reportedly trains about 10% faster than Nvidia’s own Megatron, which translates to a roughly comparable reduction in training cost. The team presents the framework, together with the MiniCPM5‑1B base model, as evidence that AI can be used to improve AI.
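How a speed gain maps to cost can be checked with a line of arithmetic. The sketch below assumes a fixed token budget and cost proportional to GPU time, and it reads the ~10% figure as 10% higher throughput. Both assumptions and the function name are illustrative and do not come from the article.

```python
def cost_saving_from_speedup(speedup: float) -> float:
    """Fractional cost saving when throughput rises by `speedup` (0.10 = 10%).

    Assumes a fixed amount of work and cost proportional to run time.
    """
    if speedup <= -1.0:
        raise ValueError("speedup must be greater than -1")
    new_time = 1.0 / (1.0 + speedup)  # run time relative to the baseline
    return 1.0 - new_time

saving = cost_saving_from_speedup(0.10)
print(f"{saving:.1%}")  # prints 9.1%
```

Under this reading the saving is closer to 9% than to a full 10%. If the 10% instead refers to wall‑clock time, the cost saving is 10% by definition.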

Universal Device Compatibility

After INT4 quantisation, MiniCPM5‑1B’s weights shrink to about 0.5 GB, allowing it to run on virtually any device. GPU users can run FP16 for maximum throughput; CPU‑only environments can use the open‑source ArcLight inference engine (co‑developed with Tsinghua and OpenBMB) for smooth dialogue without a graphics card. The model is compatible with popular inference stacks such as SGLang, vLLM, llama.cpp, Ollama and Hugging Face Transformers, and fine‑tuning is supported by LLaMA‑Factory and ms‑swift.
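The 0.5 GB figure follows from simple arithmetic: one billion weights at 4 bits each. The sketch below assumes exactly 1e9 parameters and ignores quantisation metadata such as scales and zero‑points, so real checkpoint files will typically be somewhat larger. The function name is illustrative.

```python
def weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores quantisation scales, zero-points and any layers that are
    kept in higher precision.
    """
    return num_params * bits_per_weight / 8 / 1e9

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_size_gb(1e9, bits):.1f} GB")
# prints:
# FP16: 2.0 GB
# INT8: 1.0 GB
# INT4: 0.5 GB
```

The same estimate puts the FP16 weights at roughly 2 GB, four times the INT4 footprint.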

Real‑World Edge AI Example

An AI desk‑pet demo showcases the model’s ability to run locally on phones or laptops, offline or online, providing a personal companion without any cloud API or GPU cluster. The demo, together with deployment scripts and a “skills” package, enables one‑click installation via AI‑assisted tools.

Open‑Source Release

MiniCPM5‑1B, its training data, deployment recipes and the Ultra‑FineWeb‑L3 dataset are fully open‑source (see https://modelscope.cn/models/OpenBMB/MiniCPM5-1B, https://huggingface.co/openbmb/MiniCPM5-1B, https://github.com/OpenBMB/MiniCPM). This openness, combined with the model’s low‑parameter, high‑density design, lowers the barrier to entry for edge AI.
