Tiny‑R1‑32B‑Preview: A 5% Parameter Model Matching Deepseek‑R1‑671B Performance
On February 24, 2025, 360 and Peking University jointly unveiled Tiny-R1-32B-Preview, a medium-scale reasoning model that uses only about 5% of the parameters of the 671-billion-parameter Deepseek-R1 (32B / 671B ≈ 4.8%) yet achieves comparable performance, with leading results on math, programming, and science benchmarks.
Benchmark results show the model scores 78.1 on the AIME 2024 math test, surpassing Deepseek-R1-Distill-Llama-70B (70.0); it also beats that 70B open-source model on programming (LiveCodeBench 61.6 vs. 57.5) while staying essentially on par on science (GPQA-Diamond 65.0 vs. 65.2).
The efficiency gain is significant: with just 32B parameters, the model retains over 95% of the original R1's performance, dramatically reducing inference cost.
The technical approach follows a “divide-and-conquer, then merge” strategy: large-scale domain data are distilled from DeepSeek-R1, three vertical models (mathematics, programming, and science) are trained separately, and the Arcee team’s MergeKit tool fuses them into a single balanced multi-task model; a hedged configuration sketch follows below.
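The announcement does not include the actual merge recipe. As a purely illustrative sketch of how MergeKit can fuse several vertical models into one checkpoint, the Python snippet below writes a minimal merge configuration; the checkpoint paths, the linear merge method, and the equal weights are all assumptions, not the team's published settings.

```python
# Illustrative MergeKit-style fusion of three vertical models into one checkpoint.
# All model paths, the merge method, and the weights are assumptions; Tiny-R1's
# actual merge configuration has not been released yet.
import yaml  # pip install pyyaml

merge_config = {
    "merge_method": "linear",  # assumed; MergeKit also offers slerp, ties, dare_ties, ...
    "dtype": "bfloat16",
    "models": [
        {"model": "./math-r1-32b",    "parameters": {"weight": 1.0}},  # hypothetical math model
        {"model": "./code-r1-32b",    "parameters": {"weight": 1.0}},  # hypothetical coding model
        {"model": "./science-r1-32b", "parameters": {"weight": 1.0}},  # hypothetical science model
    ],
}

with open("merge_config.yml", "w") as f:
    yaml.safe_dump(merge_config, f, sort_keys=False)

# The merge itself runs through MergeKit's CLI:
#   mergekit-yaml merge_config.yml ./tiny-r1-merged
```

With equal weights, a linear merge simply averages the three models' parameters; in practice the weights, and the merge method itself, would be tuned so that no single domain dominates the fused model.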
The model is released open-source on Hugging Face (https://huggingface.co/qihoo360/TinyR1-32B-Preview). The full technical report, training code, and part of the datasets will be published soon, reflecting a commitment to democratizing high-efficiency reasoning models.
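For readers who want to try the released checkpoint, a minimal loading sketch with the standard Hugging Face transformers API is shown below; the model ID comes from the link above, while the precision and device settings are assumptions (check the model card for recommended usage).

```python
# Minimal sketch: load the open-source checkpoint with Hugging Face transformers.
# The model ID is from the announcement; dtype/device settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/TinyR1-32B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed precision; a 32B model needs ~64 GB in bf16
    device_map="auto",           # requires `accelerate`; shards weights across GPUs
)

prompt = "What is the sum of the first 100 positive integers?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```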
[Image: team members from 360 and Peking University, and an illustration of the model]
Model                          | Parameters | Math (AIME 2024) | Code (LiveCodeBench) | Science (GPQA-Diamond)
Deepseek-R1-Distill-Qwen-32B   | 32B        | 72.6             | 57.2                 | 62.1
Deepseek-R1-Distill-Llama-70B  | 70B        | 70.0             | 57.5                 | 65.2
Deepseek-R1                    | 671B       | 79.8             | 65.9                 | 71.5
Tiny-R1-32B-Preview            | 32B        | 78.1             | 61.6                 | 65.0