Data Thinking Notes
Mar 30, 2025 · Artificial Intelligence

How DeepSeek‑R1 and Kimi‑K1.5 Push the Boundaries of Strong Reasoning Models

This analysis by the Peking University AI Alignment team dissects the technical innovations behind DeepSeek‑R1, DeepSeek‑R1‑Zero, and Kimi‑K1.5. It covers reinforcement‑learning‑based post‑training, rule‑based rewards, GRPO optimization, scaling laws, multimodal extensions, safety challenges, and future research directions.

AI alignment · DeepSeek · Kimi
57 min read