How Qwen’s Mid‑Training with Value‑Document Guides Slashes Error Rates

Researchers at Claude applied the MSM (mid‑training) approach to Qwen models, inserting a value‑document pre‑training phase before alignment fine‑tuning. This reduced misalignment rates from 68%/54% to 5%/7% and cut the required fine‑tuning data by 40‑60×, and the approach generalized better when combined with standard alignment.

AI alignment · Large Language Models · MSM

6 min read

Machine Heart

Apr 2, 2026 · Artificial Intelligence

Dual Alignment Theory Redefines Cross-Domain Offline RL Transfer

The paper revisits cross-domain offline reinforcement learning and shows that aligning both the dynamics and the value of source data is essential for effective policy transfer. It introduces the DVDF framework, which jointly filters source samples and achieves consistent performance gains across multiple robotic control benchmarks.

DVDF · Policy Optimization · cross-domain transfer

13 min read

Alibaba Cloud Developer

Aug 28, 2019 · R&D Management

How to Spot High‑Value Technical Problems in Business and Turn Them into Results

This article guides engineers on identifying technically valuable business problems, gathering and analyzing information, linking technical solutions to business outcomes, and adopting a product‑oriented mindset to ensure engineering work drives real business impact.

Information Gathering · R&D methodology · business engineering

14 min read
