Critique Fine-Tuning (CFT): Boosting Large Language Model Reasoning with Minimal Data
The paper introduces Critique Fine-Tuning (CFT), which replaces the response imitation of standard supervised fine-tuning with critique-based learning: instead of copying reference answers, the model learns to critique candidate responses. Trained on only 50K samples, CFT achieves superior reasoning performance on mathematical benchmarks, outperforming traditional reinforcement-learning approaches that require millions of examples.
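The core shift can be sketched as a change in the training target: SFT pairs a query with a gold response, while CFT pairs a (query, candidate response) prompt with a critique of that response. The sketch below is illustrative only; the function names and data format are assumptions, not the paper's actual pipeline.

```python
def build_sft_example(query: str, gold_response: str) -> dict:
    # Standard SFT: the model learns to imitate the gold response.
    return {"input": query, "target": gold_response}

def build_cft_example(query: str, candidate: str, critique: str) -> dict:
    # CFT: the model sees the query plus a (possibly noisy) candidate
    # response, and learns to generate a critique of that candidate.
    prompt = f"{query}\n\nCandidate response:\n{candidate}\n\nCritique:"
    return {"input": prompt, "target": critique}

sft = build_sft_example("What is 2 + 2?", "4")
cft = build_cft_example(
    "What is 2 + 2?",
    "2 + 2 = 5",
    "The arithmetic is wrong: 2 + 2 equals 4, not 5.",
)
print(sft["target"])  # the response itself
print(cft["target"])  # the critique of the candidate response
```

In both cases the model is trained with the same next-token objective; only the supervision signal changes, which is what lets critique data teach reasoning with far fewer examples.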