Tagged articles
4 articles
Machine Heart
May 7, 2026 · Artificial Intelligence

OrthoReg: Simple Orthogonal Regularization to Eliminate Model Merging Conflicts

The paper introduces OrthoReg, a lightweight orthogonal regularization added during fine‑tuning that provably enforces weight orthogonality, thereby resolving conflicts in model merging and providing a theoretical explanation for the success of task arithmetic.

OrthoReg · Orthogonal Regularization · Task Arithmetic
0 likes · 12 min read
Machine Learning Algorithms & Natural Language Processing
Apr 18, 2026 · Artificial Intelligence

Model Ability Gets Squeezed Out in Multi‑Task Learning—How ESM Preserves It (CVPR 2026)

The paper shows that multi‑task models lose performance because tasks compete for the same internal subspace. It introduces Essential Subspace Merging (ESM), which separates the critical directions and applies Polarized Scaling to keep multiple abilities stable, with significantly lower degradation than traditional baselines.

ESD · ESM · essential subspace
0 likes · 16 min read
AI2ML AI to Machine Learning
Nov 3, 2025 · Artificial Intelligence

Smol Training Playbook: Secrets to Building World-Class LLMs

The article details the SmolLM3 3B‑parameter model, its architecture, dual‑mode inference, a three‑stage data‑curation strategy, rigorous ablation methods, preference optimisation (APO/DPO), model merging, and practical training‑stability tricks, offering a comprehensive guide for building high‑performing large language models.

APO · LLM training · context scaling
0 likes · 13 min read