Xiaohongshu Tech REDtech
Oct 11, 2024 · Artificial Intelligence
Harmonized Speculative Sampling (HASS): Aligning Training and Decoding for Efficient Large Language Model Inference
HASS aligns the training and decoding stages of speculative sampling in both context and objective, using harmonized objective distillation and multi-step context alignment. It achieves a 2.81–4.05× speedup, an 8%–20% improvement over EAGLE‑2, while preserving generation quality in real-world deployments at Xiaohongshu.
AI · HASS · Inference Acceleration