Focused Large Language Models are Stable Many-Shot Learners
Many-shot in-context learning can exhibit reverse scaling: accuracy degrades as demonstrations are added, because attention to the current query is dispersed across trivial demonstration tokens. FocusICL mitigates this by filtering out trivial tokens and applying hierarchical attention over batches of demonstrations, which lowers attention cost and keeps the model focused on the query, yielding average accuracy gains of about 5% across multiple LLMs and benchmarks.
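A minimal PyTorch sketch of the two mechanisms as summarized above, not the authors' implementation: the relevance scoring in `filter_trivial_tokens`, the `keep_ratio` parameter, and the mean aggregation across batches are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def filter_trivial_tokens(keys, values, query, keep_ratio=0.5):
    """Drop low-relevance demonstration tokens before attention.

    Stand-in for triviality filtering: score each demonstration token
    by its mean dot-product relevance to the query tokens and keep the
    top fraction. The scoring rule here is an assumption.
    """
    relevance = (query @ keys.transpose(-2, -1)).mean(dim=0)  # (num_demo_tokens,)
    k = max(1, int(keep_ratio * keys.size(0)))
    idx = relevance.topk(k).indices
    return keys[idx], values[idx]

def hierarchical_batch_attention(query, demo_batches, keep_ratio=0.5):
    """The query attends to each demonstration batch independently,
    then the per-batch contexts are pooled, so attention is never
    dispersed across all demonstrations at once."""
    d_k = query.size(-1)
    contexts = []
    for keys, values in demo_batches:
        keys, values = filter_trivial_tokens(keys, values, query, keep_ratio)
        scores = query @ keys.transpose(-2, -1) / d_k ** 0.5
        contexts.append(F.softmax(scores, dim=-1) @ values)
    # Pool per-batch contexts (simple mean; the paper's aggregation may differ).
    return torch.stack(contexts).mean(dim=0)

# Toy usage: 4 query tokens, 2 demonstration batches of 16 tokens, model dim 64.
q = torch.randn(4, 64)
batches = [(torch.randn(16, 64), torch.randn(16, 64)) for _ in range(2)]
out = hierarchical_batch_attention(q, batches)
print(out.shape)  # torch.Size([4, 64])
```

Because each batch is attended to separately, cost scales with the batch size rather than the full demonstration count, which is where the reduction in attention cost comes from.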