
Prefix Teach, Suffix Fade: Local Teachability Collapse in Strong-to-Weak On-Policy Distillation
Researchers have identified a failure mode in on-policy distillation: in strong-to-weak settings, dense supervision across the entire model output can degrade student performance rather than improve it. The finding challenges a foundational assumption in distillation, namely that full-sequence feedback always helps. The team proposes that learning signals should concentrate on the trajectory segments where teacher feedback remains sufficiently discriminative, a principle with direct implications for how practitioners design distillation pipelines and allocate annotation budgets. The result reframes how student training objectives are designed and could reshape best practices for training weaker models from stronger teachers.
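
To make the idea of concentrating the learning signal concrete, here is a minimal sketch of a position-weighted token-level distillation loss. It is an illustration under stated assumptions, not the authors' method: the summary does not say how discriminative segments are identified, so the fixed `cutoff` and linear `fade_len` schedule, the use of reverse KL as the per-token loss, and all function names below are hypothetical choices made for this example.

```python
import math


def token_reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """KL(student || teacher) at one token position, given two probability lists."""
    return sum(
        s * (math.log(s + eps) - math.log(t + eps))
        for s, t in zip(student_probs, teacher_probs)
        if s > 0.0
    )


def prefix_fade_weights(seq_len, cutoff, fade_len):
    """Hypothetical schedule: weight 1.0 before `cutoff`, then a linear decay
    over `fade_len` positions, then 0.0 for the rest of the sequence."""
    weights = []
    for pos in range(seq_len):
        if pos < cutoff:
            weights.append(1.0)
        elif fade_len > 0 and pos < cutoff + fade_len:
            weights.append(1.0 - (pos - cutoff + 1) / (fade_len + 1))
        else:
            weights.append(0.0)
    return weights


def weighted_distill_loss(student_seq, teacher_seq, weights):
    """Weighted mean of per-token reverse KL over one sampled trajectory."""
    per_token = [token_reverse_kl(s, t) for s, t in zip(student_seq, teacher_seq)]
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0
    return sum(w * l for w, l in zip(weights, per_token)) / total_weight


# Toy trajectory: 3 positions, vocabulary of size 2.
# Student and teacher agree on the first two tokens and disagree on the last.
student_seq = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
teacher_seq = [[0.5, 0.5], [0.9, 0.1], [0.8, 0.2]]

dense_loss = weighted_distill_loss(student_seq, teacher_seq, [1.0, 1.0, 1.0])
prefix_loss = weighted_distill_loss(
    student_seq, teacher_seq, prefix_fade_weights(3, cutoff=2, fade_len=0)
)
```

In this toy case the dense loss is driven entirely by the final position, while the prefix-weighted loss ignores it. In practice the weights would presumably be derived from some measure of how discriminative the teacher's feedback is at each position rather than from a fixed cutoff; that measure is not specified in the summary above.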
























