
Sense Representations Are Inducible Interfaces
Researchers have developed ACROS, a technique that retrofits explicit semantic structure into frozen pretrained language models without retraining. By injecting a gated residual pathway, the method enables three distinct capabilities: word-sense disambiguation competitive with traditional baselines, fine-grained lexical steering via proxy objectives, and cross-lingual transfer with minimal performance degradation. This work matters because it decouples semantic interpretability from model pretraining, suggesting that meaning decomposition can be induced as a modular interface rather than baked into architecture. For practitioners, it opens a path to add interpretability and control to existing models without the cost of retraining.58


























