
Widening the Gap: Exploiting LLM Quantization via Outlier Injection
Researchers have demonstrated the first practical attack that reliably poisons large language models during quantization, a critical deployment step that compresses models for edge devices. Unlike prior work limited to simple quantization schemes, this approach exploits vulnerabilities in advanced quantization methods through outlier injection, enabling dormant malicious behavior to activate only after users compress the model. The finding exposes a supply-chain risk in the quantization pipeline: adversaries can distribute seemingly safe full-precision models that turn hostile once optimized, threatening the security assumptions underlying efficient LLM deployment at scale.
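
To give a rough intuition for why a single outlier can matter, the sketch below uses a deliberately simple scheme: symmetric per-tensor absmax quantization with round-to-nearest. This is an assumption made for illustration only. It is not the attack described above, which targets more advanced quantization methods, and the function names and weight values are hypothetical. The point is only that one large weight stretches the quantization step, so the full-precision and quantized versions of the remaining weights drift further apart.

```python
def absmax_quantize(weights, bits=4):
    """Toy symmetric per-tensor quantizer (illustrative assumption, not the paper's method)."""
    qmax = 2 ** (bits - 1) - 1                    # largest integer code, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax   # step size is set by the largest magnitude
    codes = [round(w / scale) for w in weights]   # round-to-nearest integer codes
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [c * scale for c in codes]


# Hypothetical weights with no outlier: the step is small and the codes are distinct.
weights = [0.10, -0.20, 0.30, -0.40, 0.50]
codes_clean, step_clean = absmax_quantize(weights)

# Same weights plus one injected outlier: the step grows by a factor of 16.
codes_outlier, step_outlier = absmax_quantize(weights + [8.0])

# Worst-case difference between full-precision and dequantized values
# over the original five weights.
gap_clean = max(abs(w - d) for w, d in zip(weights, dequantize(codes_clean, step_clean)))
gap_outlier = max(abs(w - d) for w, d in zip(weights, dequantize(codes_outlier, step_outlier)))

print(codes_clean)      # [1, -3, 4, -6, 7]
print(codes_outlier)    # [0, 0, 0, 0, 0, 7]
print(gap_outlier > gap_clean)  # True
```

In this toy setting the outlier collapses every ordinary weight to the same code, so the quantized model no longer reflects the full-precision weights. Real quantizers use per-group scales, calibration data, or explicit outlier handling, so the effect there depends on the specific method.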