SIREM: Speech-Informed MRI Reconstruction with Learned Sampling
Researchers propose SIREM, a cross-modal learning framework that reconstructs real-time MRI of vocal-tract dynamics by using synchronized speech audio as a learned prior. The approach exploits the inherent correlation between acoustic output and articulatory configuration to ease the speed-resolution tradeoff imposed by undersampled k-space acquisition. The work illustrates how multimodal fusion and domain-specific inductive biases can help solve constrained inverse problems in medical imaging, with implications for clinical speech assessment and for other settings where paired sensor streams can support reconstruction under acquisition bottlenecks.
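To make the underlying inverse problem concrete, the sketch below simulates undersampled k-space acquisition on a toy image and applies a naive zero-filled reconstruction. It is not SIREM's method: the Cartesian row mask, the random (rather than learned) sampling pattern, the synthetic phantom, and the function names `undersample_kspace` and `zero_filled_recon` are all assumptions made for illustration. No audio prior is involved; the example only shows the kind of reconstruction error that a learned prior would aim to reduce.

```python
import numpy as np


def undersample_kspace(image, keep_fraction=0.25, seed=0):
    """Simulate Cartesian undersampling by keeping a random subset of k-space rows.

    Hypothetical helper: a real system would use a designed or learned mask.
    """
    rng = np.random.default_rng(seed)
    kspace = np.fft.fft2(image)  # fully sampled k-space of the toy image
    n_rows = image.shape[0]
    n_keep = max(1, int(round(keep_fraction * n_rows)))
    kept_rows = rng.choice(n_rows, size=n_keep, replace=False)
    mask = np.zeros(image.shape, dtype=bool)
    mask[kept_rows, :] = True  # each kept row is one acquired phase-encode line
    return kspace * mask, mask


def zero_filled_recon(masked_kspace):
    """Naive baseline: treat unacquired samples as zero and invert the FFT."""
    return np.abs(np.fft.ifft2(masked_kspace))


# Toy phantom: a bright rectangle on a dark background (an assumption, not MRI data).
image = np.zeros((64, 64))
image[20:44, 24:40] = 1.0

masked_kspace, mask = undersample_kspace(image, keep_fraction=0.25)
recon = zero_filled_recon(masked_kspace)

# Relative error of the undersampled reconstruction against the ground truth.
error = np.linalg.norm(recon - image) / np.linalg.norm(image)

# Sanity check: with every k-space sample kept, the image is recovered.
full_recon = zero_filled_recon(np.fft.fft2(image))

print(f"acquired fraction of k-space: {mask.mean():.2f}")
print(f"relative reconstruction error: {error:.3f}")
```

Acquiring fewer rows shortens the scan, which is what permits real-time frame rates, but the zero-filled image then shows aliasing. A prior, whether hand-crafted or learned from a correlated signal such as speech audio, is what constrains the missing samples.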























