
Multivariate Distributional Reinforcement Learning Using Sliced Divergences
Researchers have addressed a longstanding limitation in distributional reinforcement learning by extending one-dimensional divergences to multivariate settings through sliced projections: the multivariate return distribution is projected onto one-dimensional directions, and the divergence is computed on each projection. The work targets a gap where prior methods either lacked theoretical guarantees or became computationally intractable when modeling full return distributions across multiple dimensions. By proving that the distributional Bellman operator is a contraction under both uniform-slicing and max-slicing variants, the work removes a barrier to deploying richer value representations in complex control problems, particularly those requiring matrix-valued discount structures. The technique expands the toolkit for RL practitioners building systems where capturing distributional uncertainty across multiple objectives matters.
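To make the slicing idea concrete, here is a minimal sketch rather than the authors' implementation. It assumes the 1-Wasserstein distance as the one-dimensional base divergence (the summary does not say which divergences the work uses) and estimates both variants from random directions, so the "max" value is only a lower bound on a true maximum over all directions. The function name `sliced_divergence` and its parameters are hypothetical.

```python
import numpy as np


def sliced_divergence(x, y, n_projections=256, reduce="mean", seed=0):
    """Monte Carlo sliced 1-Wasserstein distance between two samples.

    x, y: arrays of shape (n, d) holding n samples of a d-dimensional
    return vector each (equal n is assumed, to keep the 1D step simple).
    reduce: "mean" averages over directions (uniform slicing);
            "max" keeps the largest slice (a sampled stand-in for max-slicing).
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]

    # Draw directions uniformly on the unit sphere by normalizing Gaussians.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)

    # Project every sample onto every direction, then sort each projection.
    px = np.sort(x @ theta.T, axis=0)  # shape (n, n_projections)
    py = np.sort(y @ theta.T, axis=0)

    # For equal-size 1D samples, W1 is the mean gap between sorted values.
    per_slice = np.abs(px - py).mean(axis=0)

    return per_slice.max() if reduce == "max" else per_slice.mean()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    returns_a = rng.normal(size=(500, 2))
    returns_b = returns_a + np.array([1.0, 0.0])  # shift the first objective
    print("uniform slicing:", sliced_divergence(returns_a, returns_b))
    print("max slicing:    ", sliced_divergence(returns_a, returns_b, reduce="max"))
```

Each slice costs only a sort, so the estimate scales with the number of projections rather than with the dimension of the return vector, which is the practical appeal of slicing over computing a multivariate divergence directly.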





















