Modelwire

Planning in entropy-regularized Markov decision processes and games


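Researchers introduce SmoothCruiser, a planning algorithm that estimates the value function in entropy-regularized MDPs and two-player games, given a generative model of the environment. It reaches a desired accuracy epsilon with a polynomial sample complexity of order O(1/epsilon^4). For non-regularized settings, by contrast, no algorithm is known to guarantee polynomial sample complexity in the worst case.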
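To make "entropy-regularized" concrete, here is a minimal sketch of the standard soft Bellman backup, in which the hard maximum over actions is replaced by a log-sum-exp with temperature `lam`. This is not SmoothCruiser itself, which plans by sampling from a generative model. It is plain value iteration on a fully known toy MDP, and the names `soft_max`, `soft_value_iteration`, `P`, `R`, `gamma`, and `lam` are illustrative assumptions, not taken from the paper.

```python
import math

def soft_max(values, lam):
    """Entropy-regularized maximum: lam * log(sum(exp(v / lam))).

    Equals max over distributions pi of sum(pi * v) + lam * entropy(pi).
    As lam -> 0 it approaches the ordinary max.
    """
    m = max(values)  # subtract the max for numerical stability
    return m + lam * math.log(sum(math.exp((v - m) / lam) for v in values))

def soft_value_iteration(P, R, gamma, lam, iters=500):
    """Iterate the soft Bellman operator on a known, finite MDP.

    P[s][a] is a list of (probability, next_state) pairs (assumed layout).
    R[s][a] is the immediate reward for action a in state s.
    """
    V = [0.0] * len(P)
    for _ in range(iters):
        V = [
            soft_max(
                [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                 for a in range(len(P[s]))],
                lam,
            )
            for s in range(len(P))
        ]
    return V

# Toy example: one state, two identical actions that loop back to it.
P = [[[(1.0, 0)], [(1.0, 0)]]]
R = [[1.0, 1.0]]
V = soft_value_iteration(P, R, gamma=0.5, lam=1.0)
```

In this toy case the fixed point satisfies V = 1 + log 2 + 0.5 V, so the regularized value exceeds the unregularized value of 2 by the entropy bonus. The log-sum-exp is a smooth function of the action values, unlike the hard max.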

Mentions: SmoothCruiser

Modelwire summarizes — we don’t republish. The full article lives on arxiv.org. If you’re a publisher and want a different summarization policy for your work, see our takedown page.
