
BAPR: Bayesian amnesic piecewise-robust reinforcement learning for non-stationary continuous control
BAPR addresses a core challenge in real-world control systems: balancing robustness against sudden environmental shifts with performance during stable periods. By combining Bayesian online change detection with ensemble reinforcement learning, the method detects regime transitions and adapts policy conservatism accordingly, avoiding both the inefficiency of globally cautious approaches and the brittleness of purely adaptive ones. The work includes formal verification in Lean 4, establishing theoretical boundaries for when the approach guarantees convergence. This matters for autonomous systems, robotics, and industrial control where undetected dynamics shifts can cause failures, yet overly defensive policies waste resources during normal operation.58


























