This paper addresses uncertainty in reinforcement learning (RL) by presenting a robust policy learning approach based on interval optimization. Traditional RL methods often depend on precise point estimates of environment dynamics and reward functions, which can lead to suboptimal or unsafe decisions under real-world ambiguity and limited data. To overcome these limitations, we model value functions, rewards, and transitions as bounded intervals, explicitly capturing both epistemic uncertainty (arising from incomplete knowledge) and aleatoric uncertainty (stemming from inherent randomness). We contribute formal mathematical frameworks for interval-based representation throughout the RL process, and we explore strategies for learning policies optimized within these interval constraints so that they remain resilient to uncertainty and variability. We further introduce benchmarking metrics designed to evaluate the effectiveness and robustness of interval-aware RL policies, providing a systematic basis for comparison with conventional approaches. To demonstrate the practical value of the methodology, we present a case study on financial credit line allocation. The results indicate that interval-aware RL improves safety and reliability in decision-making and yields better outcomes in uncertain environments. By replacing point estimates with interval models, this work advocates a shift in RL practice toward robust, uncertainty-aware policy learning suited to complex, real-world domains, with potential applications in finance, healthcare, and robotics.
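The idea of carrying intervals through the Bellman backup can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a small tabular MDP, rewards bounded by `[r_lo, r_hi]`, and transition probabilities bounded elementwise by `[p_lo, p_hi]`, and it uses a standard interval value iteration in which the lower bound is computed against the least favorable transition distribution. All names, the two-state credit example, and its numbers are hypothetical.

```python
import numpy as np

def extreme_dist(p_lo, p_hi, values):
    """Distribution inside [p_lo, p_hi] that minimizes the expected `values`."""
    p = p_lo.copy()
    slack = 1.0 - p.sum()            # probability mass still to be assigned
    for s in np.argsort(values):     # fill lowest-value successors first
        add = min(p_hi[s] - p[s], slack)
        p[s] += add
        slack -= add
    return p

def interval_value_iteration(r_lo, r_hi, p_lo, p_hi, gamma=0.9, iters=200):
    """Return (v_lo, v_hi, policy); the policy maximizes the lower bound."""
    n_states, n_actions = r_lo.shape
    v_lo = np.zeros(n_states)
    v_hi = np.zeros(n_states)
    for _ in range(iters):
        q_lo = np.empty((n_states, n_actions))
        q_hi = np.empty((n_states, n_actions))
        for s in range(n_states):
            for a in range(n_actions):
                # Pessimistic backup: adversarial transitions, lowest reward.
                p_worst = extreme_dist(p_lo[s, a], p_hi[s, a], v_lo)
                q_lo[s, a] = r_lo[s, a] + gamma * (p_worst @ v_lo)
                # Optimistic backup: favorable transitions, highest reward.
                p_best = extreme_dist(p_lo[s, a], p_hi[s, a], -v_hi)
                q_hi[s, a] = r_hi[s, a] + gamma * (p_best @ v_hi)
        v_lo, v_hi = q_lo.max(axis=1), q_hi.max(axis=1)
    return v_lo, v_hi, q_lo.argmax(axis=1)

# Hypothetical toy problem: state 0 = account in good standing,
# state 1 = default (absorbing, zero reward).
# Action 0 = hold the credit limit, action 1 = raise it.
r_lo = np.array([[1.0, 1.5], [0.0, 0.0]])
r_hi = np.array([[1.2, 3.0], [0.0, 0.0]])
p_lo = np.array([[[0.95, 0.00], [0.60, 0.10]],
                 [[0.00, 1.00], [0.00, 1.00]]])
p_hi = np.array([[[1.00, 0.05], [0.90, 0.40]],
                 [[0.00, 1.00], [0.00, 1.00]]])

v_lo, v_hi, policy = interval_value_iteration(r_lo, r_hi, p_lo, p_hi)
print("value bounds for state 0:", v_lo[0], v_hi[0])
print("robust action in state 0:", policy[0])
```

In this toy setting the policy that maximizes the lower bound holds the limit, whereas the upper bound alone would favor raising it. That gap is the kind of disagreement between point-estimate and interval-aware decisions the abstract refers to.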
Gopichand Agnihotram
Joydeep Sarkar
Magesh Kasthuri
American Journal of Computer Science and Technology
Wipro (India)
Agnihotram et al. studied this question.
www.synapsesocial.com/papers/69d895206c1944d70ce0618e — DOI: https://doi.org/10.11648/j.ajcst.20260901.15