This study presents a hybrid reinforcement learning–assisted distributionally robust optimization (RL–DRO) framework for resilient and low-carbon energy system operation under uncertainty. The proposed model integrates a multi-agent reinforcement learning structure with a Wasserstein-metric distributionally robust formulation to capture both adaptive decision-making and conservative risk management. Reinforcement learning agents, representing distributed subsystems such as renewable generators, storage units, and flexible loads, are trained to minimize a composite objective combining expected cost and risk, while the DRO layer ensures robustness against distributional ambiguity. A case study on a renewable-dominated microgrid demonstrates that the RL–DRO framework converges smoothly within 4000 training iterations, achieving a 9.7% reduction in expected cost and a 28% improvement in robustness compared with stochastic optimization. The optimal ambiguity radius balances efficiency and resilience, while renewable curtailment and storage utilization exhibit clear compensatory dynamics across uncertainty scenarios. Emission trajectories show an exponential decay from 200 to 140 tCO₂ across learning epochs, confirming the model’s ability to internalize environmental objectives. Overall, the RL–DRO architecture unifies data-driven learning and mathematical robustness, enabling distributed agents to achieve stable coordination and sustainable operation under high renewable penetration. The framework establishes a practical foundation for intelligent, risk-aware, and carbon-efficient decision-making in modern power systems.
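To make the role of the ambiguity radius concrete, the sketch below computes an upper bound on the worst-case expected cost over a type-1 Wasserstein ball around the empirical scenario distribution. It relies on a standard result: when the cost is Lipschitz in the uncertain parameter, the worst-case expectation is at most the empirical mean plus the radius times the Lipschitz constant. This is an illustration under stated assumptions, not the formulation used in the paper; the function name `worst_case_expected_cost`, its parameters, and all numbers are hypothetical.

```python
import numpy as np


def worst_case_expected_cost(costs, lipschitz, radius):
    """Upper bound on the worst-case expected cost over a Wasserstein-1 ball.

    costs     : sampled operating costs, one per uncertainty scenario (hypothetical data)
    lipschitz : assumed Lipschitz constant of the cost in the uncertain parameter
    radius    : ambiguity radius of the Wasserstein ball (must be non-negative)
    """
    if radius < 0 or lipschitz < 0:
        raise ValueError("radius and lipschitz must be non-negative")
    costs = np.asarray(costs, dtype=float)
    # Empirical mean plus a robustness penalty that grows linearly with the radius.
    return float(costs.mean() + radius * lipschitz)


# Illustrative values only; they are not taken from the case study.
scenario_costs = [100.0, 120.0, 140.0]
nominal = worst_case_expected_cost(scenario_costs, lipschitz=2.0, radius=0.0)
robust = worst_case_expected_cost(scenario_costs, lipschitz=2.0, radius=5.0)
```

With a radius of zero the bound reduces to the sample-average cost, and a larger radius adds a linear penalty. This is one simple way to see the efficiency versus resilience trade-off that the abstract attributes to the choice of ambiguity radius.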
Yongle Zheng
Shiqian Wang
Zhongfu Tan
Scientific Reports
North China Electric Power University
Economic Research Institute
Economic Research Institute
Zheng et al. studied this question.
www.synapsesocial.com/papers/69fd7ec6bfa21ec5bbf0712f — DOI: https://doi.org/10.1038/s41598-026-50532-z