Deep reinforcement learning (DRL) is increasingly used in reservoir operation, but several key challenges and limitations need further study. This paper develops a novel optimal reservoir operation model that incorporates inflow forecasts, based on DRL and the deterministic policy gradient algorithm. A multi-dimensional reward function is derived from the objective functions and constraints, and an optimal scheduling scheme is established with dynamically weighted reward functions. Observed daily flow data and 5-day inflow forecasts for the Three Gorges Reservoir (TGR) during flood seasons (10 June to 31 October) from 2010 to 2025 were used to evaluate model performance, and the model results were compared with the actual operation results. Compared with the actual operation, Scheme-1 with dynamic weights increases the annual average flood prevention storage capacity by approximately 36.8%, increases power generation by about 2.86 billion kW·h (≈5.49%), and reduces the spillway waste water volume by around 3.33 billion m³. This study demonstrates that the optimal scheduling model can substantially improve the overall efficiency of reservoir operation, and that the improvement is even more pronounced when the reward function weights are set dynamically.
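The abstract does not give the reward terms or the weighting rule, so the following is only a minimal sketch of what a dynamically weighted, multi-objective reward might look like. Every name and number in it (`StepOutcome`, `dynamic_weights`, `reward`, the 0.2/0.6 weight bounds, the 50,000 m³/s threshold, the penalty factor) is a hypothetical choice for illustration and is not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class StepOutcome:
    """Normalized results of one daily decision (all fields are assumptions)."""
    flood_storage_frac: float  # share of flood-prevention storage kept free, 0..1
    power_frac: float          # generation as a share of installed capacity, 0..1
    spill_frac: float          # spilled water as a share of total release, 0..1
    violation: float           # normalized constraint violation, 0 when feasible


def dynamic_weights(
    inflow_forecast: Sequence[float], flood_threshold: float
) -> Tuple[float, float, float]:
    """Return (w_flood, w_power, w_spill), summing to 1.

    Hypothetical rule: the larger the peak of the 5-day forecast relative to
    a flood threshold, the more weight moves to flood control.
    """
    risk = min(max(max(inflow_forecast) / flood_threshold, 0.0), 1.0)
    w_flood = 0.2 + 0.6 * risk   # illustrative bounds: 0.2 (low risk) to 0.8
    rest = 1.0 - w_flood         # remainder split 3:1 between power and spill
    return w_flood, 0.75 * rest, 0.25 * rest


def reward(
    outcome: StepOutcome,
    inflow_forecast: Sequence[float],
    flood_threshold: float = 50_000.0,  # m^3/s, assumed value
    penalty: float = 10.0,              # assumed constraint penalty factor
) -> float:
    """Weighted sum of objective terms minus a constraint penalty."""
    w_flood, w_power, w_spill = dynamic_weights(inflow_forecast, flood_threshold)
    return (
        w_flood * outcome.flood_storage_frac
        + w_power * outcome.power_frac
        - w_spill * outcome.spill_frac
        - penalty * outcome.violation
    )
```

Under these assumptions, calling `reward(StepOutcome(0.5, 0.8, 0.1, 0.0), [60_000.0] * 5)` scores the same decision with more emphasis on free flood storage than it would under a low forecast, which is the general idea behind setting the weights dynamically rather than fixing them.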
Xiang et al. studied this question.
www.synapsesocial.com/papers/69e3209340886becb653f9fa — DOI: https://doi.org/10.3390/w18080948
Xin Xiang
Shenglian Guo
Bokai Sun
Water
Wuhan University
Yangtze River Pharmaceutical Group (China)