As the global energy transition proceeds, the share of renewable energy in the power sector continues to grow. Power routers play an important role in improving energy utilization efficiency and keeping power systems stable. However, the intermittency and uncertainty of distributed energy sources make energy management in power routers difficult, and traditional optimization methods adapt poorly to these conditions. This study therefore integrates Proximal Policy Optimization (PPO) with a multi-agent framework and combines it with Generative Adversarial Imitation Learning (GAIL) based on a double-buffer mechanism. The double-buffer mechanism is used to improve data utilization efficiency and training stability and to optimize communication and collaboration among the agents, thereby achieving collaborative energy optimization for power routers. Experimental results show that after 420 training rounds the average reward per round of the improved algorithm stabilizes at about −410, and its policy loss is the first to stabilize, after 500 rounds. In practical scenarios, the proposed model keeps the DC bus voltage between 728 V and 732 V. Its electricity cost is 3846.36 yuan and its total runtime is 53.32 seconds, both lower than those of the other two models compared. Overall, the improved algorithm and model notably improve collaborative energy optimization for power routers and offer a practical approach to their energy management.
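The abstract does not state the loss function or hyperparameters used. As a rough illustration only, the standard clipped surrogate loss from the general PPO formulation (not necessarily the variant used in this study) can be sketched as below. The function name `ppo_clip_loss`, the clip range `eps=0.2`, and the sample inputs are assumptions made for this example.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Negative PPO clipped surrogate objective, averaged over samples.

    ratios     -- probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages -- advantage estimates for the same samples
    eps        -- clip range (0.2 is a common default, assumed here)
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, a in zip(ratios, advantages):
        unclipped = r * a
        clipped = clip(r, 1.0 - eps, 1.0 + eps) * a
        # Taking the minimum removes any incentive to move the policy
        # ratio outside [1 - eps, 1 + eps] in a single update.
        total += min(unclipped, clipped)

    # The objective is maximized, so the loss is its negative mean.
    return -total / len(ratios)
```

In a multi-agent setting, each agent would typically evaluate a loss of this form on its own batch of samples; how the study couples the agents and the GAIL discriminator is not described in the abstract.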
Junyan Lyu
Jing Huang
PLoS ONE
Lyu et al. studied this question.
www.synapsesocial.com/papers/69d8968f6c1944d70ce080a9 — DOI: https://doi.org/10.1371/journal.pone.0346372