In recent years, autonomous underwater vehicles (AUVs) have been increasingly employed for target surveillance and tracking. However, the limited performance and information-processing capability of a single AUV make it difficult to achieve high-precision tracking in practice. To address this challenge, this paper proposes an offline-to-online multi-agent reinforcement learning (MARL) framework that first trains offline on historical data to obtain an expert policy. The optimal policy is then obtained through online fine-tuning, which improves the training efficiency of reinforcement learning in new scenarios. To expand the surveillance range of AUV swarms, a distributed cooperative strategy based on area information entropy (AIE) is introduced. To reduce energy consumption in complex marine environments containing obstacles and vortices, ocean current and energy consumption models are introduced, together with an energy-efficiency optimization strategy. Furthermore, a long short-term memory (LSTM) network is integrated into the offline-to-online MARL framework to predict time-varying environmental states, thereby improving tracking accuracy and energy efficiency. Experimental results show that the proposed scheme outperforms the baseline schemes in terms of energy consumption, task success rate, and distance between AUVs. It also outperforms the baselines across various performance indicators when the AUV swarm is extended, demonstrating the scalability of the proposed scheme.
Renbo Li
Denghui Li
Xiangxin Zhang
Drones
Fudan University
China Aerospace Science and Industry Corporation (China)
SGIDI Engineering Consulting (China)
Li et al. studied this question.
www.synapsesocial.com/papers/69a287b00a974eb0d3c039ac — DOI: https://doi.org/10.3390/drones10030158