What question did this study set out to answer?

The aim is to improve target tracking accuracy and energy efficiency using a distributed AUV swarm.

February 28, 2026 · Open Access

Energy-Efficient Distributed AUV Swarm for Target Tracking via LSTM-Assisted Offline-to-Online Reinforcement Learning

Key Points

Developed a multi-agent reinforcement learning (MARL) framework for AUVs.
Implemented offline training on historical data to create an expert policy.
Integrated a long short-term memory (LSTM) network to predict environmental states.
Applied an energy-efficiency optimization strategy in complex marine environments.
The proposed AUV swarm scheme outperforms baseline methods on energy consumption.
Achieved higher task success rates due to enhanced tracking accuracy.
The extended (larger) AUV swarm also outperforms the baselines across performance indicators, which points to good scalability.
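
The offline-then-online idea in the points above can be illustrated with a deliberately small sketch. It is not the paper's method: it uses single-agent tabular Q-learning on a hypothetical 1-D toy environment instead of MARL with an LSTM, and every name and constant (`step`, `q_update`, `N_STATES`, `GAMMA`, `ALPHA`) is an assumption made for illustration. It shows only the two-phase structure: fit a policy on a fixed log of past transitions, then keep updating the same value table through live interaction.

```python
import numpy as np

# Hypothetical toy setup: a 1-D track with the "target" at the last cell.
N_STATES, N_ACTIONS, GOAL = 6, 2, 5   # actions: 0 = left, 1 = right
GAMMA, ALPHA = 0.9, 0.5               # discount factor and learning rate (assumed)

def step(state, action):
    """Move one cell left or right; reward 1.0 when the goal cell is reached."""
    nxt = min(max(state + (1 if action == 1 else -1), 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def q_update(q, s, a, r, s2, done):
    """Standard one-step Q-learning update, applied in place."""
    target = r + (0.0 if done else GAMMA * q[s2].max())
    q[s, a] += ALPHA * (target - q[s, a])

rng = np.random.default_rng(0)

# Phase 1 (offline): learn from logged transitions only, no environment access
# during training. The log here comes from a random behaviour policy.
dataset = []
for _ in range(200):
    s = int(rng.integers(0, GOAL))          # non-goal start states 0..4
    a = int(rng.integers(0, N_ACTIONS))
    s2, r, done = step(s, a)
    dataset.append((s, a, r, s2, done))

q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(50):                          # repeated sweeps over the fixed log
    for s, a, r, s2, done in dataset:
        q_update(q, s, a, r, s2, done)
offline_policy = q.argmax(axis=1)            # the pretrained "expert" policy

# Phase 2 (online): start from the offline Q-table and fine-tune with
# epsilon-greedy interaction in the live environment.
for _ in range(100):
    s = 0
    for _ in range(50):
        explore = rng.random() < 0.1
        a = int(rng.integers(0, N_ACTIONS)) if explore else int(q[s].argmax())
        s2, r, done = step(s, a)
        q_update(q, s, a, r, s2, done)
        s = s2
        if done:
            break
online_policy = q.argmax(axis=1)
```

Because the online phase starts from the offline table rather than from zeros, the agent already heads toward the goal in its first live episode; that warm start is the training-efficiency benefit the summary refers to.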

Abstract

In recent years, autonomous underwater vehicles (AUVs) have been increasingly employed for target surveillance and tracking. However, the limited performance and information-processing capability of a single AUV make it difficult to achieve high-precision tracking in practice. To address these challenges, this paper proposes an offline-to-online multi-agent reinforcement learning (MARL) framework that first trains offline on historical data to obtain an expert policy. The optimal policy is then generated by online fine-tuning, which improves the training efficiency of reinforcement learning in new scenarios. To expand the surveillance range of AUV swarms, a distributed cooperative strategy based on area information entropy (AIE) is introduced. To reduce energy consumption in complex marine environments containing obstacles and vortices, ocean current and energy consumption models are introduced, together with an energy-efficiency optimization strategy. Furthermore, a long short-term memory (LSTM) network is integrated into the offline-to-online MARL framework to predict time-varying environmental states, thereby improving tracking accuracy and energy efficiency. Experimental results show that the proposed scheme outperforms the baseline schemes in terms of energy consumption, task success rate, and distance between AUVs. The extended AUV swarm also outperforms the baseline schemes across various performance indicators, demonstrating that the proposed scheme performs well and scales.
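
The abstract does not define area information entropy, so the following is only one plausible reading, sketched under stated assumptions: the search area is a grid of per-cell target-presence probabilities, each cell contributes its binary Shannon entropy, and a region's score is the sum over its cells. The function names, the grid, and the region layout are all hypothetical and are not taken from the paper.

```python
import numpy as np

def cell_entropy(p):
    """Binary Shannon entropy in bits for per-cell target-presence probability p."""
    p = np.clip(p, 1e-12, 1 - 1e-12)   # avoid log2(0)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def area_information_entropy(prob_grid, regions):
    """Sum per-cell entropy inside each region.

    regions maps a region name to a (row_slice, col_slice) pair (assumed layout).
    """
    h = cell_entropy(prob_grid)
    return {name: float(h[rows, cols].sum()) for name, (rows, cols) in regions.items()}

# Hypothetical 4x4 belief map: 0.5 means "no idea whether the target is here".
prob_grid = np.full((4, 4), 0.5)
prob_grid[:, :2] = 0.01                # west half was just surveyed: almost surely empty

regions = {
    "west": (slice(None), slice(0, 2)),
    "east": (slice(None), slice(2, 4)),
}
aie = area_information_entropy(prob_grid, regions)
most_uncertain = max(aie, key=aie.get)  # region an idle AUV might be sent to next
```

Under this reading, a high score marks a region where the swarm knows least about the target, so steering AUVs toward high-entropy regions spreads coverage instead of clustering vehicles over water that has already been surveyed.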

Authors

Renbo Li

Denghui Li

Xiangxin Zhang

Journals

Drones

Institutions

Fudan University

China Aerospace Science and Industry Corporation (China)

SGIDI Engineering Consulting (China)
