Multi-robot systems (MRS) have attracted a lot of attention from researchers due to their widespread application in various environments. However, in multi-robot collaborative search tasks, two problems often arise: sparse rewards for capturing targets and control oscillations. To address these issues, this paper proposes the hierarchical active perception multi-agent deep deterministic policy gradient (HAP-MADDPG) framework. This framework guides robots to efficiently explore maps and discover targets through global utility planning based on global exploration rate and local information aggregation based on local exploration rate. A stability control mechanism, which includes hysteresis logic and reward decay, is introduced to suppress control oscillations. Experimental results show that the HAP-MADDPG framework achieves a success rate of 96.25% and an average search time of 216.3 steps. The path trajectories are smooth, demonstrating the effectiveness of the proposed approach.
Xu et al. (Tue,) studied this question.