Underwater manipulator grasping remains challenging because image blur, light attenuation, and flow-induced disturbances degrade both perception and control. These factors make target localization, contact judgment, and stable lifting difficult, especially when visual degradation and tactile fluctuation occur together. We propose AVT-TD3, an adaptive visual–tactile fusion reinforcement learning method for underwater manipulator grasping. AVT-TD3 constructs a unified policy state from visual observations, short-horizon tactile variations, and manipulator proprioception. A gated fusion module adjusts the contribution of each sensory branch, while an action modulation mechanism limits abrupt changes in velocity commands during contact establishment and lifting. We train the continuous grasping policy with Twin Delayed Deep Deterministic Policy Gradient (TD3) and evaluate it in simulation under varying turbidity, flow velocity, and target conditions, followed by controlled water-tank feasibility validation. In simulation, AVT-TD3 outperforms Deep Deterministic Policy Gradient (DDPG), Soft Actor-Critic (SAC), and standard TD3 in success rate, completion steps, slip rate, and velocity-command smoothness. In the standard test scenario, AVT-TD3 achieves a success rate of 92.7%, an average of 76 steps to completion, a slip rate of 4.1%, and an action variation magnitude of 0.20. Controlled water-tank tests further support the feasibility of deploying AVT-TD3, although open-water validation remains future work.
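To make the two mechanisms named above more concrete, the following is a minimal sketch of gated fusion and action modulation, not the implementation used in AVT-TD3. It assumes that the gate is a softmax over one logit per sensory branch (hand-specified here, where a learned gate would presumably produce them) and that action modulation can be approximated by a per-step rate limiter on velocity commands. The function names `gated_fusion` and `modulate_action` and the parameter `max_delta` are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a small list of gate logits."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(branches: Sequence[Sequence[float]],
                 gate_logits: Sequence[float]) -> List[float]:
    """Fuse equal-length branch features with softmax gate weights.

    Assumption: one scalar gate per branch (visual, tactile, proprioceptive).
    The weights are non-negative and sum to 1.
    """
    if len(branches) != len(gate_logits):
        raise ValueError("need exactly one gate logit per branch")
    dim = len(branches[0])
    if any(len(b) != dim for b in branches):
        raise ValueError("all branch features must have the same length")
    weights = softmax(gate_logits)
    return [sum(w * b[i] for w, b in zip(weights, branches)) for i in range(dim)]


def modulate_action(prev: Sequence[float], raw: Sequence[float],
                    max_delta: float) -> List[float]:
    """Rate-limit each velocity command to change by at most max_delta per step.

    Assumption: a simple clip on the command difference stands in for the
    action modulation mechanism described in the text.
    """
    return [p + max(-max_delta, min(max_delta, r - p)) for p, r in zip(prev, raw)]


# Illustrative values only; they are not taken from the experiments.
visual = [0.2, 0.9]
tactile = [0.8, 0.1]
proprio = [0.5, 0.5]

# Equal gate logits reduce the fusion to a plain average of the branches.
fused = gated_fusion([visual, tactile, proprio], gate_logits=[0.0, 0.0, 0.0])

# The first command is clipped to the step limit; the second passes unchanged.
cmd = modulate_action(prev=[0.0, 0.0], raw=[0.5, -0.02], max_delta=0.1)
```

Raising the logit of one branch shifts weight toward it, which is how a gate of this kind could down-weight the visual branch under high turbidity. The rate limiter bounds the step-to-step command change, which is one simple way to obtain smoother velocity commands during contact and lifting.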
Wan et al. (Sun,) studied this question.