Virtual coupling control of trains is a promising technology for improving railway capacity and operational efficiency. However, existing multi-agent reinforcement learning (MARL) approaches struggle to capture long-sequence temporal dependencies among train states in complex multi-train interaction scenarios, resulting in limited robustness and coordination stability. To address this issue, this paper proposes a Predictive Mamba-based Multi-Agent Soft Actor–Critic (PM-MASAC) framework. A Mamba-based state prediction module is embedded into the centralized Critic network to model historical state sequences and generate predictive state representations, thereby enhancing value estimation accuracy. In addition, a multi-agent aggregated prioritized experience replay (PER) mechanism is introduced to improve the utilization of critical cooperative samples and stabilize training. A hierarchical local–global reward structure is further designed to ensure individual tracking performance while promoting overall formation coordination. Experimental results under realistic railway operating conditions demonstrate that PM-MASAC achieves superior robustness compared with baseline MARL methods. Velocity and spacing tracking errors are maintained within 3% and 1%, respectively, and the steady-state formation success rate exceeds 95.7% in the training environment.
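The abstract names two mechanisms that a small example can make concrete: the hierarchical local–global reward and the multi-agent aggregated PER. The following is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not give the reward weights, the tolerance, the aggregation rule (mean absolute TD error here), or the exponent `alpha`. All function and class names, and all of these values, are hypothetical.

```python
import random


def local_reward(v, v_ref, gap, gap_ref, w_v=1.0, w_d=1.0):
    """Per-train tracking term: penalize relative velocity and spacing errors.

    The weights w_v and w_d are illustrative assumptions.
    """
    return -(w_v * abs(v - v_ref) / v_ref + w_d * abs(gap - gap_ref) / gap_ref)


def global_reward(gaps, gap_ref, tol=0.01):
    """Formation term: 1.0 only if every gap is within tol of its reference.

    The tolerance is an assumed threshold, not a value from the paper.
    """
    return 1.0 if all(abs(g - gap_ref) / gap_ref <= tol for g in gaps) else 0.0


def hierarchical_reward(vs, v_ref, gaps, gap_ref, beta=0.5):
    """Each train receives its own local term plus a shared global bonus."""
    bonus = beta * global_reward(gaps, gap_ref)
    return [local_reward(v, v_ref, d, gap_ref) + bonus for v, d in zip(vs, gaps)]


def aggregate_priority(td_errors, eps=1e-3, alpha=0.6):
    """Collapse per-agent TD errors into one priority for a joint transition.

    Assumption: aggregation by mean absolute TD error across agents.
    """
    mean_abs = sum(abs(e) for e in td_errors) / len(td_errors)
    return (mean_abs + eps) ** alpha


class AggregatedPER:
    """Tiny proportional replay buffer keyed on the aggregated priority."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.data = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, transition, td_errors):
        # Drop the oldest joint transition once the buffer is full.
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(aggregate_priority(td_errors))

    def sample(self, k):
        # Sample indices with probability proportional to priority.
        idx = self.rng.choices(range(len(self.data)), weights=self.priorities, k=k)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, td_errors_batch):
        # Refresh priorities after the critic recomputes TD errors.
        for i, errs in zip(idx, td_errors_batch):
            self.priorities[i] = aggregate_priority(errs)
```

Storing one priority per joint transition, rather than one per agent, keeps all agents' samples aligned in time, which a centralized critic needs. A production version would normally use a sum-tree and importance-sampling weights, both omitted here for brevity.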
Han Hu
Qingsheng Feng
Zhun Han
Electronics
Tongji University
Dalian Jiaotong University
Hu et al. studied this question.
www.synapsesocial.com/papers/69fbefef164b5133a91a40e7 — DOI: https://doi.org/10.3390/electronics15091823