Industry 4.0 is transforming how companies manufacture, improve, and distribute products, shifting production toward fast, intelligent, and flexible manufacturing and fundamentally changing enterprises’ production capabilities. In the Flexible Job Shop Scheduling Problem (FJSP), each job consists of multiple operations, and each operation can be processed on any of several eligible machines. Because of this flexibility and the resulting complexity, traditional scheduling methods struggle to meet the needs of dynamic production. Dispatching rules cannot effectively perceive the global precedence relationships among jobs or the distribution of machine workloads; metaheuristic approaches converge slowly; and existing deep reinforcement learning methods often use a single policy network to handle operation sequencing and machine assignment in a coupled manner, which tends to cause training instability and slow convergence. This paper proposes a deep reinforcement learning model that integrates Multi-Proximal Policy Optimization (MPPO) with a Dual Attention Network (DAN) to address the FJSP. The model uses DAN's operation message attention block and machine message attention block to capture, respectively, the dependency relationships between operations and the dynamic competitive relationships between machines, and to extract deep features. MPPO uses dual actor networks to handle the operation sequencing and machine assignment decisions separately, and combines them with a centralized critic to optimize the policy, which balances exploration and exploitation and improves training stability. Experiments are conducted on the SD1 and SD2 datasets. On FJSP instances of four scales, the model is compared with PPO-DAN, PPO-HGNN, traditional scheduling rules, and OR-Tools. The results show that the algorithm reduces makespan by up to 4.2% on SD1 and 10.1% on SD2, and that it outperforms traditional scheduling rules. Its overall performance is superior to that of the comparison methods, verifying its effectiveness and practical application potential for solving the FJSP.
Xu et al. (Tue,) studied this question.
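To make the two decisions in the FJSP concrete (which operation to schedule next, and on which machine), the following is a minimal sketch of how a sequence of such decisions can be decoded into a schedule and scored by makespan. It is not the paper's model or environment: the instance format, the function name `makespan`, and the toy data are assumptions introduced only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical instance format (an assumption, not the SD1/SD2 format):
# jobs[j][k] maps each eligible machine id to the processing time of
# the k-th operation of job j.
Job = List[Dict[int, int]]


def makespan(jobs: List[Job], decisions: List[Tuple[int, int]]) -> int:
    """Decode (job, machine) decisions into a schedule and return its makespan.

    Each decision schedules the next unscheduled operation of `job` on
    `machine`, starting as soon as both the job and the machine are free.
    """
    next_op = [0] * len(jobs)       # index of the next operation of each job
    job_ready = [0] * len(jobs)     # finish time of each job's last scheduled operation
    machine_ready: Dict[int, int] = {}  # finish time of each machine's last operation

    for job, machine in decisions:
        op = next_op[job]
        if op >= len(jobs[job]):
            raise ValueError(f"job {job} has no unscheduled operations left")
        if machine not in jobs[job][op]:
            raise ValueError(f"machine {machine} cannot process operation {op} of job {job}")
        start = max(job_ready[job], machine_ready.get(machine, 0))
        end = start + jobs[job][op][machine]
        job_ready[job] = end
        machine_ready[machine] = end
        next_op[job] = op + 1

    if any(done != len(job) for done, job in zip(next_op, jobs)):
        raise ValueError("decisions do not schedule every operation")
    return max(job_ready)


# Toy instance: job 0 has two operations, job 1 has one.
jobs = [
    [{0: 3, 1: 5}, {1: 2}],
    [{0: 4, 1: 2}],
]

good = makespan(jobs, [(0, 0), (1, 1), (0, 1)])  # spreads work across machines
poor = makespan(jobs, [(0, 1), (1, 0), (0, 1)])  # queues job 0 twice on machine 1
print(good, poor)
```

In this toy case the same three operations yield a makespan of 5 or 7 depending only on the machine choices, which is the kind of coupling between sequencing and assignment that a scheduling policy has to learn.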