Rising global energy demand has intensified the push for sustainability in manufacturing, one of the world’s largest energy-consuming sectors. Traditional manufacturing strategies often keep machines running continuously to maximize output and, in doing so, overlook opportunities to improve energy efficiency. In response, this research introduces a conceptual Digital Twin (DT) framework that integrates Deep Reinforcement Learning (DRL) to reduce energy consumption in stochastic manufacturing systems without compromising productivity, while also facilitating the practical deployment of DRL solutions in real-world applications. The proposed framework comprises three interconnected layers. The physical layer represents the actual manufacturing entities, including machines, buffers, and control equipment. The data layer enables bidirectional communication between the physical and digital domains, ensuring real-time synchronization and data consistency. The digital layer incorporates a discrete-event simulation environment that mirrors the physical system and captures its complexity, including uncertainties such as variable part arrival times, processing durations, machine startup delays, breakdowns, and repair times. At the core of the digital layer is a model-free DRL agent based on Proximal Policy Optimization (PPO), which determines machine on/off switching strategies. The agent is tuned extensively through grid search to develop robust control policies that adapt to dynamic system conditions without requiring an explicit model of the environment. Additionally, a rule-based action validation step is incorporated to enhance safety and interpretability, enabling human-in-the-loop oversight where necessary.
A key contribution of this work is a novel reward system, comprising a tailored reward function and a structured delivery method, that enables the agent to learn effective energy-saving strategies while preserving productivity. Applied to a real-world machining workstation, the trained agent achieves a 19.57% energy reduction while maintaining 99.6% productivity, corresponding to an approximately 12% improvement in energy efficiency and a 2% productivity gain compared to the most relevant prior study in the field.
Abadi et al. (Thu,) studied this question.
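To make the rule-based action validation step described above more concrete, the following is a minimal Python sketch of how an action proposed by the agent might be checked before it reaches the physical layer. The text does not specify the rules themselves, so every name here (`Action`, `MachineState`, `validate_action`) and every rule (no switching during repair, no switch-off while a part is in process or while the upstream buffer is full) is a hypothetical illustration, not the framework's actual logic.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Hypothetical discrete action set for a single machine."""
    KEEP = 0
    SWITCH_OFF = 1
    SWITCH_ON = 2


@dataclass
class MachineState:
    """Hypothetical snapshot of one machine and its upstream buffer."""
    is_on: bool
    is_busy: bool          # a part is currently being processed
    is_broken: bool        # machine is down and under repair
    upstream_buffer: int   # parts waiting upstream
    buffer_capacity: int   # maximum parts the upstream buffer can hold


def validate_action(state: MachineState, proposed: Action) -> tuple[Action, str]:
    """Return the action to execute and a human-readable reason.

    The rules are illustrative assumptions; a rejected proposal falls
    back to KEEP so the machine state is left unchanged.
    """
    if proposed is Action.KEEP:
        return Action.KEEP, "accepted"
    if state.is_broken:
        return Action.KEEP, "rejected: machine under repair"
    if proposed is Action.SWITCH_OFF:
        if not state.is_on:
            return Action.KEEP, "rejected: machine already off"
        if state.is_busy:
            return Action.KEEP, "rejected: part in process"
        if state.upstream_buffer >= state.buffer_capacity:
            return Action.KEEP, "rejected: upstream buffer full"
    if proposed is Action.SWITCH_ON and state.is_on:
        return Action.KEEP, "rejected: machine already on"
    return proposed, "accepted"


# Example: the agent proposes switching off an idle machine with a full buffer.
state = MachineState(is_on=True, is_busy=False, is_broken=False,
                     upstream_buffer=10, buffer_capacity=10)
action, reason = validate_action(state, Action.SWITCH_OFF)
print(action.name, "-", reason)
```

Returning a reason string alongside the action is one way such a step could support interpretability and human-in-the-loop review, since each override of the agent is logged in plain language.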