This study addresses a structural limitation of existing HVAC control frameworks by introducing a prediction-grounded, physics-engine-free formulation in which system dynamics are approximated by data-driven forecasts and pressure-independent control valves (PICVs) are treated as controllable optimization variables. The framework integrates a Gated Recurrent Unit (GRU) time-series forecaster with a Proximal Policy Optimization (PPO) agent. The GRU model predicts short-horizon HVAC energy use and indoor conditions, and these look-ahead forecasts are injected into the PPO agent's state to enable anticipatory control. Using these forecasts, the agent learns continuous policies for Air Handling Unit (AHU) fan speeds and PICV openings that minimize energy use under an explicit Predicted Mean Vote (PMV)-aware reward that keeps comfort within the target range. Experiments with real-world data from July to September demonstrate up to 11.85% energy savings relative to conventional control. PMV remained within the ANSI/ASHRAE 55 comfort band (PMV ∈ [−0.5, 0.5]) during operating periods, evidencing comfort preservation alongside energy savings. These findings highlight fine-grained, anticipatory control and consistent energy performance gains. The proposed framework is scalable to multi-zone HVAC control and provides a foundation for potential extension toward district- and city-scale energy management. These results demonstrate that prediction-grounded, physics-engine-free control provides a robust data-driven foundation for anticipatory HVAC control in real operational buildings.

• Prediction-guided DRL enables physics-engine-free HVAC control.
• Short-horizon forecasts embedded for anticipatory DRL control.
• PICV opening rates optimized jointly with AHU fan speeds.
• Up to 11.85% energy savings with PMV comfort maintained.
Baek et al. studied this question.
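The abstract describes two mechanisms that a short example may make more concrete: look-ahead forecasts appended to the agent's state, and a reward that trades energy use against PMV comfort. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the observation layout, the linear penalty form, and the `comfort_weight` value are all hypothetical; only the PMV band of [−0.5, 0.5] and the continuous fan-speed and PICV-opening actions come from the abstract. The forecast array stands in for the output of a GRU forecaster, which is not modeled here.

```python
import numpy as np

# Comfort band cited in the abstract (ANSI/ASHRAE 55).
PMV_LOW, PMV_HIGH = -0.5, 0.5


def augment_state(current_obs, forecast):
    """Append flattened look-ahead forecasts to the current observation.

    `forecast` is assumed to have shape (horizon, n_signals); in the
    described framework it would come from the GRU forecaster.
    """
    current_obs = np.asarray(current_obs, dtype=np.float32)
    forecast = np.asarray(forecast, dtype=np.float32)
    return np.concatenate([current_obs, forecast.ravel()])


def pmv_aware_reward(energy_kwh, pmv, comfort_weight=10.0):
    """Negative energy use minus a penalty for leaving the PMV band.

    The linear penalty and `comfort_weight` are illustrative assumptions;
    the abstract does not specify the reward's functional form.
    """
    violation = max(PMV_LOW - pmv, 0.0, pmv - PMV_HIGH)
    return -energy_kwh - comfort_weight * violation


def scale_action(raw_action, low=0.0, high=1.0):
    """Map a policy output in [-1, 1] to a fan-speed or valve-opening fraction."""
    raw = np.clip(np.asarray(raw_action, dtype=np.float32), -1.0, 1.0)
    return low + (raw + 1.0) * 0.5 * (high - low)


# Hypothetical observation: indoor temperature (°C), humidity (%), power (kW).
obs = [26.0, 55.0, 12.3]
# Hypothetical 3-step forecast of (energy, PMV) pairs.
forecast = [[12.0, 0.2], [12.5, 0.3], [13.1, 0.4]]

state = augment_state(obs, forecast)
print(state.shape)                      # (9,)
print(pmv_aware_reward(5.0, pmv=0.3))   # inside the band: energy term only
print(pmv_aware_reward(5.0, pmv=0.8))   # outside the band: penalty added
print(scale_action([-1.0, 0.0, 1.0]))   # AHU fan speed / PICV opening fractions
```

In this sketch the penalty is zero anywhere inside the band, so the agent is free to reduce energy use as long as predicted comfort holds. A hard constraint or a nonlinear penalty would be equally plausible readings of "PMV-aware reward".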