Real-time energy management for off-road hybrid electric vehicles (HEVs) is challenging under variable and unknown driving conditions. To address this, this paper proposes a real-time adaptive energy management strategy (EMS) based on model-based reinforcement learning (MBRL). Within the MBRL framework, a reinforcement learning (RL)-oriented environment model is first constructed, consisting of a virtual vehicle model that represents the deterministic powertrain and an online Markov chain (MC) model that captures dynamic driving conditions. Both models support continuous online updates. Second, a novel multi-agent collaborative decision-time planning (DTP) algorithm is introduced. Unlike conventional RL methods, which must learn over a complete driving cycle, it learns the optimal action for each vehicle state actually encountered, using the environment model. Its internal multi-agent collaborative mechanism combines a set of agent strategies, each trained to convergence on typical driving cycles, to ensure real-time performance. Simulation results under unknown off-road conditions show that, compared with dynamic programming (DP), the proposed strategy incurs only a 2.4% increase in fuel consumption and a 2.5% increase in state of health (SOH), while maintaining a highly similar state of charge (SOC) trajectory, with a single-step computation time of only 10.7 ms. It also significantly outperforms model-free Q-learning (MFQL) and rule-based strategies. Consistent performance across three additional unknown driving cycles confirms its robustness. Finally, the effectiveness of the proposed strategy is validated on a hardware-in-the-loop (HIL) platform.
Ma et al. studied this question.
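As an illustration of the online Markov chain (MC) model mentioned in the abstract, the following minimal sketch shows one common way such a model can be updated continuously from observed driving data. It is not the authors' implementation: the class name `OnlineMarkovChain`, the helper `discretize`, the uniform initial pseudo-counts, and the exponential forgetting factor are all assumptions made here for illustration, and the abstract does not specify how the driving condition is discretized or how old observations are weighted.

```python
class OnlineMarkovChain:
    """Hypothetical online first-order Markov chain over discretized driving states."""

    def __init__(self, n_states, forgetting=0.99):
        self.n_states = n_states
        # Assumed forgetting factor: values below 1 favor recent transitions.
        self.forgetting = forgetting
        # Uniform pseudo-counts so every row is a valid distribution from the start.
        self.counts = [[1.0] * n_states for _ in range(n_states)]

    def update(self, prev_state, next_state):
        """Fold one observed transition into the model."""
        row = self.counts[prev_state]
        # Decay old evidence for the visited row only, then count the new transition.
        for j in range(self.n_states):
            row[j] *= self.forgetting
        row[next_state] += 1.0

    def transition_probs(self, state):
        """Return the estimated distribution over next states given `state`."""
        row = self.counts[state]
        total = sum(row)
        return [c / total for c in row]


def discretize(value, lo, hi, n_states):
    """Map a continuous signal (e.g. power demand) to a bin index in [0, n_states - 1]."""
    frac = (value - lo) / (hi - lo)
    return min(n_states - 1, max(0, int(frac * n_states)))


if __name__ == "__main__":
    # Illustrative power-demand samples in kW (made-up values, not from the paper).
    demand_kw = [5.0, 30.0, 32.0, 60.0, 58.0, 31.0]
    mc = OnlineMarkovChain(n_states=4)
    states = [discretize(p, 0.0, 80.0, 4) for p in demand_kw]
    for s, s_next in zip(states[:-1], states[1:]):
        mc.update(s, s_next)
    print(mc.transition_probs(states[-1]))
```

In a decision-time planning loop, a planner could sample or enumerate next driving states from `transition_probs` at the currently encountered state and roll the vehicle model forward under each candidate action. That usage is likewise an assumption about how the two components might interact, not a description of the paper's algorithm.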