March 3, 2026 · Open Access

Reinforcement Learning with World Models for Autonomous Excavation Optimization in Wheel Loaders

Key Points

Compared to Volvo’s current rule-based driver model, the learned policy achieved 89% higher productivity, improved energy efficiency by 56%, and cut bucket-filling time by 50%.
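Reinforcement learning with world models accelerates policy training for heavy machinery control and can discover strategies that outperform controllers based on human expert behavior.
An LSTM-based surrogate dynamics model, trained on data from Volvo’s high-fidelity simulator, allowed policy iteration to run approximately 100 times faster than in the original environment.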
These findings highlight the effectiveness of model-based approaches in enhancing automation for construction machinery.

Abstract

Automating the bucket-filling task in wheel loaders is challenging due to the complex, nonlinear interaction between the bucket and granular material. Traditional control methods and rule-based systems are often limited by human expertise and struggle to fully exploit the capabilities of autonomous machines such as Volvo’s concept electric wheel loader, Zeux. This study presents a model-based Reinforcement Learning (RL) approach to optimize the bucket-filling strategy for Zeux. A Long Short-Term Memory (LSTM)-based surrogate dynamics model was trained on data from Volvo’s high-fidelity simulator to approximate realistic dynamics while allowing efficient policy training using Proximal Policy Optimization (PPO) with imagined rollouts. Training on the surrogate reduces computational cost and eliminates the need for direct interaction with the high-fidelity simulator, enabling policy iteration to run approximately 100 times faster than in the original environment. Compared to Volvo’s current rule-based driver model, the learned policy achieved 89% higher productivity, improved energy efficiency by 56%, and reduced the time to complete the bucket-filling task by 50%. The results show that world models can accelerate RL for heavy machinery control, enabling the discovery of strategies that outperform controllers based on human expert behavior. This approach aligns with Volvo’s objective of finding an optimal bucket-filling strategy for Zeux, both to allow fair comparisons with other machines and to assess the overall viability of adopting an autonomous wheel loader within the company.
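
To make the idea of "imagined rollouts" concrete, the following is a minimal, dependency-free sketch of rolling a policy forward inside a surrogate model instead of the simulator. It is not the study's implementation: `ToySurrogate`, its hand-written recurrence, the scalar "fill" state, and the reward are all hypothetical stand-ins chosen for illustration. In the study, a trained LSTM would play the role of `step`, and PPO would consume the resulting trajectories.

```python
import math


class ToySurrogate:
    """Hypothetical stand-in for a learned recurrent dynamics model.

    A trained LSTM would replace `step`; a hand-written recurrence is
    used here only to keep the sketch free of dependencies.
    """

    def reset(self):
        # (hidden memory, bucket fill fraction), both assumed scalars.
        return 0.0, 0.0

    def step(self, hidden, fill, action):
        # The hidden state keeps a decaying memory of past actions.
        hidden = 0.9 * hidden + 0.1 * action
        # Fill grows with the current and remembered action, capped at 1.0.
        gain = 0.2 * max(0.0, math.tanh(hidden + action))
        return hidden, min(1.0, fill + gain)


def imagined_rollout(model, policy, horizon=50):
    """Roll a policy forward inside the surrogate only (no simulator calls)."""
    hidden, fill = model.reset()
    trajectory = []
    for _ in range(horizon):
        action = policy(fill)
        hidden, next_fill = model.step(hidden, fill, action)
        # Assumed reward: fill gained minus a small per-step time penalty.
        reward = (next_fill - fill) - 0.01
        trajectory.append((fill, action, reward))
        fill = next_fill
        if fill >= 1.0:  # bucket full: end the imagined episode
            break
    return trajectory


# Usage: a constant "full effort" policy, evaluated purely in imagination.
trajectory = imagined_rollout(ToySurrogate(), lambda fill: 1.0)
print(len(trajectory), sum(r for _, _, r in trajectory))
```

Because every step calls only the surrogate, a batch of such rollouts costs as much as the surrogate's forward passes, which is where the speed-up over the high-fidelity simulator comes from.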

Authors

Duarte Sa Morais
