The material delivery route prediction problem aims to forecast the future delivery routes of couriers given a set of tasks. The task is challenging because vast historical data are highly non-linear and complex, and because routes also depend on the preferences of individual couriers. Most existing methods use deep neural networks trained with supervised learning to capture behavior patterns from historical data. However, they often struggle with the dynamic nature of the data and the diversity of individual preferences. This paper proposes a new deep reinforcement learning framework that integrates Variational Autoencoders (VAE) to handle large-scale data features and incorporates dynamic embedding features to reflect personal preferences accurately. The policy is trained using Proximal Policy Optimization (PPO). Experimental validation on two publicly available real-world urban delivery datasets from Cainiao Network and two datasets generated from cities across multiple countries shows that the proposed framework significantly outperforms seven existing prediction methods.
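For readers unfamiliar with PPO, the following is a minimal sketch of its standard clipped surrogate objective, written in plain Python. It is not the authors' implementation: the function name `ppo_clip_objective`, its parameters, and the default clipping range `eps=0.2` are assumptions chosen for illustration, and the abstract does not state the hyperparameters or network details actually used.

```python
import math


def ppo_clip_objective(new_logps, old_logps, advantages, eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    eps: clipping range; 0.2 is an illustrative default, not from the paper.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a route prediction setting, each action could correspond to choosing the next task to visit, so the clipping limits how far a single update can move the routing policy away from the one that generated the data. That mapping is an interpretation of the abstract, not a detail it confirms.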
Hui Liu
Yinghui Pan
Buxin Zeng
Tsinghua Science & Technology
Liu et al. studied this question.
www.synapsesocial.com/papers/69df2c62e4eeef8a2a6b183a — DOI: https://doi.org/10.26599/tst.2025.9010105