Short-term load forecasting (STLF) plays a pivotal role in modern power system operations, yet it remains challenging because of the complex spatiotemporal dependencies in load data. This paper proposes a dual-head attention residual network (DARNet) that advances STLF through three key innovations: (1) a hybrid encoder that combines 1D-CNN and GRU architectures to capture local load patterns and long-term temporal dependencies simultaneously, achieving 28% better locality awareness than conventional approaches; (2) a novel dual-head attention mechanism that dynamically models both inter-temporal relationships and cross-variable dependencies, reducing feature engineering requirements; and (3) an autocorrelation-adjusted recursive forecasting framework that reduces multi-step prediction error accumulation by 33% relative to standard seq2seq models. Extensive experiments on real-world datasets from three Chinese cities show that DARNet outperforms six state-of-the-art benchmarks by 21–35% across all evaluation metrics (MAPE, SMAPE, MAE, and RRSE) while maintaining robust generalization across geographical regions and prediction horizons.
Ren et al. studied this question.
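For readers unfamiliar with the four evaluation metrics named above (MAPE, SMAPE, MAE, and RRSE), the following is a minimal sketch using their common textbook definitions. It is not taken from the paper: the function names and the sample load values are hypothetical, and the SMAPE variant shown (denominator equal to the mean of the absolute actual and predicted values) is one of several in use and is assumed here.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the load series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error (%). Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE (%), assumed variant: |a - p| / ((|a| + |p|) / 2).

    Assumes |a| + |p| is never zero for any pair.
    """
    total = sum(2.0 * abs(a - p) / (abs(a) + abs(p)) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rrse(actual, predicted):
    """Root relative squared error: model error relative to a mean-only predictor.

    Assumes the actual series is not constant (nonzero denominator).
    """
    mean_actual = sum(actual) / len(actual)
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    baseline_error = sum((a - mean_actual) ** 2 for a in actual)
    return math.sqrt(squared_error / baseline_error)


# Hypothetical load values (e.g., in MW), for illustration only.
actual_load = [100.0, 200.0, 300.0]
forecast_load = [110.0, 190.0, 300.0]

for name, metric in [("MAE", mae), ("MAPE", mape), ("SMAPE", smape), ("RRSE", rrse)]:
    print(f"{name}: {metric(actual_load, forecast_load):.4f}")
```

On this toy series the MAPE is about 5% and the RRSE about 0.1. An RRSE below 1 means the forecast beats a predictor that always outputs the mean of the actual series, which is why the metric is useful for comparing results across cities with different load scales.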