Groundwater level prediction is essential for sustainable water resource management. Although machine learning models are widely applied, the selection of input variables critically affects predictive performance, and existing studies rarely evaluate models comprehensively in terms of accuracy, stability, physical interpretability, and computational efficiency. To address this gap, this study develops a hybrid framework that integrates a grid search-optimized long short-term memory network (GS-LSTM) with the technique for order preference by similarity to ideal solution (TOPSIS). Using the Houston area as a case study, the framework evaluates 30 input combinations derived from precipitation (P), air temperature (T), relative humidity (H), wind speed (W), and reference evapotranspiration (E) across 22 monitoring wells to identify the optimal and minimal sets of input variable combinations. Key findings include: (1) optimal input combinations vary substantially among wells, highlighting spatial heterogeneity; (2) P and E are the dominant drivers; (3) compared with daily input data, monthly averaged data increase the prediction success rate (the proportion of successful runs across 27 hyperparameter configurations) by more than 40% and raise R² by more than 0.3; (4) the minimal set comprises eight representative combinations that collectively cover the top-three ranked variable combinations for all 22 wells, maintaining high accuracy (e.g., Well 12# with daily data: MAE = 0.13 m, RMSE = 0.16 m, R² = 0.92) while reducing computational cost by 92.1% relative to testing all 30 combinations. The proposed optimal and minimal input sets offer a stable, accurate, and computationally efficient basis for groundwater resource management that accounts for spatial heterogeneity.
Li et al. studied this question.
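The TOPSIS step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the criteria columns (MAE, RMSE, R², runtime), the equal weights, and all numeric values below are hypothetical placeholders chosen only to show how candidate input combinations could be ranked by closeness to an ideal solution.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness coefficient (0 to 1) of each alternative.

    matrix  : list of rows, one row per alternative, one column per criterion
    weights : criterion weights (rescaled here so they sum to 1)
    benefit : per criterion, True if larger is better, False if smaller is better
    """
    n_crit = len(weights)
    total_w = sum(weights)
    w = [x / total_w for x in weights]

    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]

    # Ideal and anti-ideal points, taking the criterion direction into account.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]

    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom > 0 else 0.0)
    return scores


# Hypothetical scores for three input combinations at a single well.
# Columns: MAE (m), RMSE (m), R2, runtime (s). Values are illustrative only.
names = ["P", "P+E", "P+T+H+W+E"]
matrix = [
    [0.30, 0.38, 0.70, 40.0],
    [0.15, 0.19, 0.90, 55.0],
    [0.14, 0.18, 0.91, 120.0],
]
weights = [1.0, 1.0, 1.0, 1.0]        # assumed equal weighting
benefit = [False, False, True, False]  # only R2 is "larger is better"

ranking = sorted(zip(names, topsis(matrix, weights, benefit)),
                 key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Repeating such a ranking for each of the 22 wells would yield the per-well top-ranked combinations from which a covering minimal set could then be chosen; the weighting scheme actually used in the study may differ from the equal weights assumed here.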