Soil pH is a key soil property governing nutrient availability and ecosystem functioning, and mapping its spatial distribution is essential for precision agriculture and sustainable land management. This study compares six tree-based models, each coupled with residual kriging (RK), for mapping soil pH at 30 m resolution in Shayang County, China. The models are random forest (RF), extremely randomized trees (ERT), gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost). Experiments with 1343 soil samples and 32 environmental variables show that adding RK improved the prediction accuracy of every standalone model by accounting for the spatial dependence of the residuals. CatBoost-RK performed best, with a coefficient of determination (R²) of 0.7265, a root mean square error (RMSE) of 0.5072, and a ratio of performance to deviation (RPD) of 1.9122, closely followed by ERT-RK and RF-RK. Variable importance analysis identified soil type (ST) and mean annual precipitation (MAP) as the most important factors affecting the distribution of soil pH. The resulting 30 m resolution soil pH map reveals distinct patterns across land use types: croplands show lower soil pH, whereas grasslands show higher and more variable pH. These findings confirm the effectiveness of the hybrid machine learning-residual kriging (ML-RK) framework and offer guidance for selecting modeling strategies in digital soil mapping.
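The abstract reports three accuracy metrics without defining them. The sketch below computes them under their commonly used definitions, which are an assumption here and not taken from the study: R² as one minus the ratio of the residual sum of squares to the total sum of squares, RMSE as the root of the mean squared error, and RPD as the standard deviation of the observed values divided by RMSE. The function name `evaluate` and the sample pH values are hypothetical, and the population standard deviation (dividing by n) is assumed; a study that uses the sample standard deviation would obtain a slightly different RPD.

```python
import math


def evaluate(observed, predicted):
    """Return (r2, rmse, rpd) for paired observed and predicted values.

    Assumes the common definitions; the study may define them differently.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observed values are constant; R2 and RPD are undefined")

    rmse = math.sqrt(sse / n)
    sd_obs = math.sqrt(sst / n)  # population SD (assumption, see lead-in)
    r2 = 1.0 - sse / sst
    rpd = sd_obs / rmse if rmse > 0 else math.inf
    return r2, rmse, rpd


# Hypothetical pH values, for illustration only (not data from the study).
observed = [5.2, 6.1, 6.8, 7.4, 7.9]
predicted = [5.5, 6.0, 6.5, 7.6, 7.7]

r2, rmse, rpd = evaluate(observed, predicted)
print(f"R2={r2:.4f}  RMSE={rmse:.4f}  RPD={rpd:.4f}")
```

Under these definitions the metrics are linked by R² = 1 − 1/RPD². The reported CatBoost-RK values are consistent with that relation: 1 − 1/1.9122² ≈ 0.7265.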
Tian et al. studied this question.