High-resolution mapping of soil organic carbon (SOC) in arid regions remains challenging. Using Xinjiang as a case study, this research built a prediction framework that combines Boruta feature selection with the Random Forest (RF) algorithm to map topsoil SOC at fine resolution. The results indicate that: (1) among the machine learning models tested, the Boruta–RF framework achieved the highest predictive performance (R² = 0.48, with the lowest RMSE); (2) evapotranspiration (ET) and vapor pressure deficit (VPD) were the dominant drivers, with the stepwise positive response to ET and the inhibitory effect of VPD confirming the decisive role of hydrothermal fluxes in regulating carbon input; (3) total SOC storage was estimated at approximately 3.20 Pg C. Despite its low carbon density, the desert ecosystem contributed 44.33% of the total storage, constituting a massive "Sparse Carbon Pool". This study confirms the necessity of incorporating hydrothermal parameters and shows that neglecting desert ecosystems leads to a significant underestimation of regional carbon storage.
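To make the reported figures concrete, the following minimal sketch shows how the two evaluation metrics named above (R² and RMSE) are conventionally computed, and works out the storage implied by the desert share. The function and variable names are illustrative assumptions, not taken from the study, and the derived storage values follow only from the rounded figures quoted in the abstract.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Figures quoted in the abstract (rounded as published).
TOTAL_SOC_PG = 3.20    # total SOC storage, Pg C
DESERT_SHARE = 0.4433  # desert ecosystem share of total storage

# Derived values: simple arithmetic on the rounded figures, not reported results.
desert_soc_pg = TOTAL_SOC_PG * DESERT_SHARE
other_soc_pg = TOTAL_SOC_PG - desert_soc_pg

print(f"Desert SOC storage: {desert_soc_pg:.2f} Pg C")
print(f"All other ecosystems: {other_soc_pg:.2f} Pg C")
```

On these rounded inputs, the desert share corresponds to roughly 1.42 Pg C, which is the amount a regional estimate would miss if desert ecosystems were excluded.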
Li et al. (Sat,) studied this question.