Buried-pipeline leakage poses significant safety risks, yet traditional computational fluid dynamics (CFD) simulations are too slow for real-time diagnosis. This study combines machine learning with interval sampling to develop a fast and interpretable prediction method. From 1.4 billion CFD-generated data points, 140 million representative samples were extracted via 1:10 interval sampling. Using 17 physical features as inputs, we trained and compared XGBoost, LightGBM, and a multi-layer perceptron (MLP). The MLP achieved the best performance (R² = 0.9988, root mean square error (RMSE) = 0.0153), substantially outperforming the tree-based models (R² ≈ 0.93). Three independent sampling runs confirmed its robustness (coefficient of variation of R² ≈ 0%). SHAP (Shapley Additive Explanations) analysis identified spatial coordinates and leak aperture as the most critical factors and also revealed a nonlinear influence of soil particle size. The approach offers a high-precision, interpretable, and efficient surrogate model for buried-pipeline leakage warning systems.
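The sampling and evaluation quantities named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the study's implementation: it assumes that 1:10 interval sampling means keeping every tenth row, and that the independent sampling runs differ only in their starting offset. The function names (`interval_sample`, `r2_score`, `rmse`, `coefficient_of_variation`) are hypothetical, and the population standard deviation is an assumed choice for the coefficient of variation.

```python
import numpy as np


def interval_sample(n_rows, step=10, offset=0):
    """Indices of every `step`-th row, starting at `offset` (assumed scheme)."""
    return np.arange(offset, n_rows, step)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def coefficient_of_variation(values):
    """Population standard deviation over mean, in percent (assumed definition)."""
    values = np.asarray(values, dtype=float)
    return float(values.std() / values.mean() * 100.0)


if __name__ == "__main__":
    # Synthetic stand-in for the CFD table; the real data are not reproduced here.
    rng = np.random.default_rng(0)
    y = rng.normal(size=1400)
    y_hat = y + rng.normal(scale=0.05, size=y.size)  # placeholder predictions

    # Three runs with different offsets (assumed), each keeping 1 row in 10.
    scores = []
    for offset in (0, 3, 7):
        idx = interval_sample(y.size, step=10, offset=offset)
        scores.append(r2_score(y[idx], y_hat[idx]))

    print("R2 per run:", scores)
    print("CV of R2 (%):", coefficient_of_variation(scores))
```

A coefficient of variation near 0% across runs, as reported above, would indicate that the score does not depend on which one-in-ten subset is drawn.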
Yu et al. (Thu,) studied this question.