What question did this study set out to answer?

The study set out to build a machine learning framework that accurately predicts the compressive strength of ultra-high-performance concrete (UHPC).

April 25, 2026 | Open Access

Machine Learning-Based Strength Prediction of Fiber-Reinforced UHPC: A Data-Driven Framework with Feature Engineering and Uncertainty Quantification

Key Points

Evaluated 20 algorithms across seven categories using 863 experimental observations.
Engineered six composite features such as water-cement ratio and fiber aspect ratio for prediction.
Ensured statistical robustness through 30 repeated experiments with frequentist and Bayesian analyses.
Optimal performance achieved with CatBoost (R² = 0.8979 ± 0.0239, RMSE = 10.58 ± 1.45 MPa).
Curing age, sand content, and steel fiber volume identified as key predictors.
External validation on 810 independent samples shows R² = 0.5923 (RMSE = 25.68 MPa), with the drop attributed to feature availability differences and distribution shift.
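
The two composite features named in the key points can be illustrated with a minimal sketch. The dictionary keys (`water`, `cement`, `fiber_length`, `fiber_diameter`), the units, and the sample values are assumptions made for illustration; the study's actual column names and its remaining engineered features are not given on this page.

```python
def composite_features(mix: dict) -> dict:
    """Derive two composite features from raw mixture quantities.

    The keys of `mix` are hypothetical names, not the study's columns.
    """
    if mix["cement"] <= 0 or mix["fiber_diameter"] <= 0:
        raise ValueError("cement and fiber_diameter must be positive")
    return {
        # Mass of water divided by mass of cement (dimensionless).
        "water_cement_ratio": mix["water"] / mix["cement"],
        # Fiber length divided by fiber diameter (dimensionless).
        "fiber_aspect_ratio": mix["fiber_length"] / mix["fiber_diameter"],
    }


# Illustrative values only (kg/m^3 for masses, mm for fiber geometry).
example_mix = {
    "water": 160.0,
    "cement": 800.0,
    "fiber_length": 13.0,
    "fiber_diameter": 0.2,
}
features = composite_features(example_mix)
```

Ratios like these are dimensionless, which may be one reason such features are described as physically meaningful: they do not depend on batch size.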

Abstract

Accurate prediction of ultra-high-performance concrete (UHPC) compressive strength is essential for optimizing mixture design and reducing experimental iterations. Existing machine learning approaches suffer from limited algorithm diversity, insufficient statistical validation, and inadequate uncertainty quantification. This study presents a comprehensive framework through systematic evaluation of 20 algorithms across seven categories on 863 experimental observations. Six physically meaningful composite features (such as water-cement ratio, total binder content, and fiber aspect ratio) are engineered to capture intrinsic material relationships, with the Boruta algorithm employed for feature selection. Statistical robustness is ensured through 30 repeated experiments analyzed using both frequentist (p-value, effect size, 95% CI) and Bayesian approaches. CatBoost achieves optimal performance (R² = 0.8979 ± 0.0239, RMSE = 10.58 ± 1.45 MPa), with curing age, sand content, and steel fiber volume identified as dominant predictors through multi-perspective interpretability analysis integrating SHAP, ALE, permutation importance, and LIME. External validation on 810 independent samples yields R² = 0.5923 (RMSE = 25.68 MPa) under significant cross-dataset conditions, with performance reduction attributed to feature availability differences and distribution shift. Comprehensive uncertainty quantification yields prediction uncertainty of 3.48%, substantially below previously reported thresholds. The proposed framework offers practitioners a reliable tool for UHPC mixture screening while maintaining prediction confidence for structural engineering applications.
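
To make the reported metrics concrete, the sketch below computes R² and RMSE for each repeated run and summarizes them as mean and spread. It assumes the "±" in the abstract denotes a sample standard deviation across repeats, which this page does not confirm. The strength values are made up for illustration, and only two repeats are shown rather than the study's 30.

```python
import math
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target (MPa here)."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def summarize(values):
    """Return (mean, sample standard deviation) across repeated runs."""
    return statistics.mean(values), statistics.stdev(values)


# Made-up (measured, predicted) compressive strengths in MPa for two repeats.
runs = [
    ([100.0, 120.0, 140.0], [102.0, 118.0, 141.0]),
    ([110.0, 130.0, 150.0], [108.0, 133.0, 149.0]),
]
r2_values = [r2_score(y_true, y_pred) for y_true, y_pred in runs]
rmse_values = [rmse(y_true, y_pred) for y_true, y_pred in runs]
r2_mean, r2_std = summarize(r2_values)
rmse_mean, rmse_std = summarize(rmse_values)
```

Reporting both the mean and the spread shows how sensitive a model is to the particular split or random seed, which a single run cannot show.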

Cite This Study

Huang et al. studied this question.

synapsesocial.com/papers/69ec59fc88ba6daa22daba3b https://doi.org/10.3390/sym18050710
