Accurate prediction of solvent-bitumen viscosity is essential for the design and optimization of extended-solvent steam-assisted gravity drainage (ES-SAGD) processes, where viscosity reduction governs fluid mobility and recovery efficiency. Because viscosity depends in a highly nonlinear way on temperature, pressure, solvent concentration, and fluid properties, conventional empirical and thermodynamic models often generalize poorly across operating conditions. In this study, data-driven machine learning techniques were used to develop predictive models for solvent-bitumen viscosity from an extensive experimental database compiled from the literature. The dataset was preprocessed systematically through data cleaning, feature standardization, and an 80/20 train-test split. Two tree-based ensemble algorithms, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), were trained, and their hyperparameters were tuned. Model performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE), together with cross-plots, residual analysis, and feature importance evaluation. The results show that both models capture the strong nonlinear relationships governing viscosity behavior, with XGBoost giving the higher prediction accuracy and the better generalization. On the test dataset, the R², RMSE, and MAE values are 0.932791, 811.882, and 111.091 for XGBoost, and 0.854601, 1194.149, and 150.703 for RF, respectively. Feature importance analysis indicates that temperature and solvent mole fraction are the dominant variables influencing viscosity reduction. The developed model offers a rapid, data-driven alternative to experimental and thermodynamic approaches and can be integrated into ES-SAGD simulation and optimization workflows.
Bamzad et al. studied this question.
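The abstract reports R², RMSE, and MAE without stating their formulas. The sketch below assumes the standard textbook definitions, which may differ in detail from what the authors used. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study's database.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) under the standard definitions (an assumption)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Hypothetical measured and predicted viscosities, for illustration only.
measured = [10.0, 100.0, 1000.0, 10000.0]
predicted = [12.0, 90.0, 1100.0, 9000.0]
r2, rmse, mae = regression_metrics(measured, predicted)
print(f"R2={r2:.4f}  RMSE={rmse:.1f}  MAE={mae:.1f}")
```

Under these definitions RMSE is never smaller than MAE, and squaring weights large residuals more heavily. An RMSE several times the MAE, as in the reported test-set values, is therefore consistent with a small number of large errors, possibly at the high-viscosity end of the data, although the abstract does not confirm this.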