This study presents a machine learning (ML) framework for predicting heat transfer in magnetohydrodynamic (MHD) mixed convection within a complex V-shaped cavity filled with a nano-enhanced phase change material (NEPCM) suspension. Accurate computational fluid dynamics (CFD) simulations are essential for understanding heat transfer mechanisms in such systems, but generating comprehensive data with high-fidelity models remains computationally expensive. To address this challenge, we develop an integrated ML approach that combines synthetic data generation, physics-informed feature engineering, and optimized ensemble boosting. The methodology first augments a limited 34-sample CFD dataset to 2034 samples using Latin Hypercube Sampling with Radial Basis Function interpolation. Next, 14 physics-based features are engineered to encode the underlying physical phenomena. Finally, cross-validation is used to optimize the hyperparameters of three gradient boosting models: eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). The framework predicts the average Nusselt number ($\mathrm{Nu}_{\mathrm{avg}}$) and the average kinetic energy ($\mathrm{KE}_{\mathrm{avg}}$) from seven geometric and operational inputs. CatBoost performed best for $\mathrm{Nu}_{\mathrm{avg}}$ ($R^2 = 0.9745$, mean absolute percentage error = 1.27%), while XGBoost performed best for $\mathrm{KE}_{\mathrm{avg}}$ ($R^2 = 0.9920$, mean absolute percentage error = 1.04%). The novelty of this work lies in its ability to generalize across different output variables while reducing computation time from hours to milliseconds, enabling rapid design optimization and in-depth parametric studies.

• ML framework predicts heat transfer in MHD mixed convection of NEPCM suspension.
• Combines physics-informed feature engineering with optimized ensemble boosting.
• CatBoost best for Nusselt number ($R^2 = 0.9745$), XGBoost for kinetic energy ($R^2 = 0.9920$).
• Synthetic data generation expands limited CFD dataset from 34 to 2034 samples.
• Reduces simulation time from hours to milliseconds for efficient design exploration.
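
The augmentation step named above (Latin Hypercube Sampling combined with Radial Basis Function interpolation) can be illustrated with a minimal, standard-library-only sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the Gaussian kernel and its shape parameter `eps`, the two normalized inputs, the `toy_response` function standing in for a CFD output, and the helper names `latin_hypercube`, `solve`, and `fit_rbf`. Only the seed size of 34 and the combined total of 2034 rows mirror the figures quoted in the abstract.

```python
import math
import random


def latin_hypercube(n_samples, bounds, rng):
    """Draw n_samples points so that each dimension has one point per stratum."""
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n_samples equal-width strata of [0, 1).
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # decouple the stratum order between dimensions
        columns.append([low + u * (high - low) for u in strata])
    return [list(point) for point in zip(*columns)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def gaussian(r, eps):
    """Gaussian radial basis function (an assumed kernel choice)."""
    return math.exp(-((eps * r) ** 2))


def fit_rbf(points, values, eps=8.0):
    """Return an interpolant that reproduces every (point, value) pair."""
    kernel = [[gaussian(math.dist(p, q), eps) for q in points] for p in points]
    weights = solve(kernel, values)

    def predict(x):
        return sum(w * gaussian(math.dist(x, p), eps) for w, p in zip(weights, points))

    return predict


def toy_response(x):
    """Hypothetical stand-in for a CFD output; not a result from the study."""
    return math.sin(3.0 * x[0]) + x[1] ** 2


rng = random.Random(0)
bounds = [(0.0, 1.0), (0.0, 1.0)]  # hypothetical normalized ranges for two inputs

# 34 "expensive" seed samples, then an interpolant fitted through them.
seed_points = latin_hypercube(34, bounds, rng)
seed_values = [toy_response(p) for p in seed_points]
surrogate = fit_rbf(seed_points, seed_values)

# 2000 new space-filling points labeled by the interpolant instead of the solver.
new_points = latin_hypercube(2000, bounds, rng)
augmented = list(zip(seed_points, seed_values))
augmented += [(p, surrogate(p)) for p in new_points]
print(len(augmented))
```

The synthetic labels come from the interpolant, not from the solver, so they carry interpolation error. In a setup like this one, it would be prudent to hold out some of the original simulation samples for validation rather than score a model only on interpolated points.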
Hussain et al. (Mon,) studied this question.
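
For readers who want to reproduce the two accuracy measures quoted in the abstract, the sketch below implements the standard definitions of the coefficient of determination ($R^2$) and the mean absolute percentage error. The function names and the four sample values are hypothetical, and the study's own evaluation protocol (data split, averaging across folds) is not described here, so this shows only the formulas themselves.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent; assumes no true value is zero."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical values for illustration only; not data from the study.
y_true = [2.0, 4.0, 5.0, 8.0]
y_pred = [2.1, 3.9, 5.2, 7.8]

print(round(r2_score(y_true, y_pred), 4))  # 0.9947
print(round(mape(y_true, y_pred), 2))      # 3.5
```

Because the percentage error divides by the true value, it is sensitive to targets near zero, which is one reason to report it alongside $R^2$.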