ABSTRACT The Hildebrand solubility parameter is a cornerstone descriptor for predicting thermodynamic compatibility in polymer‐solvent systems, yet its determination for polymers remains experimentally challenging, and traditional group‐contribution estimates lack accuracy and transferability. Here, we present an integrated framework that combines atomistic molecular dynamics (MD) simulations with machine learning (ML) to predict solubility parameter values across diverse molecular and polymeric systems. Solubility parameter values derived from MD simulations provide training targets for ML models built on molecular, quantum‐chemical, and chain‐specific descriptors. By systematically applying three feature‐selection strategies (Pearson correlation filtering, principal component analysis, and recursive feature elimination), we construct optimized descriptor subsets tailored to each learning algorithm. Benchmarking across regression methods reveals that Gaussian process regression with recursive feature elimination achieves the best performance, with R² = 0.94, MAE = 0.54, and RMSE = 0.96 MPa^(1/2) on the test set, consistently outperforming both empirical group‐contribution methods and conventional regressors. Our results highlight the critical role of descriptor curation and nonlinear models in capturing the complexity of polymer thermodynamics. This hybrid MD‐ML strategy establishes a generalizable and computationally efficient pathway for predicting solubility parameters, enabling rapid screening of solubility and miscibility for both known and novel molecules and polymers.
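The abstract does not state how the MD-derived values are computed. As a minimal sketch, assuming the standard definition of the Hildebrand parameter as the square root of the cohesive energy density, δ = (E_coh / V_m)^(1/2), the unit handling that yields MPa^(1/2) can be illustrated as follows. The function name `hildebrand_parameter` and the input values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def hildebrand_parameter(e_coh_kj_per_mol: float, molar_volume_cm3_per_mol: float) -> float:
    """Return the Hildebrand solubility parameter in MPa^(1/2).

    Assumes delta = sqrt(E_coh / V_m), where E_coh is the cohesive energy
    (kJ/mol) and V_m is the molar volume (cm^3/mol). In an MD workflow,
    E_coh would typically be estimated from the simulation; here it is
    simply passed in as a number.
    """
    if e_coh_kj_per_mol < 0:
        raise ValueError("cohesive energy must be non-negative")
    if molar_volume_cm3_per_mol <= 0:
        raise ValueError("molar volume must be positive")

    # Convert kJ/mol to J/mol, then divide by cm^3/mol to get J/cm^3.
    # 1 J/cm^3 = 1e6 J/m^3 = 1e6 Pa = 1 MPa, so the result is already in MPa.
    ced_mpa = e_coh_kj_per_mol * 1000.0 / molar_volume_cm3_per_mol
    return math.sqrt(ced_mpa)


# Illustrative inputs (hypothetical, roughly the scale of a common organic solvent).
delta = hildebrand_parameter(e_coh_kj_per_mol=35.5, molar_volume_cm3_per_mol=106.8)
print(f"delta = {delta:.2f} MPa^(1/2)")  # about 18.2 MPa^(1/2)
```

Because 1 J/cm^3 equals 1 MPa, no further conversion factor is needed once the energy is expressed in joules per mole and the volume in cubic centimetres per mole.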
George et al. (Sun,) studied this question.