This study employs four computational intelligence techniques, namely gene expression programming (GEP), back-propagation neural network (BPNN), support vector regression (SVR), and linear regression (LR), to model the quantitative relationship between pollutant gases (PGs) and PM2.5 concentrations using 2021 environmental data from 12 Chinese cities. Model performance was compared using the correlation coefficient (R), root mean squared error (RMSE), and mean absolute error (MAE). The correlation coefficients between predicted and actual PM2.5 concentrations ranged from -0.7579 to 0.9802 across all models. SVR and LR were the most robust, with average R values of 0.8656 and 0.8671, respectively. LR also yielded the lowest average RMSE (0.12) and MAE (0.06) across the cities. GEP was able to find highly accurate explicit models, reaching a maximum R of 0.9766. A key finding from the LR models is that CO and PM10 consistently had the most significant impact on PM2.5 concentrations. The correlation formulas derived from GEP and LR can support further PM2.5 analysis. These findings offer insights into PM2.5 formation mechanisms and can inform pollution control strategies.
Wang et al. (Wed,) studied this question.
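
The three evaluation metrics named in the abstract (R, RMSE, and MAE) can be illustrated with a minimal sketch. It assumes R is the Pearson correlation coefficient between predicted and actual values, which the abstract does not state explicitly. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Invented illustrative values (assumption), not measurements from the paper.
actual = [0.10, 0.25, 0.40, 0.55, 0.80]
predicted = [0.12, 0.20, 0.45, 0.50, 0.85]

print(f"R    = {pearson_r(actual, predicted):.4f}")
print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"MAE  = {mae(actual, predicted):.4f}")
```

R measures how well predictions track the trend of the observations and can be negative, as in the reported lower bound of -0.7579, whereas RMSE and MAE measure the size of the errors and are never negative. RMSE penalizes large errors more heavily than MAE, so reporting both gives a sense of whether errors are evenly spread or dominated by a few outliers.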