This study proposes a novel data-driven machine learning (ML) framework for multi-criteria environmental, social, and governance (ESG) evaluation. The framework aims to address the limited transparency, inconsistency, and subjectivity of existing ESG evaluation systems through a fully data-driven, modular, ML-supported architecture. It comprises three main modules: (i) ESG data preprocessing, with missing values imputed by the MissForest algorithm; (ii) a three-plane ESG feature selection workflow that integrates clustering, feature importance, and classification algorithms to identify representative ESG indicators; and (iii) a hybrid weighting and ranking procedure that combines unsupervised principal component analysis (PCA), criteria importance through inter-criteria correlation (CRITIC), and the technique for order preference by similarity to ideal solution (TOPSIS). A 2024 real-world application involving 57 listed Chinese pharmaceutical and biotechnology companies and 70 ESG indicators demonstrates the framework’s practical utility in producing transparent and objective ESG rankings. The main contributions of this work are fourfold: (1) the development of an end-to-end, entirely data-driven ML framework for ESG evaluation; (2) the introduction of an innovative three-plane ESG feature selection workflow within the framework; (3) the first design of a hybrid PCA-CRITIC-TOPSIS approach for ESG weighting and ranking; and (4) the validation of the framework through a real-world industry application using recent and authentic ESG data.
Wang et al. studied this question.
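The weighting and ranking module described in the abstract can be illustrated with a minimal sketch of textbook CRITIC weighting followed by TOPSIS ranking. This is not the authors' implementation: the PCA stage is omitted because the abstract does not say how PCA is combined with CRITIC, and the function names (`critic_weights`, `topsis_scores`), the toy matrix, and the benefit/cost flags are all hypothetical.

```python
import numpy as np


def critic_weights(X, benefit):
    """Textbook CRITIC weights for a (companies x indicators) matrix.

    Assumes no indicator column is constant and that the indicators are
    not all perfectly correlated with one another.
    """
    X = np.asarray(X, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Min-max normalise each indicator; flip cost indicators so higher is better.
    Z = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    Z = np.where(benefit, Z, 1.0 - Z)
    sigma = Z.std(axis=0, ddof=1)            # contrast intensity per indicator
    R = np.corrcoef(Z, rowvar=False)         # inter-criteria correlation matrix
    info = sigma * (1.0 - R).sum(axis=0)     # information content C_j
    return info / info.sum()


def topsis_scores(X, weights, benefit):
    """Textbook TOPSIS closeness coefficients (higher means better)."""
    X = np.asarray(X, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalise each column, then apply the criteria weights.
    V = weights * X / np.linalg.norm(X, axis=0)
    best = np.where(benefit, V.max(axis=0), V.min(axis=0))
    worst = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_best = np.linalg.norm(V - best, axis=1)
    d_worst = np.linalg.norm(V - worst, axis=1)
    return d_worst / (d_best + d_worst)


# Hypothetical toy data: 4 companies, 3 indicators (third is a cost indicator).
X = np.array([
    [0.8, 0.6, 0.3],
    [0.5, 0.9, 0.6],
    [0.9, 0.4, 0.2],
    [0.3, 0.7, 0.8],
])
benefit = np.array([True, True, False])

weights = critic_weights(X, benefit)
scores = topsis_scores(X, weights, benefit)
ranking = np.argsort(-scores)  # company indices, best first
print("weights:", weights)
print("scores:", scores)
print("ranking:", ranking)
```

CRITIC gives more weight to indicators that vary strongly across companies and correlate weakly with the other indicators. TOPSIS then ranks each company by its relative distance to the best and worst weighted values observed in the data.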