Cardiovascular disease is the leading cause of death globally, yet machine learning models for predicting it remain difficult to deploy because models trained on small databases generalize poorly. This study develops an interpretable ensemble learning model from a large database of over 70,000 patients. Three advanced gradient boosting models were built, with the Synthetic Minority Over-sampling Technique (SMOTE) used to improve class balance and Bayesian hyperparameter tuning used to optimize each model. The best-performing voting ensemble achieved a test accuracy of 73.61% and an area under the curve (AUC) of 0.8022. SHAP analysis identified systolic blood pressure as the most important feature and revealed clinically meaningful thresholds at 120 mmHg for blood pressure and 48 years for age, consistent with established medical standards. Although the model's accuracy was somewhat lower than the benchmark of 78.42%, its greater interpretability makes it a promising tool for clinical decision support. The study therefore shows that combining powerful ensemble methods with strong explanations of model decisions can produce more trustworthy AI systems for cardiovascular medicine.
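To make the two headline metrics concrete, the following is a minimal sketch of how a voting ensemble's predictions could be combined and scored. It assumes soft voting (averaging each model's predicted probability), which the abstract does not specify, and it assumes the reported area under the curve refers to the ROC curve. The function names, the 0.5 decision threshold, and the toy probabilities are all hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average the positive-class probability of each patient across models."""
    n_models = len(prob_lists)
    return [sum(ps) / n_models for ps in zip(*prob_lists)]


def accuracy(y_true: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of patients whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """ROC AUC as the chance that a random positive case outscores a random
    negative case, counting ties as half a win."""
    pos = [p for p, y in zip(probs, y_true) if y == 1]
    neg = [p for p, y in zip(probs, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("roc_auc needs at least one positive and one negative")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical probabilities from three boosted models for six patients.
y_true = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.2, 0.6, 0.5, 0.3, 0.1]
model_b = [0.8, 0.3, 0.4, 0.6, 0.5, 0.2]
model_c = [0.7, 0.1, 0.8, 0.5, 0.6, 0.3]

ensemble = soft_vote([model_a, model_b, model_c])
ensemble_accuracy = accuracy(y_true, ensemble)
ensemble_auc = roc_auc(y_true, ensemble)
print(f"accuracy={ensemble_accuracy:.4f}  auc={ensemble_auc:.4f}")
```

Accuracy depends on the chosen threshold, while AUC depends only on how the ensemble ranks patients, which is one reason the two figures in an evaluation like this can differ noticeably.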
Yifeng Wang
Yifeng Wang (Mon,) studied this question.
www.synapsesocial.com/papers/69df2b2ce4eeef8a2a6b011d — DOI: https://doi.org/10.1051/itmconf/20268401007/pdf