What question did this study set out to answer?

The aim is to develop and assess machine learning models for predicting high-grade cervical intraepithelial neoplasia (CIN2+ and CIN3+).

February 27, 2026 · Open Access

Development and interpretability of an XGBoost model to predict high grade cervical intraepithelial neoplasia

Key Points

The aim is to develop and assess machine learning models for predicting high-grade cervical intraepithelial neoplasia (CIN2+ and CIN3+).
Retrospective analysis including 2,863 participants.
Development of three models: Logistic Regression, Random Forest, and XGBoost.
Dataset split into training (60%) and test (40%) sets.
Model performance evaluated using AUC, calibration curves, decision curve analysis, NRI, and IDI.
SHAP method used for model interpretability.
XGBoost model achieved highest test set AUCs: 0.735 for CIN2+ and 0.841 for CIN3+.
Significant AUC improvement for CIN3+ detection compared to Random Forest model (P < 0.001).
XGBoost showed notable NRI: 13.1% for CIN2+ and 28.0% for CIN3+ over Logistic Regression.
IDI for XGBoost over Logistic Regression was significant: 12.1% for CIN2+ and 11.0% for CIN3+.
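The 60/40 split and the AUC metric named in the key points can be illustrated with a minimal sketch in plain Python. The helper names (`split_60_40`, `auc_score`), the random seed, and the unstratified shuffle are assumptions made for illustration; the summary does not say how the authors partitioned the data or which software computed the AUC.

```python
import random


def split_60_40(n_rows, seed=0):
    """Shuffle row indices and cut them into 60% training / 40% test.

    Assumption: a simple unstratified random split; the study's actual
    procedure is not described in this summary.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = round(0.6 * n_rows)
    return indices[:cut], indices[cut:]


def auc_score(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with made-up labels and predicted risks (not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc_score(toy_labels, toy_scores)

train_idx, test_idx = split_60_40(2863)
```

The pairwise definition is quadratic in the number of cases, which is acceptable for a cohort of a few thousand participants; a rank-based formula gives the same value faster on larger datasets.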

Abstract

Effective risk stratification is crucial for managing cervical lesions. This study aimed to develop and evaluate machine learning models to improve the detection of cervical intraepithelial neoplasia (CIN) grade 2 or worse (CIN2+) and grade 3 or worse (CIN3+). This retrospective study included 2,863 participants. We developed three models to predict CIN2+ and CIN3+ status: a logistic regression (Logit.Model), a random forest (RF.Model), and an XGBoost model (XGBoost.Model). The dataset was split into training (60%) and test (40%) sets. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), calibration curves, decision curve analysis (DCA), net reclassification improvement (NRI), and integrated discrimination improvement (IDI). The SHapley Additive exPlanations (SHAP) method was used for model interpretation. XGBoost.Model achieved the highest test set AUCs: 0.735 for CIN2+ and 0.841 for CIN3+. Its AUC for CIN3+ detection was significantly higher than that of RF.Model (P < 0.001). XGBoost.Model also provided significant NRI (13.1% for CIN2+, 28.0% for CIN3+) and IDI (12.1% for CIN2+, 11.0% for CIN3+) over Logit.Model (all P < 0.05). SHAP analysis highlighted key predictive features, such as cytology and specific HPV genotypes, supporting the model’s interpretability. Overall, XGBoost.Model showed superior and consistent performance, with the highest test set AUCs and significant NRI and IDI over the logistic regression model.
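To make the two reclassification metrics in the abstract concrete, the sketch below computes a category-free (continuous) NRI and the IDI for a new model against a reference model. The abstract does not state whether the reported NRI is categorical or continuous, so the continuous form here is an assumption, and the function names and toy probabilities are hypothetical.

```python
def continuous_nri(labels, p_old, p_new):
    """Category-free NRI: net share of events whose predicted risk moves up
    plus net share of non-events whose predicted risk moves down."""
    ev_up = ev_down = ne_up = ne_down = 0
    n_events = n_nonevents = 0
    for y, old, new in zip(labels, p_old, p_new):
        if y == 1:
            n_events += 1
            ev_up += new > old
            ev_down += new < old
        else:
            n_nonevents += 1
            ne_up += new > old
            ne_down += new < old
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("NRI needs both events and non-events")
    return (ev_up - ev_down) / n_events + (ne_down - ne_up) / n_nonevents


def idi(labels, p_old, p_new):
    """IDI: mean risk change among events minus mean risk change
    among non-events."""
    event_changes = [new - old for y, old, new in zip(labels, p_old, p_new) if y == 1]
    nonevent_changes = [new - old for y, old, new in zip(labels, p_old, p_new) if y == 0]
    if not event_changes or not nonevent_changes:
        raise ValueError("IDI needs both events and non-events")
    mean_event = sum(event_changes) / len(event_changes)
    mean_nonevent = sum(nonevent_changes) / len(nonevent_changes)
    return mean_event - mean_nonevent


# Toy example with made-up predicted risks (not study data).
toy_labels = [1, 1, 0, 0]
reference_risk = [0.50, 0.50, 0.50, 0.50]   # stand-in for the reference model
candidate_risk = [0.75, 0.25, 0.25, 0.25]   # stand-in for the new model

toy_nri = continuous_nri(toy_labels, reference_risk, candidate_risk)
toy_idi = idi(toy_labels, reference_risk, candidate_risk)
```

In this toy case the events split evenly between upward and downward moves while both non-events move down, so the whole improvement comes from the non-event side. Significance testing, as reported in the abstract, would need a standard error or bootstrap on top of these point estimates.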
