What question did this study set out to answer?

To develop and validate a radiomics-machine learning model for predicting invasiveness in subcentimeter subsolid lung adenocarcinoma.

March 30, 2026 | Open Access

Radiomics-machine learning model for predicting invasiveness of subcentimeter subsolid lung adenocarcinoma: a validation study with external cohort and SHAP interpretability

Key Points

To develop and validate a radiomics-machine learning model for predicting invasiveness in subcentimeter subsolid lung adenocarcinoma.
The retrospective analysis included 177 patients from one hospital (training and internal validation) and 83 patients from a second hospital (external validation).
Radiomic features were extracted from CT scans of subcentimeter subsolid nodules.
Features were selected using mRMR and LASSO regression.
Three machine learning classifiers were trained and compared: logistic regression, random forest, and support vector machine.
Model performance was evaluated using AUC, sensitivity, specificity, F1 score, calibration, and decision curve analysis.
The logistic regression model performed best, with an AUC of 0.842 in internal validation.
In the external validation cohort, the logistic regression model maintained an AUC of 0.778 (95% CI: 0.673-0.862), supporting its generalizability.
Ten radiomic features predictive of invasiveness were identified, including markers of morphological irregularity and necrosis.
Decision curve analysis indicated greater clinical utility than empirical management strategies.
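The evaluation metrics named in the key points can be illustrated with a minimal sketch. It uses only the Python standard library and toy labels and scores; the function names `auc_score` and `threshold_metrics` and the 0.5 cutoff are assumptions made for illustration and are not taken from the study.

```python
from typing import Dict, Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 score at a fixed probability cutoff."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Toy data (1 = invasive, 0 = pre-invasive); NOT values from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
print(auc_score(toy_labels, toy_scores))
print(threshold_metrics(toy_labels, toy_scores, threshold=0.5))
```

AUC is independent of any cutoff, whereas sensitivity, specificity, and F1 all change with the chosen threshold, which is why the two kinds of metric are usually reported together.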

Abstract

Background: Preoperative discrimination of invasive adenocarcinoma (IAC) from pre-invasive lesions in subcentimeter subsolid nodules (SSNs) remains challenging using conventional computed tomography (CT). We aimed to develop and validate an interpretable radiomics-machine learning (ML) model for predicting invasiveness by leveraging SHapley Additive exPlanations (SHAP).

Methods: In this two-center retrospective study, 177 patients from Hospital 1 (training and internal validation) and 83 patients from Hospital 2 (independent external validation) with surgically confirmed lung adenocarcinoma manifesting as SSNs (≤1 cm) were enrolled. Radiomic features were extracted from preoperative CT using the uAI Research Portal. Following a reproducibility assessment (intraclass correlation coefficient cutoff of 0.75), minimum Redundancy Maximum Relevance (mRMR) and Least Absolute Shrinkage and Selection Operator (LASSO) regression were applied to select the most predictive features. Three ML classifiers, logistic regression (LR), random forest (RF), and support vector machine (SVM), were trained and validated using a 7:3 cohort split, and the best-performing model was further evaluated in the external validation cohort. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, F1 score, calibration, and decision curve analysis (DCA). SHAP analysis was employed to provide global and local model interpretability.

Results: A set of ten radiomic features was selected to predict invasiveness (IAC prevalence: 44.6%). The LR model demonstrated optimal performance during internal validation (AUC: 0.842; sensitivity: 79.2%; specificity: 73.3%; F1 score: 0.745) and exhibited superior generalizability compared to both the RF and SVM models. In the external validation cohort, the LR model maintained robust diagnostic performance, with an AUC of 0.778 (95% CI: 0.673-0.862), confirming its cross-institutional generalizability. The DCA and PRC curves further confirmed its clinical utility and stability across different institutions. SHAP analysis identified waveletHLLglszmLowGrayLevelZoneEmphasis (an indicator of necrosis), original_shape_Flatness (reflecting morphological irregularity), and logfirstorderLoG. Minimum (suggestive of air-trapping) as top predictors of invasiveness. Decision curve analysis confirmed the model's superior clinical utility over empirical management strategies.

Conclusion: The developed radiomics-LR model robustly predicts invasiveness in subcentimeter SSNs and provides biologically plausible explanations through SHAP. Its balanced performance and inherent interpretability support its potential integration into clinical workflow to aid in surgical decision-making.
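Decision curve analysis compares the net benefit of acting on a model's predictions with the default strategies of treating every patient or treating none. The sketch below uses the standard net benefit formula, TP/n − (FP/n) × pt/(1 − pt), on toy data. The function names and the example threshold probabilities are assumptions made for illustration; this is not the authors' code or data.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], pt: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= pt."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (pt / (1.0 - pt))


def net_benefit_treat_all(y_true: Sequence[int], pt: float) -> float:
    """Net benefit of the 'treat everyone' strategy at threshold pt."""
    if not 0.0 < pt < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (pt / (1.0 - pt))


# Toy data (1 = invasive, 0 = pre-invasive); NOT values from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

# The 'treat none' strategy always has a net benefit of 0.
for pt in (0.2, 0.5):
    print(pt, net_benefit(toy_labels, toy_probs, pt), net_benefit_treat_all(toy_labels, pt))
```

A model is considered clinically useful over the range of threshold probabilities where its net benefit exceeds both the treat-all curve and zero.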

Cite this study

Feng et al. (2026) studied this question.

synapsesocial.com/papers/69ca1210883daed6ee094ce6 — DOI: https://doi.org/10.3389/fonc.2026.1668102

Authors

Wenfeng Feng

Ruiting Chang

Tiezhi Li

Journals

Frontiers in Oncology

SHILAP Revista de lepidopterología

Institutions

Hebei Medical University

Second Hospital of Hebei Medical University

Harrison International Peace Hospital
