Accurate metastasis prediction in lung adenocarcinoma is essential for effective treatment planning and improved prognosis. Current methods face challenges in accuracy and clinical application. We developed a binary classification model (XGBoost) and a multiclass classification model (stacking) using data from the SEER database. The binary model predicts the presence of metastasis (M0 vs. non-M0), and the multiclass model further refines the degree of metastasis (M1a, M1b, M1c). Model performance was assessed using ROC AUC, PR AUC, and KS curves. SHAP values were used to identify important features and explain the models' decision-making process. The binary model achieved ROC AUC and PR AUC scores exceeding 0.77, and the KS curve showed consistent separation between positive and negative samples. The multiclass model also performed well, demonstrating stability and generalizability across metastasis stages. Key predictive factors included AJCC stage, survival duration, tumor size, and treatment information. This study improves the accuracy and clinical applicability of metastasis prediction in lung adenocarcinoma through interpretable machine learning models. Combining the binary and multiclass models predicts not only whether metastasis is present but also its extent, providing valuable clinical decision support. Future research should integrate diverse data sources to enhance model robustness and better serve clinical practice.
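The three evaluation metrics named in the abstract can be illustrated with a short, dependency-free sketch. This is not the authors' evaluation code, which is not given here. It assumes labels coded as 1 for non-M0 and 0 for M0, takes PR AUC to mean step-wise average precision (one of several conventions), and takes the KS statistic to be the largest gap between the true-positive and false-positive rates across thresholds. The function names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def cumulative_counts(y_true: Sequence[int],
                      scores: Sequence[float]) -> List[Tuple[int, int]]:
    """Cumulative (true positives, false positives) as the decision threshold
    is lowered through each distinct score; tied scores are consumed together."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    counts: List[Tuple[int, int]] = []
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        while i < len(order) and scores[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        counts.append((tp, fp))
    return counts


def evaluate(y_true: Sequence[int], scores: Sequence[float]) -> Dict[str, float]:
    """Return ROC AUC (trapezoidal), PR AUC (average precision) and the KS statistic."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    roc_auc = pr_auc = ks = 0.0
    prev_tpr = prev_fpr = 0.0
    for tp, fp in cumulative_counts(y_true, scores):
        tpr, fpr = tp / n_pos, fp / n_neg
        # Trapezoid under the ROC curve between consecutive thresholds.
        roc_auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        # Average precision: gain in recall weighted by precision at this threshold.
        pr_auc += (tpr - prev_tpr) * tp / (tp + fp)
        # KS: largest separation between the two cumulative rates.
        ks = max(ks, abs(tpr - fpr))
        prev_tpr, prev_fpr = tpr, fpr
    return {"roc_auc": roc_auc, "pr_auc": pr_auc, "ks": ks}


# Made-up toy data: 1 = non-M0 (metastasis present), 0 = M0 (assumed coding).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_metrics = evaluate(toy_labels, toy_scores)
print(toy_metrics)
```

In practice, `scores` would be the predicted probabilities of the positive class from a fitted classifier, computed on held-out data.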
Jian Xu
Shuo Chen
Chang Zhao
BMC Medical Informatics and Decision Making
Sichuan University
Central South University
University of Electronic Science and Technology of China
Xu et al. studied this question.
www.synapsesocial.com/papers/69a765d3badf0bb9e87da9c8 — DOI: https://doi.org/10.1186/s12911-026-03341-3