A machine-learning classifier using the Illumina protein prep 6K assay achieved a mean AUC of 0.89 and 65% overall sensitivity at 95% specificity for early cancer detection.
Does the Illumina protein prep 6K assay combined with machine learning accurately detect early-stage cancer in plasma samples?
423 plasma samples (217 normal and 206 cancer samples including bladder, breast, gastric, and lung cancers) from Eastern European sources and the United States.
Early-access Illumina protein prep 6K assay combined with a Support Vector Machine (SVM) classifier
Diagnostic performance (AUC, sensitivity, and specificity) for early cancer detection (surrogate)
A machine-learning framework using the Illumina Protein Prep 6K assay demonstrated strong cross-batch and cross-ancestry reproducibility for early cancer detection, achieving a mean AUC of 0.89.
Abstract
Background: Studies in cancer early detection have revealed circulating proteins to be powerful and informative biomarkers. In this study, we sought to evaluate the early-access Illumina protein prep proteomic assay not only for its technical reproducibility but also for its ability to yield biologically informative protein signatures relevant to cancer early detection. We further aimed to develop a machine-learning framework robust to batch effects across independently processed datasets and to identify a subset of proteins capable of detecting cancer at its earliest stages.
Experimental Procedures: Proteomic profiles from six plates were analyzed, encompassing approximately 217 normal and 206 cancer plasma samples. The first four plates contained samples collected from Eastern European sources and were used for model training and feature selection, while the remaining two plates contained samples collected in the United States from populations with diverse ancestry backgrounds and served as an external test set. These two plates included samples from bladder, breast, gastric, and lung cancers, representing diverse biological and technical conditions. Feature selection was performed using the Minimum Redundancy-Maximum Relevance (MRMR) method, which ranks proteins by maximizing mutual information with cancer status while minimizing redundancy. The top 200 proteins were used to train a Support Vector Machine (SVM) classifier with a radial-basis kernel.
Results: The model achieved a mean AUC of 0.89 across the two independent test plates, demonstrating strong cross-batch and cross-ancestry reproducibility and confirming that informative, generalizable protein features can be extracted from the Illumina platform. At 95% specificity, the classifier achieved an overall sensitivity of 65% (95% CI: 51-77%), with particularly strong performance in Stage II cancers at 82% (95% CI: 52-95%), underscoring its potential utility for early detection.
Performance was consistent across cancer types, with the highest sensitivities observed in lung cancer. Importantly, a six-fold leave-one-plate-out cross-validation yielded an average sensitivity of 82% at 99% specificity, suggesting that integrating diverse data sources is likely to strengthen model generalizability.
Conclusions: A machine-learning framework applied to large-scale proteomic data identifies a compact and biologically meaningful subset of proteins capable of early cancer detection. The results highlight the robustness of the Illumina Protein Prep 6K assay and the feasibility of developing batch-insensitive protein classifiers for population-scale cancer screening.
Citation Format: Kamel Lahouel, Mete Mulazimoglu, Kameron Bates, Candice Wike, Kunjur Manasa Upadhyaya, Victoria Zismann, Kamawela Leka, Payton Smith, Gracyn Benck, Kianna Martos Rupp, Chaney Jambor, Matteo Munini, Sophie Pénisson, Stephanie Pond, Jeffrey Trent, Patrick Pirrotte, Cristian Tomasetti. Early cancer detection using early-access Illumina protein prep 6K assay and machine learning [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 7614.
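The abstract reports sensitivity at a fixed specificity together with 95% confidence intervals, but does not say how the decision threshold or the intervals were computed. The sketch below shows one common way to obtain both from classifier scores. It is an illustration under stated assumptions, not the authors' procedure: the function names, the rule for picking the threshold from the normal-sample scores, the choice of the Wilson score interval, and the toy scores are all hypothetical.

```python
import math


def sensitivity_at_specificity(normal_scores, cancer_scores, specificity=0.95):
    """Pick a score threshold from the normal samples so that at least
    `specificity` of them are called negative, then measure sensitivity.

    Assumption: higher score means "more likely cancer", and a sample is
    called positive when its score is strictly above the threshold.
    """
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be in (0, 1]")
    ranked = sorted(normal_scores)
    n = len(ranked)
    # Largest number of normal samples that may be called positive.
    allowed_fp = math.floor((1.0 - specificity) * n + 1e-9)
    # Threshold sits at the (allowed_fp + 1)-th highest normal score.
    threshold = ranked[n - allowed_fp - 1]
    true_pos = sum(1 for s in cancer_scores if s > threshold)
    true_neg = sum(1 for s in normal_scores if s <= threshold)
    return threshold, true_pos / len(cancer_scores), true_neg / n


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total ** 2)) / denom
    return center - half, center + half


# Toy scores, made up for illustration only (not data from the study).
normal_scores = list(range(20))
cancer_scores = [3, 10, 19, 20, 25, 18, 30, 22, 5, 40]

threshold, sens, spec = sensitivity_at_specificity(normal_scores, cancer_scores)
detected = sum(1 for s in cancer_scores if s > threshold)
low, high = wilson_interval(detected, len(cancer_scores))
print(f"threshold={threshold} sensitivity={sens:.2f} specificity={spec:.2f}")
print(f"95% Wilson interval for sensitivity: {low:.2f} to {high:.2f}")
```

In an external-test design like the one described, the threshold would normally be fixed on training data before it is applied to the held-out plates; the sketch picks it from the same normal samples it evaluates, purely to keep the example short.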
Kamel Lahouel
Mete Mulazimoglu
Kameron Bates
Cancer Research
Translational Genomics Research Institute
Lahouel et al. (2026) reported that a machine-learning classifier using the Illumina protein prep 6K assay achieved a mean AUC of 0.89 and 65% overall sensitivity at 95% specificity for early cancer detection.
www.synapsesocial.com/papers/69d1fd29a79560c99a0a302f — DOI: https://doi.org/10.1158/1538-7445.am2026-7614