What does this research mean for the field?

A machine learning model using Ki-67, histologic grade, and progesterone receptor status moderately predicts high Oncotype DX® Recurrence Scores (RS ≥ 25) in patients with ER+ breast cancer. Novelty: confirmatory. Consensus alignment: supports consensus.

What question did this study set out to answer?

The study aims to evaluate the accuracy of machine learning models based on clinicopathologic data in predicting high Oncotype DX® recurrence scores.

February 19, 2026

Abstract PS3-08-12: Machine Learning Prediction of High Oncotype DX® RS based on Clinicopathologic features: Results from the GBECAM 0520 Multicenter Retrospective Study

Key Points

The study aims to evaluate the accuracy of machine learning models based on clinicopathologic data in predicting high Oncotype DX® recurrence scores.
Retrospective cohort assembly of 897 ER+ breast cancer patients with Oncotype DX results.
Statistical analysis to identify predictive variables such as Ki-67, PR status, and histologic grade.
Development of a ridge regression model to estimate Recurrence Score values.
Performance measured using area under the curve (AUC) and Pearson correlation.
Leave-one-center-out cross-validation for robustness.
The model achieved a mean Pearson correlation of 0.50 ± 0.03 between predicted and actual recurrence scores.
The model achieved an AUC of 0.78 ± 0.03 for detecting high-risk tumors (Recurrence Score ≥ 25).
Ki-67 emerged as the strongest positive predictor of high recurrence scores.
Histologic grade was positively correlated with recurrence score, while PR status was inversely correlated.
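The key points above (min-max scaled predictors, a ridge regression on Ki-67, grade, and PR status, leave-one-center-out validation, Pearson correlation and AUC) can be sketched as follows. This is a minimal, dependency-free illustration and not the study's code: the data are synthetic, the generating coefficients and the penalty `alpha` are invented for the example, and every function name is hypothetical.

```python
import random

def minmax_fit(rows):
    """Per-feature (min, max), learned from training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def minmax_apply(rows, bounds):
    """Scale features with the training bounds (constant features map to 0)."""
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, (lo, hi) in zip(row, bounds)] for row in rows]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge: (Z'Z + alpha*I) w = Z'y; the intercept is not penalized."""
    Z = [row + [1.0] for row in X]          # last column is the intercept
    d = len(Z[0])
    A = [[sum(z[i] * z[j] for z in Z) for j in range(d)] for i in range(d)]
    for i in range(d - 1):
        A[i][i] += alpha
    b = [sum(z[i] * t for z, t in zip(Z, y)) for i in range(d)]
    return solve(A, b)

def ridge_predict(X, w):
    return [sum(v * wi for v, wi in zip(row + [1.0], w)) for row in X]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def auc(scores, labels):
    """Probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def leave_one_center_out(X, y, centers, threshold=25, alpha=1.0):
    """For each center: train on the others, then score the held-out center."""
    results = {}
    for held in sorted(set(centers)):
        tr = [i for i, c in enumerate(centers) if c != held]
        te = [i for i, c in enumerate(centers) if c == held]
        bounds = minmax_fit([X[i] for i in tr])
        w = ridge_fit(minmax_apply([X[i] for i in tr], bounds),
                      [y[i] for i in tr], alpha)
        pred = ridge_predict(minmax_apply([X[i] for i in te], bounds), w)
        truth = [y[i] for i in te]
        high = [t >= threshold for t in truth]   # high-risk label, RS >= 25
        results[held] = (pearson(pred, truth), auc(pred, high))
    return results

# Synthetic cohort: 7 centers, invented effect sizes (NOT the study's data).
rng = random.Random(0)
X, y, centers = [], [], []
for i in range(700):
    ki67 = rng.uniform(1, 60)        # Ki-67 (%)
    grade = rng.choice([1, 2, 3])    # histologic grade
    pr_pos = rng.choice([0, 1])      # PR status (1 = positive)
    rs = 8 + 0.3 * ki67 + 3 * grade - 5 * pr_pos + rng.gauss(0, 6)
    X.append([ki67, grade, pr_pos])
    y.append(rs)
    centers.append(i % 7)

folds = leave_one_center_out(X, y, centers)
mean_r = sum(r for r, _ in folds.values()) / len(folds)
mean_auc = sum(a for _, a in folds.values()) / len(folds)
print(f"mean Pearson r = {mean_r:.2f}, mean AUC = {mean_auc:.2f}")
```

`folds` maps each held-out center to its (Pearson r, AUC) pair, and averaging across folds mirrors how the summary reports performance. Fitting the scaling bounds on the training centers only keeps information from the held-out center out of the model.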

Abstract

Background: The 21-gene Oncotype DX® assay is a validated prognostic and predictive tool for assessing chemotherapy (CT) benefit in patients with estrogen receptor–positive (ER+), HER2-negative (HER2–) early-stage breast cancer. It has demonstrated clinical utility in node-negative (N0) patients and postmenopausal patients with N1 disease. However, the financial cost of the test may significantly limit its accessibility to patients. The present study evaluates whether machine learning models based on routinely available clinicopathologic and immunohistochemical data can accurately predict a high likelihood of a Recurrence Score (RS) greater than 25, possibly providing support for clinical decision-making.

Methods: We retrospectively assembled a cohort of 897 patients with ER+ breast cancer (BC) diagnosed between 2005 and 2024 at seven cancer centers. All patients had Oncotype DX® results and a complete pathology report. Of these, 158 (18%) had RS > 25. Statistical analysis identified Ki-67 (%), progesterone receptor (PR) status, and histologic grade as the most predictive variables. These were included in a ridge regression model to estimate continuous RS values (Table 1). Discriminative performance for RS > 25 was measured by AUC. To test robustness, we performed leave-one-center-out cross-validation: training on six centers and testing on the seventh. Performance was averaged across folds.

Results: Among the 897 patients included, 708 (78.9%) had N0 disease, 62 (6.9%) N1mic, and 118 (13.2%) N1 disease. Nodal information was unavailable for 9 patients (1.0%). The median age was 54 years (range, 24–83). Oncotype DX® testing indicated low (< 11), intermediate (11–25), and high (> 25) genomic risk in 165 (18.4%), 574 (64.0%), and 158 (17.6%) of patients, respectively. The model achieved a mean Pearson correlation of 0.50 ± 0.03 between predicted and actual RS, and an AUC of 0.78 ± 0.03 for detecting high-risk (RS ≥ 25) tumors. AUC stratified by menopausal status showed no significant difference (p = 0.858), indicating similar performance across groups. After min–max scaling, Ki-67 was the strongest positive predictor, followed by histologic grade; conversely, PR status was inversely correlated with RS. These findings are consistent with established tumor biology and previous studies, in which a high proliferative index, poor histologic differentiation, and low immunohistochemical PR expression are associated with more aggressive tumor characteristics.

Conclusion: A simple machine learning model using only Ki-67, histologic grade, and PR status moderately predicts the Oncotype DX® score and accurately identifies high-risk cases. In resource-limited settings, this approach could support clinical decision-making. Prospective validation is needed to confirm its clinical utility.

Table 1: Significant variables according to Recurrence Score (RS) categories (RS ≤ 25 and RS > 25). Values are expressed as mean ± standard deviation for the continuous variable (Ki-67) and as number (percentage) for all categorical variables.

Citation Format: J. Araujo, D. Araujo, L. Oliveira, P. de Souza, D. Rosa, A. Katz, D. Suzuki, D. Argolo, S. Sanches, L. Testa, J. Bines, R. Kaliks, R. Sousa, T. Corrêa, A. Shimada, C. dos Anjos, R. Linck, T. Megid, D. Batista, D. Gomes, M. Cesca, D. Gaudêncio, L. Moura, R. Bonadio, Z. Souza, J. Beal, M. Lopes, L. Sales, J. Marlière, M. Mano, D. Gagliato. Machine Learning Prediction of High Oncotype DX® RS based on Clinicopathologic features: Results from the GBECAM 0520 Multicenter Retrospective Study [abstract]. In: Proceedings of the San Antonio Breast Cancer Symposium 2025; 2025 Dec 9-12; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2026;32(4 Suppl):Abstract nr PS3-08-12.

Authors

João Luiz Vitorino Araujo

D. Araújo

L. Oliveira

Journals

Clinical Cancer Research

Institutions

Universidade de São Paulo

Hospital Israelita Albert Einstein

AC Camargo Hospital
