Breast cancer remains the most significant contributor to morbidity and mortality among women in China. Despite advances in imaging and molecular testing, few reliable biomarkers exist for early detection and disease characterization. Identifying new marker genes related to breast carcinogenesis could greatly improve diagnostic accuracy and potentially influence treatment decisions. In this study, machine learning algorithms were implemented in the R programming environment to evaluate three publicly available breast cancer datasets from the Gene Expression Omnibus (GEO) database. We screened differentially expressed genes and then selected the best feature genes using a machine learning-based feature selection model. Finally, we experimentally validated these genes using quantitative polymerase chain reaction (qPCR), Western blotting, and immunohistochemistry (IHC). By intersecting the top 10 signature genes from each dataset, we identified two consistently diagnostic gene candidates: S100P and COL10A1. Both genes showed significantly higher expression in breast cancer tissues than in normal controls across all experimental validations. Our results suggest that S100P and COL10A1 may be appropriate adjunct molecular biomarkers for earlier and more accurate breast cancer diagnosis, and could be especially helpful in cases with indeterminate morphological features to improve detection rates and decrease cancer related.
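The intersection step described in the abstract (taking the top 10 signature genes from each dataset and keeping those common to all) can be illustrated with a minimal sketch. It is written in Python for illustration only, whereas the study itself used R. The function name `consensus_genes`, the dataset labels, and the `GENE_*` placeholders are hypothetical, and the toy rankings are not the study's actual feature rankings.

```python
from functools import reduce


def consensus_genes(rankings, top_n=10):
    """Return genes found in the top_n of every ranking, sorted alphabetically.

    rankings: dict mapping a dataset label to a list of gene symbols,
    ordered from most to least important.
    """
    top_sets = [set(genes[:top_n]) for genes in rankings.values()]
    if not top_sets:
        return []
    return sorted(reduce(set.intersection, top_sets))


# Hypothetical rankings for illustration only (not data from the study).
toy_rankings = {
    "dataset_1": ["S100P", "GENE_A", "COL10A1", "GENE_B"],
    "dataset_2": ["GENE_C", "COL10A1", "S100P", "GENE_A"],
    "dataset_3": ["COL10A1", "GENE_D", "S100P", "GENE_E"],
}

print(consensus_genes(toy_rankings))
```

A gene survives only if every dataset ranks it within the cutoff, so lowering `top_n` makes the consensus stricter.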
Xin Chen
Wenyang Pang
Yuanfan Wang
International Journal of Computational Intelligence Systems
Taizhou Municipal Hospital
Chen et al. studied this question.
www.synapsesocial.com/papers/69d893626c1944d70ce045c3 — DOI: https://doi.org/10.1007/s44196-026-01224-z