Machine learning-enhanced ultra-deep sequencing detected breast cancer ctDNA mutations with a median LoD of 0.15-0.25% and 83.0% PPA in clinical samples.
Does machine learning-enhanced ultra-deep sequencing accurately detect low-abundance ctDNA in breast cancer patients?
Blood samples from 51 cancer-free donors and 44 clinical samples from breast cancer patients.
Machine learning-enhanced ultra-deep targeted sequencing with UMI-tagging for ctDNA detection
Analytical performance including Limit of Detection (LoD), Limit of Blank (LoB), Positive Percent Agreement (PPA), and Negative Percent Agreement (NPA)
Machine learning-enhanced ultra-deep sequencing provides high accuracy and precision for detecting low-abundance ctDNA in breast cancer, offering a non-invasive diagnostic tool.
Abstract: Breast cancer is a major global health challenge and remains one of the most common cancers among women. Although traditional tissue biopsies provide valuable diagnostic insights, their invasive nature makes them unsuitable for long-term monitoring. Liquid biopsy, which analyzes circulating tumor DNA (ctDNA) in blood samples, offers a minimally invasive alternative with growing potential in breast cancer care. Recent studies have demonstrated that detecting specific mutations in ctDNA, such as those in ESR1, can guide treatment decisions, significantly reducing disease progression and mortality. Furthermore, ctDNA monitoring can accurately predict long-term outcomes, identifying patients at low risk of relapse. These findings highlight the importance of liquid biopsy in guiding personalized treatment strategies and improving patient outcomes, even in challenging low-tumor-burden scenarios. To establish an ultra-deep targeted sequencing assay for low-abundance ctDNA detection, we collected blood samples from non-cancer controls and breast cancer patients. An advanced unique molecular identifier (UMI) tagging protocol was applied to process 20-60 ng of plasma cell-free DNA (cfDNA) and 100 ng of white blood cell (WBC) genomic DNA, enabling targeted enrichment of 168 cancer-related genes through designed panels. Library sequencing was conducted on the NovaSeq 6000 platform, producing a raw depth of 35,000X and a mean unique molecular depth of 4,000X. A multi-step data processing pipeline was implemented to eliminate technical noise and normalize biological heterogeneity. First, reads with identical UMIs and endpoint positions were collapsed and merged to generate high-accuracy consensus sequences, effectively removing random errors introduced during PCR amplification and sequencing. Building on this, a machine-learning-driven noise model was developed to systematically identify and filter recurrent artifacts, including DNA damage patterns and sequencer-specific error signatures.
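The UMI collapsing step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `collapse_umi_families`, the tuple layout of a read, the `min_family_size` threshold, and the simple majority-vote rule are all assumptions made for illustration. Reads that share a UMI and both endpoint positions are grouped into a family, and each position of the consensus takes the base seen in more than half of the family's reads.

```python
from collections import Counter, defaultdict


def collapse_umi_families(reads, min_family_size=2):
    """Collapse reads into consensus sequences by majority vote.

    reads: iterable of (umi, start, end, sequence) tuples (hypothetical layout).
    Reads with the same UMI and the same endpoints form one family.
    Families smaller than min_family_size (an assumed threshold) are dropped.
    """
    families = defaultdict(list)
    for umi, start, end, seq in reads:
        families[(umi, start, end)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        bases = []
        # Identical endpoints are assumed to imply identical read length,
        # so zip() walks the family column by column.
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            # Keep the base only if a strict majority supports it;
            # otherwise mask the position as "N".
            bases.append(base if count > len(seqs) / 2 else "N")
        consensus[key] = "".join(bases)
    return consensus


example_reads = [
    ("AAC", 100, 104, "ACGT"),
    ("AAC", 100, 104, "ACGT"),
    ("AAC", 100, 104, "ACTT"),  # one read carries a random error at position 3
    ("GGT", 100, 104, "ACGT"),  # singleton family, dropped
]
print(collapse_umi_families(example_reads))
```

In this toy input the isolated error in the third read is outvoted by the other two family members, which is the intuition behind removing random PCR and sequencing errors. Recurrent artifacts survive majority voting, which is why the abstract describes a separate noise model for them.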
Additionally, paired WBC sequencing was incorporated into the workflow to distinguish somatic mutations from non-cancer biological aberrations, such as clonal hematopoiesis-derived variants, ensuring specificity in detecting tumor-associated alterations. To evaluate analytical performance, ground-truth reference materials were used, with subsequent confirmation through a dilution series across different tumor allele fraction (TAF) levels. The limit of detection (LoD) was assessed across different variant classes at defined input masses. For panel-wide SNVs, the median LoD was 0.25% with a 20 ng input and decreased to 0.15% with a 50 ng input. For panel-wide indels, the median LoD improved from 0.25% at a 20 ng input to 0.14% at a 50 ng input. Additionally, the limit of blank (LoB) was defined from 51 cancer-free donors with paired WBC controls, resulting in an exceptionally low error rate of 4.88 × 10⁻⁷ for both SNVs and indels across the entire gene panel. This panel covers the whole coding regions of key breast cancer-associated genes, such as PTEN, ESR1, PIK3CA, BRCA1, BRCA2, and PALB2. To further validate precision and accuracy, a combined evaluation was performed on 44 clinical samples derived from breast cancer patients with an expanded panel. This analysis showed a positive percent agreement (PPA) of 83.0% for panel-wide SNVs and 85.7% for panel-wide indels, along with a negative percent agreement (NPA) of 99.9%. In this study, we demonstrated the accuracy and consistency of advanced ultra-deep mutation sequencing technologies. Combined with machine-learning tools, this approach offers a practical, non-invasive diagnostic tool for mutation profiling, applicable even when tumor levels are low. Citation Format: X. Li, Y. Ni, S. Zhang, C. Yang, G. Li, Y. Zhang, X. Chen, X. Yang, J. Su, B. Li. Machine learning-enhanced ultra-deep sequencing for low-abundance circulating tumor DNA (ctDNA) in breast cancer [abstract].
In: Proceedings of the San Antonio Breast Cancer Symposium 2025; 2025 Dec 9-12; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2026;32(4 Suppl):Abstract nr PS4-02-27.
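For readers unfamiliar with the agreement metrics reported in the abstract, the sketch below shows how PPA and NPA are conventionally computed from a comparison against a reference method. The function name `percent_agreement` and the counts in the example are hypothetical; the abstract does not report the underlying counts, and the numbers here are not taken from the study.

```python
def percent_agreement(true_pos, false_neg, true_neg, false_pos):
    """Return (PPA, NPA) in percent.

    PPA = TP / (TP + FN): share of reference-positive calls the assay also detects.
    NPA = TN / (TN + FP): share of reference-negative calls the assay also reports as negative.
    """
    ppa = 100.0 * true_pos / (true_pos + false_neg)
    npa = 100.0 * true_neg / (true_neg + false_pos)
    return ppa, npa


# Hypothetical counts chosen only to illustrate the arithmetic.
ppa, npa = percent_agreement(true_pos=45, false_neg=5, true_neg=990, false_pos=10)
print(f"PPA = {ppa:.1f}%, NPA = {npa:.1f}%")  # PPA = 90.0%, NPA = 99.0%
```

Because negative sites typically outnumber positive variants by a wide margin in panel sequencing, NPA tends to sit close to 100% and PPA is usually the more informative of the two figures.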
X. Li
Yuwei Ni
S. Zhang
Clinical Cancer Research
Burning Rock Biotech (China)
Li et al. reported that machine learning-enhanced ultra-deep sequencing detected breast cancer ctDNA mutations with a median LoD of 0.15-0.25% and 83.0% PPA in clinical samples.
www.synapsesocial.com/papers/6996a898ecb39a600b3ef76e — DOI: https://doi.org/10.1158/1557-3265.sabcs25-ps4-02-27