Abstract

Introduction: Cigarette smoking (CS) is associated with adverse health effects, including an increased risk of cancer development. Epigenome-wide association studies (EWAS) show that CS significantly alters DNA methylation (DNAm). While several DNAm-based predictors of smoking history have been trained using blood samples, their accuracy in other tissue types remains limited. We aimed to develop a predictive model for smoking history that integrates DNAm data with reference-based cell composition estimates in human tissue samples.

Methods: We used Infinium MethylationEPIC array data from the Genotype-Tissue Expression (GTEx) project for training and testing, and the Clinical Proteomic Tumor Analysis Consortium Lung Adenocarcinoma (LUAD) cohort for external validation. Quality control (QC) and beta value extraction were performed with ENmix. Cell type proportions (CPs) were estimated with HiTIMED. Surrogate variables (SVs) were derived from GTEx to account for batch effects. An EWAS was conducted to identify differentially methylated CpGs (DMCs), adjusting for tissue type, sex, age, CPs, and SVs. GTEx was split into training and testing sets (70:30). DMCs, CPs, sex, age, and tissue type were used to train an Elastic Net model predicting smoking history (smoker vs. non-smoker). Model accuracy was assessed by the area under the receiver operating characteristic curve (AUC), comparing reported smoking status with predicted probabilities. LUAD normal samples were classified into 3 groups based on model predictions for survival analysis with Cox proportional hazards regression (coxph).

Results: After QC, 654 GTEx normal samples and 183 normal adjacent and 199 primary tumor LUAD samples were retained. The 198 DMCs were enriched for cellular response to xenobiotic stimulus pathways. The model selected 83 features at min lambda, including 79 CpG sites, age, NK and CD8 CPs, and breast tissue type. cg21566642 (Lnc-ECEL1-1) overlapped with 3 published blood predictors; cg07339236 and cg11554391 each overlapped with one. The ASH-MARCC model showed high internal performance (training AUC 0.98, 95% CI 0.97-0.99; testing AUC 0.91, 95% CI 0.88-0.95). In external validation, the model maintained high accuracy on normal samples (AUC 0.81, 95% CI 0.75-0.87), while performance decreased but remained informative on tumor samples (AUC 0.66, 95% CI 0.58-0.74). The highest-risk group showed an elevated but not statistically significant hazard relative to the lowest-risk group (adjusted HR 2.1, 95% CI 0.98-4.56).

Conclusions: We developed ASH-MARCC, a DNA methylation-based model to assess smoking history in human tissue. An objective and accurate smoking classification can enhance clinical decision-making across cancer types by supporting treatment personalization and prognostic precision. By enabling direct evaluation of smoking exposure in cancer tissues, especially when self-reported information is unavailable, ASH-MARCC offers a powerful tool with applications in both research and clinical settings.

Citation Format: Minghui Zhang, Brock C. Christensen, Lucas A. Salas Diaz. ASH-MARCC: Assessment of smoking history via methylation and reference-based cell composition in human tissue samples [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 2312.
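The evaluation step described in the Methods (a 70:30 split followed by an AUC comparing reported smoking status with predicted probabilities) can be illustrated with a minimal sketch. The abstract does not say which software computed the AUC or whether the split was stratified, so everything below is an assumption for illustration: the function names `split_70_30` and `roc_auc` are hypothetical, the split is a plain random shuffle, and the labels and scores are toy values rather than study data.

```python
import random


def split_70_30(sample_ids, seed=0):
    """Shuffle sample IDs and return (train, test) in a 70:30 ratio.

    Assumption: a simple unstratified random split; the abstract does not
    describe how the GTEx split was drawn.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(0.7 * len(ids))
    return ids[:cut], ids[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen smoker (label 1)
    receives a higher score than a randomly chosen non-smoker (label 0),
    counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values (hypothetical, not from the study).
reported = [1, 1, 1, 0, 0, 0]                # 1 = smoker, 0 = non-smoker
predicted = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]   # model probabilities

auc = roc_auc(reported, predicted)
print(f"AUC = {auc:.3f}")  # prints "AUC = 0.889" (8 of 9 pairs ranked correctly)

# 654 is the number of GTEx samples retained after QC in the abstract.
train_ids, test_ids = split_70_30(range(654))
print(len(train_ids), len(test_ids))  # prints "458 196"
```

The pairwise formulation is quadratic in sample size, which is fine at this scale. It also makes the interpretation of the reported values concrete: an AUC of 0.81 means a randomly chosen smoker is ranked above a randomly chosen non-smoker about 81% of the time.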
www.synapsesocial.com/papers/69d1fcd4a79560c99a0a28c5 — DOI: https://doi.org/10.1158/1538-7445.am2026-2312
Minghui Zhang
Brock C. Christensen
Lucas A. Salas Diaz
Cancer Research
Dartmouth College