Abstract Epigenetic alterations, particularly DNA methylation, play a crucial role in the progression of oral leukoplakia (OL) to oral squamous cell carcinoma (OSCC). However, the molecular mechanisms driving this transition remain poorly understood. Using interpretable machine learning (IML) on genome-wide methylation data from 118 samples (22 OL, 74 OSCC, and 22 controls), we identified 20 key CpG sites among 820,193 loci through SHAP (SHapley Additive exPlanations) analysis. Notably, cg19853638, cg25393842, cg01743793, and cg10784570 mapped to pivotal genes such as TNFRSF19, ALOX5, and SH3PXD2A, which regulate cell morphology, inflammatory pathways, and immune responses, all critical processes influencing OSCC malignancy and progression. To assess the generalizability and robustness of the classifier, the predictive model was validated on an independent Taiwanese cohort (GSE38532) profiled on a different array platform, where it achieved 98.8% accuracy and an ROC–AUC of 0.999, indicating strong cross-population performance. Furthermore, cross-omics integration with an independent transcriptomic dataset (GSE31056) identified eight genes, including ALOX5, FOXP1, and VTI1A, that showed consistent methylation and expression patterns, underscoring their biological relevance. Our findings highlight the functional relevance of SH3PXD2A, TNFRSF19, and ALOX5 in OSCC pathophysiology: SH3PXD2A mediates cell migration and invasion, TNFRSF19 is involved in survival signaling, and ALOX5 regulates inflammatory responses. These multi-layered analyses provide novel insights into the epigenetic mechanisms underlying OL-to-OSCC progression and identify candidate biomarkers with strong translational potential. By combining IML-based methylation modeling with external and cross-omics validation, this study advances the development of reliable, interpretable biomarkers for precision oral cancer diagnostics and management.
Yadav et al. studied this question.
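The SHAP-based selection of key CpG sites mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the underlying classifier or the SHAP explainer, so this example substitutes a simple linear model, for which Shapley values have a closed form under the assumption of independent features: the contribution of feature j for a sample is w_j * (x_j - mean_j). The CpG names, beta values, and weights below are hypothetical toy inputs, not data or results from the study, and the helper names `linear_shap_values` and `rank_features` are illustrative.

```python
from statistics import mean


def linear_shap_values(weights, samples):
    """Closed-form Shapley values for a linear model f(x) = b + sum_j w_j * x_j.

    Assumes independent features, in which case the contribution of
    feature j for a sample is w_j * (x_j - mean_j).
    """
    n_features = len(weights)
    feature_means = [mean(row[j] for row in samples) for j in range(n_features)]
    return [
        [weights[j] * (row[j] - feature_means[j]) for j in range(n_features)]
        for row in samples
    ]


def rank_features(names, shap_values, top_k):
    """Rank features by mean absolute SHAP value, largest first."""
    importance = {
        name: mean(abs(row[j]) for row in shap_values)
        for j, name in enumerate(names)
    }
    return sorted(importance, key=importance.get, reverse=True)[:top_k]


# Hypothetical toy data: methylation beta values (0..1) for four made-up sites.
cpg_names = ["cpg_a", "cpg_b", "cpg_c", "cpg_d"]
beta_values = [
    [0.10, 0.50, 0.80, 0.30],
    [0.90, 0.52, 0.20, 0.31],
    [0.15, 0.49, 0.75, 0.29],
    [0.85, 0.51, 0.25, 0.30],
]
weights = [2.0, 0.5, -1.0, 3.0]  # hypothetical model coefficients

shap_values = linear_shap_values(weights, beta_values)
top_sites = rank_features(cpg_names, shap_values, top_k=2)
print(top_sites)  # ['cpg_a', 'cpg_c']
```

In this toy case `cpg_d` has the largest weight but barely varies across samples, so it ranks below `cpg_a` and `cpg_c`. This is the general reason a SHAP-style ranking can differ from ranking by model coefficients alone. For a non-linear classifier, the closed form would be replaced by a model-appropriate SHAP explainer, while the ranking step by mean absolute value would stay the same.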