Familial hemiplegic migraine (FHM) is a rare migraine subtype characterised by transient unilateral motor weakness. Although familial forms are associated with variants in the CACNA1A, ATP1A2, and SCN1A genes, many cases remain genetically unexplained, suggesting contributions from additional rare variation, including single-nucleotide variants (SNVs) and copy-number variants (CNVs).

Whole-exome sequencing (WES) data from 182 FHM cases and 1035 controls were analysed. Rare SNVs (allele frequency, AF, < 0.01) were prioritised using pathogenicity annotation and ACMG criteria, while high-confidence CNVs were identified using GATK-gCNV. Elastic Net logistic regression (GLMNet), Extreme Gradient Boosting (XGBoost), and a probabilistic ensemble were used to predict FHM status. Model robustness was assessed through feature-label permutation testing following extensive batch effect minimisation and population stratification correction. Feature importance was evaluated using permutation importance, regression coefficients, and XGBoost gains.

Permutation-based testing demonstrated that all observed performance metrics lay far outside the null distributions, with no permuted model achieving equivalent performance (empirical p = 0.005), suggesting that batch effects and sequencing platform-specific differences had been successfully minimised. SNV-only models showed limited discrimination (GLMNet AUC 0.626; XGBoost AUC 0.533) and low sensitivity for cases. Incorporation of CNVs markedly improved performance, with the GLMNet, XGBoost, and ensemble models achieving high accuracy (AUC ∼0.95–0.97). Feature importance analyses consistently identified variants distributed across multiple loci, all of which contributed to FHM prediction.

SNVs alone provide limited sensitivity for FHM prediction, whereas integration of CNVs yields robust and highly discriminative models. Permutation analyses confirm that performance is not attributable to chance, and the convergence of important features across linear and nonlinear models points to a shared contribution from multiple loci.
These findings underscore the importance of structural variation in FHM and demonstrate the value of integrative machine-learning approaches for rare neurological disorders.

• First large-scale integration of SNVs and CNVs in FHM.
• Machine learning achieves high accuracy (AUC ∼0.95–0.97).
• Predictive signal distributed across multiple genomic loci, not driven by any single CNV interval.
• Robust performance across splits and feature ablation tests. Feature ablation across 10 top features shows minimal AUC decline, confirming model robustness.
• Provides a framework for rare disease genomic prediction.
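The permutation result quoted in the abstract (no permuted model matching the observed performance, empirical p = 0.005) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' pipeline: the function names `auc`, `empirical_p`, and `permutation_test` are hypothetical, the add-one estimator (b + 1) / (n + 1) is one common convention, and the default of 199 permutations is chosen only because it yields 0.005 when no permuted value reaches the observed one. The source does not state the number of permutations used. As a simplification, the sketch shuffles labels against fixed scores; a fuller version would refit the model on each permuted label set.

```python
import random


def auc(scores, labels):
    """ROC AUC via the pairwise (Mann-Whitney) identity; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def empirical_p(observed, null_values):
    """Add-one empirical p-value: (b + 1) / (n + 1), where b counts
    null values at least as large as the observed statistic."""
    b = sum(1 for v in null_values if v >= observed)
    return (b + 1) / (len(null_values) + 1)


def permutation_test(scores, labels, n_perm=199, seed=0):
    """Shuffle labels n_perm times and compare the observed AUC with the
    resulting null distribution. n_perm=199 is an illustrative assumption."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real score-label association
        null.append(auc(scores, shuffled))
    return observed, empirical_p(observed, null)


if __name__ == "__main__":
    # Toy data (not from the study): 20 "cases" scored above 20 "controls".
    toy_scores = [100 + i for i in range(20)] + list(range(20))
    toy_labels = [1] * 20 + [0] * 20
    obs, p = permutation_test(toy_scores, toy_labels)
    print(f"observed AUC = {obs:.3f}, empirical p = {p:.3f}")
```

Under this estimator the smallest attainable p-value is 1 / (n + 1), so a reported floor such as 0.005 reflects the number of permutations run as much as the strength of the signal.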
Mohammed M. Alfayyadh
Neven Maksemous
Heidi G. Sutherland
Computers in Biology and Medicine
Queensland University of Technology
Alfayyadh et al. (Fri,) studied this question.
www.synapsesocial.com/papers/69eefcaefede9185760d3a67 — DOI: https://doi.org/10.1016/j.compbiomed.2026.111699