Machine learning models for ECG-based dysglycemia detection demonstrated high predictive performance (AUC 0.78-0.99), but ~88% of studies lacked external validation, limiting translational readiness.
Systematic Review (n=17)
Do machine learning approaches using ECG accurately detect dysglycemia, and are they ready for clinical translation?
While machine learning models for ECG-based dysglycemia detection show high predictive performance, their clinical translation is currently limited by a lack of external validation and standardized methodologies.
This structured critical review provides a comparative and analytically grounded overview of machine learning (ML) approaches for electrocardiography (ECG)-based detection of dysglycemia, with a specific focus on translational readiness for clinical screening. A structured literature search across PubMed, Scopus, Web of Science, and IEEE Xplore identified 183 records, of which 17 studies were included following predefined screening criteria and PRISMA-guided selection principles. The included studies demonstrate substantial heterogeneity in dataset size (ranging from 25,000 subjects), ECG acquisition modalities (single-lead, 12-lead, wearable), feature representations (raw signals, heart rate variability, engineered features), and ML strategies (classical algorithms, deep learning, and multimodal models). Reported model performance is generally high, with accuracy values frequently exceeding 0.85 and area under the curve (AUC) ranging from 0.78 to 0.99. Smaller experimental studies often report inflated performance (up to 96–99% accuracy), whereas large-scale population-based investigations demonstrate more moderate but clinically plausible results (AUC ≈ 0.80–0.85). External validation, a key requirement for clinical applicability, was performed in only a limited subset of studies (approximately 12%). From a physiological perspective, ML models exploit ECG alterations associated with dysglycemia, including reduced heart rate variability, QT interval prolongation, and changes in ventricular depolarization and repolarization dynamics. However, the relationship between metabolic dysfunction and ECG signals remains indirect. A key finding of this review is the mismatch between reported predictive performance and translational readiness. The majority of studies (≈65–70%) are classified as early-stage (Level 1–2 or 2–3), relying on small, single-center datasets and internal validation. 
Only a minority of studies achieve near-translational maturity (Level 4), characterized by large-scale datasets and external validation. ECG-based dysglycemia detection represents a promising non-invasive and scalable screening paradigm. However, its clinical translation is constrained by the lack of standardized ECG acquisition protocols, limited dataset diversity, insufficient external validation, and fragmented methodological approaches. Future research should prioritize large multi-center datasets, standardized feature extraction pipelines, hybrid interpretable models, and prospective validation to enable robust, generalizable, and clinically deployable screening systems.
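The abstract names heart rate variability (HRV) among the feature representations used by the included studies, without specifying which HRV metrics they computed. As a minimal sketch, assuming the two standard time-domain metrics SDNN and RMSSD, the example below derives them from a list of RR intervals. The function names and the sample RR values are hypothetical and do not come from any included study.

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of RR intervals, in milliseconds."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("SDNN needs at least two RR intervals")
    mean_rr = sum(rr_ms) / n
    variance = sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1)
    return math.sqrt(variance)


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("RMSSD needs at least two RR intervals")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def hrv_features(rr_ms):
    """Collect simple time-domain HRV features into a dict (illustrative only)."""
    return {
        "mean_rr_ms": sum(rr_ms) / len(rr_ms),
        "sdnn_ms": sdnn(rr_ms),
        "rmssd_ms": rmssd(rr_ms),
    }


# Hypothetical RR intervals in milliseconds, invented for illustration.
rr_example = [800, 810, 790, 805, 795]
features = hrv_features(rr_example)
print(features)
```

Lower SDNN and RMSSD values correspond to the reduced heart rate variability that the abstract lists among the ECG alterations associated with dysglycemia. A feature vector of this kind could feed a classical classifier, whereas the deep learning approaches mentioned would more typically consume the raw signal.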
Alimbayeva et al. conducted a systematic review of ECG-based dysglycemia detection (n=17 studies). Machine learning models were evaluated on predictive performance (accuracy and AUC) and translational readiness.
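The abstract reports that approximately 12% of the 17 included studies performed external validation, while the summary reports ~88% without it. The short check below shows that the two percentages are consistent with each other. It assumes a count of 2 externally validated studies, which is inferred from the rounded percentage and is not stated in the source.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


N_STUDIES = 17             # number of included studies, from the review
EXTERNALLY_VALIDATED = 2   # assumption: inferred from "approximately 12%"

validated_pct = share(EXTERNALLY_VALIDATED, N_STUDIES)
lacking_pct = share(N_STUDIES - EXTERNALLY_VALIDATED, N_STUDIES)

print(f"externally validated: {validated_pct:.1f}%")
print(f"lacking external validation: {lacking_pct:.1f}%")
```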