S4ECG multi-window analysis improved macro-averaged AUROC by 1.0–11.6 percentage points and increased AF detection specificity to 0.967–0.998 compared to single-window approaches.
Does multi-window temporal analysis using S4ECG improve arrhythmia classification accuracy compared to single-window approaches?
Multi-window temporal analysis using structured state-space models consistently improves arrhythmia classification accuracy and cross-dataset robustness compared to conventional single-window approaches.
Effect estimate: 1.0–11.6 percentage-point improvement in macro-averaged AUROC (multi-window vs. single-window)
Abstract
Objective. Arrhythmia classification from electrocardiograms (ECGs) suffers from high false positive rates and limited cross-dataset generalization, particularly for atrial fibrillation (AF) detection where specificity ranges from 0.72 to 0.98 using conventional 30 s analysis windows. While conventional deep learning approaches analyze isolated 30 s ECG windows, many arrhythmias, particularly AF and atrial flutter, exhibit diagnostic features that emerge over extended time scales.
Approach. We introduce S4ECG, a deep learning architecture based on structured state-space models (S4), designed to capture long-range temporal dependencies by jointly analyzing multiple consecutive ECG windows spanning up to 2 min. We evaluated S4ECG on four publicly available databases for multi-class arrhythmia classification, including systematic cross-dataset evaluations to assess out-of-distribution robustness.
Main results. Multi-window analysis consistently outperformed single-window approaches across all datasets, improving the macro-averaged area under the receiver operating characteristic curve by 1.0–11.6 percentage points. For AF detection specifically, specificity increased from 0.718–0.979 (single-window) to 0.967–0.998 (multi-window) at a fixed sensitivity threshold, representing a 3–10-fold reduction in false positive rates.
Significance. Comparative analysis against convolutional neural network baselines demonstrated superior performance of the S4 architecture. Cross-dataset evaluation revealed that multi-window approaches substantially improved generalization performance, with smaller performance degradation when models were tested on held-out datasets from different institutions and acquisition protocols. A systematic investigation revealed optimal diagnostic windows of 10–20 min, beyond which performance plateaus or degrades.
These findings demonstrate that structured incorporation of extended temporal context enhances both arrhythmia classification accuracy and cross-dataset robustness. The identified optimal temporal windows provide practical guidance for ECG monitoring system design and may reflect underlying physiological timescales of arrhythmogenic dynamics.
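The relationship between the reported specificities and the false positive rate can be made concrete with a short sketch. It is illustrative only and is not the authors' evaluation code. The function names, the default sensitivity target of 0.9, and the pairing of the lower and upper ends of the reported specificity ranges are all assumptions, since the abstract states neither the sensitivity value nor the per-dataset pairs.

```python
import numpy as np


def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.9):
    """Return (specificity, threshold) at the highest threshold that
    still reaches the target sensitivity.

    target_sensitivity=0.9 is an assumed default; the abstract only says
    that a fixed sensitivity threshold was used.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos_scores = np.sort(scores[y_true])[::-1]  # positives, descending
    neg_scores = scores[~y_true]
    if pos_scores.size == 0 or neg_scores.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    # Number of positives that must be detected to reach the target.
    k = int(np.ceil(target_sensitivity * pos_scores.size - 1e-9))
    k = max(k, 1)
    threshold = pos_scores[k - 1]  # predict positive when score >= threshold
    specificity = float(np.mean(neg_scores < threshold))
    return specificity, float(threshold)


def fpr_fold_reduction(spec_before, spec_after):
    """Fold reduction in false positive rate, using FPR = 1 - specificity."""
    return (1.0 - spec_before) / (1.0 - spec_after)


if __name__ == "__main__":
    # Pairing range endpoints is an assumption made for illustration only;
    # the abstract does not say which values belong to the same dataset.
    for before, after in [(0.718, 0.967), (0.979, 0.998)]:
        fold = fpr_fold_reduction(before, after)
        print(f"specificity {before} -> {after}: FPR reduced {fold:.1f}-fold")
```

Because the false positive rate is the complement of specificity, small gains near a specificity of 1 translate into large relative reductions in false alarms, which is why the abstract reports the change as a fold reduction.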
Wang et al. compared S4ECG (multi-window temporal analysis) with single-window approaches and convolutional neural network baselines for arrhythmia classification. The primary outcome was the macro-averaged area under the receiver operating characteristic curve (AUROC), which multi-window analysis improved by 1.0–11.6 percentage points over single-window analysis.
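For readers unfamiliar with the headline metric, the following is a minimal sketch of how a macro-averaged AUROC and a percentage-point difference between two models could be computed. It assumes one-vs-rest scoring with one-hot labels and uses the rank-sum formulation of AUROC. The function names are hypothetical, and the paper's actual evaluation pipeline may differ.

```python
import numpy as np


def binary_auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores share their average rank, so ties count as half-correct.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Average 1-based rank of every score, with ties sharing a rank.
    _, inverse, counts = np.unique(
        scores, return_inverse=True, return_counts=True
    )
    last_rank = np.cumsum(counts)               # last rank in each tie group
    avg_rank = last_rank - (counts - 1) / 2.0   # mean rank of each tie group
    ranks = avg_rank[inverse]
    u_stat = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


def macro_auroc(y_onehot, class_scores):
    """Unweighted mean of the per-class one-vs-rest AUROCs."""
    y_onehot = np.asarray(y_onehot)
    class_scores = np.asarray(class_scores, dtype=float)
    per_class = [
        binary_auroc(y_onehot[:, c], class_scores[:, c])
        for c in range(y_onehot.shape[1])
    ]
    return float(np.mean(per_class))


def percentage_point_gain(auroc_single, auroc_multi):
    """Absolute AUROC difference expressed in percentage points."""
    return 100.0 * (auroc_multi - auroc_single)
```

Macro-averaging weights every arrhythmia class equally regardless of prevalence, so a gain quoted in percentage points is an absolute difference between two such averages and not a relative change.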