What question did this study set out to answer?

This study asked whether a learnable PSG channel encoding can improve the accuracy of automated sleep scoring.

May 10, 2026

0363 Improving Automated Sleep Staging with Deep Learning: Learnable PSG Channel Encoding

Key Points

Analyzed data from 543 subjects across seven independent datasets using 5-fold cross-validation.
Developed a learnable 2D PSG positional encoding to represent spatial relationships among EEG signals.
Compared three models: no encoding, fixed channel encoding, and learnable 2D channel encoding.
The learnable 2D channel encoding achieved an average accuracy of 82.84%±2.08%, outperforming no-encoding (81.6%±1.79%) and fixed-encoding (82.35%±1.53%).
Significant improvements were seen in six out of seven datasets when comparing the learnable encoding to no encoding.
The learnable encoding also significantly outperformed the fixed encoding on four of the seven datasets.
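
The abstract specifies that the 5-fold cross-validation was done at the subject level to prevent data leakage. A minimal sketch of such a split is shown here, assuming each sample carries a subject identifier. The function name `subject_level_folds`, the round-robin fold assignment, and the toy subject IDs are illustrative assumptions, not details from the study.

```python
import random


def subject_level_folds(subject_ids, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject spans both sides.

    Hypothetical helper: the study does not describe how folds were assigned.
    """
    unique_subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_subjects)
    # Deal shuffled subjects round-robin into folds.
    fold_of = {s: i % n_folds for i, s in enumerate(unique_subjects)}
    for k in range(n_folds):
        test = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        yield train, test


# Toy data (assumed): 10 subjects with 3 samples each.
subject_ids = [f"S{n:02d}" for n in range(10) for _ in range(3)]
folds = list(subject_level_folds(subject_ids))
```

Splitting by subject rather than by sample keeps all data from one person on the same side of the split, so the test score reflects performance on unseen subjects.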

Abstract

Introduction

Sleep scoring is essential for clinical sleep diagnostics, and recent advances in deep learning have accelerated and standardized this process. Transformer-based models such as SleepTransformer have achieved state-of-the-art performance, and our previous work, FlexSleepTransformer, further improved scoring accuracy and cross-dataset generalizability by fusing information from multiple PSG channels. However, these methods rely on fixed channel encodings that fail to capture the spatial organization of PSG electrodes, limiting their ability to model inter-channel relationships effectively. To address this limitation, we introduced a learnable 2D PSG channel encoding that explicitly represents spatial structure and integrated it into FlexSleepTransformer, leading to improved performance across multiple datasets.

Methods

A total of 543 subjects from seven independently acquired datasets were included. For each dataset, subject-level 5-fold cross-validation was performed to prevent data leakage. The baseline model followed the two-level sequence-to-sequence SleepTransformer architecture, which processed intra-epoch information and inter-epoch temporal context in a manner consistent with human scoring guidelines. Because the datasets differed in their PSG channel configurations, the model required a way to recognize each channel’s spatial origin. Traditional fixed channel encodings identified separate channels but failed to capture spatial relationships. To overcome this limitation, we introduced a learnable 2D PSG positional encoding that allowed the model to autonomously learn spatially informed representations of signals from different brain regions. Three models were evaluated across all datasets: (1) no channel encoding, (2) fixed channel encoding, and (3) the proposed learnable 2D channel encoding.
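
One way to picture a learnable 2D channel encoding is as two trainable lookup tables, one per grid axis, whose entries are summed for each electrode position. The sketch below illustrates that idea only; the abstract does not give the actual parameterization. The grid coordinates in `CHANNEL_GRID`, the class name, the additive row-plus-column form, and the embedding size are all assumptions, and NumPy arrays stand in for parameters that a deep learning framework would update during training.

```python
import numpy as np

# Assumed coarse scalp grid: (row, col) = (front-to-back, left-to-right).
# These coordinates are illustrative, not taken from the study.
CHANNEL_GRID = {
    "F3": (0, 0), "F4": (0, 2),
    "C3": (1, 0), "C4": (1, 2),
    "O1": (2, 0), "O2": (2, 2),
}


class Learnable2DChannelEncoding:
    """Hypothetical additive row + column encoding for PSG channels."""

    def __init__(self, n_rows=3, n_cols=3, d_model=8, seed=0):
        rng = np.random.default_rng(seed)
        # Stand-ins for trainable parameters; training would update these.
        self.row_table = rng.normal(0.0, 0.02, size=(n_rows, d_model))
        self.col_table = rng.normal(0.0, 0.02, size=(n_cols, d_model))

    def encode(self, channel_names):
        # Look up each channel's grid position, then sum the axis embeddings.
        rows, cols = zip(*(CHANNEL_GRID[name] for name in channel_names))
        return self.row_table[list(rows)] + self.col_table[list(cols)]

    def __call__(self, features, channel_names):
        # features has shape (n_channels, d_model); the encoding is added.
        return features + self.encode(channel_names)


enc = Learnable2DChannelEncoding()
channels = ["C3", "C4", "O1"]  # any subset of channels a dataset provides
features = np.zeros((len(channels), 8))
encoded = enc(features, channels)
```

In this form, channels on the same grid row share a row embedding and channels in the same column share a column embedding, so spatial neighbors start from related representations. Because channels are looked up by name, the same tables can serve datasets with different channel configurations.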
Results

Across all seven datasets, the proposed learnable 2D channel encoding achieved the highest average accuracy (82.84%±2.08%), outperforming both the no-encoding model (81.6%±1.79%) and the fixed-encoding model (82.35%±1.53%). Statistical comparisons further showed that the learnable encoding significantly outperformed the no-encoding baseline on six datasets and outperformed the fixed encoding on four datasets, demonstrating its consistent advantage across diverse data sources.

Conclusion

We introduced a learnable 2D EEG channel encoding for Transformer-based sleep staging and successfully incorporated it into FlexSleepTransformer. Results across seven datasets demonstrated consistent and significant performance improvements, highlighting the strong potential of this approach for deployment in real clinical workflows.

Support (if any)

None
