What question did this study set out to answer?

The aim is to enhance the accuracy of colorectal polyp segmentation in challenging endoscopic images.

March 30, 2026

Waveformer: Dual-Branch Adaptive Network with Wavelet-Guided Cross-Context Decoding for Colorectal Polyp Segmentation

Key Points

Developed a dual-branch adaptive network for feature extraction.
Utilized wavelet-based frequency decomposition for enhanced edge responses.
Implemented camouflage identification and information fusion for detailed semantic aggregation.
Conducted experiments on CVC-ClinicDB and Kvasir-SEG datasets.
Achieved Dice Similarity Coefficients (DSC) of 95.60% and 94.11% on the CVC-ClinicDB and Kvasir-SEG datasets, respectively.
Outperformed fourteen state-of-the-art methods in segmentation accuracy.
Demonstrated strong generalization ability with cross-dataset evaluations yielding DSC scores of 81.0% and 79.2%.
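
The reported scores all use the Dice Similarity Coefficient, which for a predicted mask P and a ground-truth mask G is 2|P ∩ G| / (|P| + |G|). The snippet below is a minimal sketch of that standard definition on binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the study.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")

    # Pixels marked as polyp in both masks.
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x3 masks: 2 overlapping pixels, 3 foreground pixels in each mask.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```

A DSC of 95.60% corresponds to a value of 0.9560 from this function, typically averaged over all test images.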

Abstract

With the advancement of deep learning, polyp segmentation in endoscopic images has achieved remarkable progress. However, clinical polyps often exhibit variable morphology, blurred boundaries, and low contrast with the intestinal mucosa, which hinders accurate lesion localization and edge delineation. Moreover, low light, luminal distortion, and mucosal folds make identification harder still, resulting in frequent misdetections and omissions in computer-aided diagnosis. Accordingly, we propose Waveformer, a local-global co-modeling segmentation network, to improve segmentation accuracy. Concretely, the encoder employs parallel CNN-Transformer branches to extract detailed and global features jointly, thereby enhancing the completeness and discriminative power of the representation. The decoder integrates a wavelet-based frequency decomposition unit (WFDU), a camouflage identification module (CIM), and an information fusion layer (IFL). Together, these modules enhance edge responses and semantic aggregation across scales, significantly boosting the framework's capability in boundary modeling and lesion discernment. In extensive experiments on the CVC-ClinicDB and Kvasir-SEG datasets, Waveformer achieves Dice Similarity Coefficients (DSC) of 95.60% and 94.11%, respectively, outperforming fourteen state-of-the-art (SOTA) methods. Cross-dataset evaluations further verify its strong generalization ability, with DSC scores of 81.0% and 79.2%.
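
The abstract does not describe the internals of the WFDU, so the following is only a generic illustration of wavelet-based frequency decomposition, not the authors' module. It sketches a single-level 2D Haar transform in NumPy, splitting an image into one low-frequency band and three high-frequency detail bands where edges respond. The function name, the choice of the Haar wavelet, and the sub-band labels are all assumptions, and sub-band naming conventions vary between libraries.

```python
import numpy as np


def haar_dwt2(image):
    """Single-level 2D orthonormal Haar transform of an even-sized 2D array.

    Returns (ll, lh, hl, hh), each half the height and width of the input.
    Hypothetical helper for illustration only.
    """
    x = np.asarray(image, dtype=float)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    ll = (a + b + c + d) / 2.0  # local average: coarse, low-frequency content
    lh = (a + b - c - d) / 2.0  # top minus bottom: responds to horizontal edges
    hl = (a - b + c - d) / 2.0  # left minus right: responds to vertical edges
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


# A 2x2 patch with a vertical edge between its two columns.
patch = np.array([[0.0, 1.0],
                  [0.0, 1.0]])
ll, lh, hl, hh = haar_dwt2(patch)  # only ll and hl are non-zero here
```

In a flat region all three detail bands are zero, while a boundary inside a block produces a non-zero detail coefficient, which is the general property that makes high-frequency sub-bands useful for emphasizing edges.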
