Accurate segmentation of colorectal polyps in colonoscopy images is crucial for early prevention and computer-aided diagnosis of colorectal cancer, yet large variations in polyp appearance, low polyp-mucosa contrast, and device-related imaging discrepancies still hinder robust performance, especially for small and flat lesions and cross-dataset generalization. To address these challenges, we propose a Dual-Encoder Global–Local Joint Feature Aggregation Network (DEGF-Net) that enhances feature fusion and improves generalization. DEGF-Net adopts a dual-encoder architecture that separately models long-range global context and fine-grained local textures. A Global Joint Feature Fusion Module (GFFM) employs global attention to align and aggregate high-level features from both branches into a unified representation, while an Upper-Lower Level Feature Fusion Module (UL-FM) performs residual multi-scale cross-layer fusion in the decoder to narrow the semantic gap between high-level semantics and low-level details and refine polyp boundaries. In addition, a multi-output hybrid loss is applied to the final and intermediate predictions to leverage deep supervision, accelerate convergence, and improve robustness. Experiments on two benchmark colonoscopy datasets, Kvasir-SEG and CVC-ClinicDB, show that under a unified setting, DEGF-Net achieves mean Dice scores of 0.933 and 0.958, respectively, surpassing recent CNN-based, Transformer-based, and hybrid architectures and exhibiting strong cross-dataset generalization. These results indicate that DEGF-Net can substantially improve automatic polyp segmentation and provide a promising technical basis for computer-aided colorectal cancer screening.

• A novel CNN-Transformer dual-encoder framework is proposed for colorectal polyp segmentation.
• A global joint feature fusion module explicitly aligns high-level CNN and Transformer semantics.
• A residual cross-scale fusion strategy bridges the semantic gap between global context and fine details.
• The proposed method achieves Dice scores of 0.933 and 0.958 on Kvasir-SEG and CVC-ClinicDB.
• Strong cross-dataset and cross-domain generalization is demonstrated on retinal and cell datasets.
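
To make the reported metric and the deep-supervision idea concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say which terms make up the "hybrid" loss, so the pairing of binary cross-entropy with a soft Dice loss is an assumption. The function names (`dice_score`, `soft_dice_loss`, `bce_loss`, `multi_output_hybrid_loss`) and the per-output `weights` are hypothetical as well. Masks and predictions are represented as flat sequences of per-pixel values.

```python
import math


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two flat binary masks (sequences of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return (2.0 * inter + eps) / (total + eps)


def soft_dice_loss(probs, target, eps=1e-7):
    """1 - Dice, computed on predicted probabilities rather than hard masks."""
    inter = sum(p * t for p, t in zip(probs, target))
    total = sum(probs) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def bce_loss(probs, target, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(target)


def multi_output_hybrid_loss(outputs, target, weights=None):
    """Weighted sum of (BCE + soft Dice) over the final and intermediate outputs.

    ASSUMPTION: the BCE + Dice combination and the weights are illustrative;
    the source only says a multi-output hybrid loss is used.
    """
    if weights is None:
        weights = [1.0] * len(outputs)
    return sum(
        w * (bce_loss(o, target) + soft_dice_loss(o, target))
        for w, o in zip(weights, outputs)
    )


# Toy 4-pixel example: one foreground pixel, a sharp final prediction and a
# blurrier intermediate prediction (e.g. from an earlier decoder stage).
target = [1, 0, 0, 0]
final_pred = [0.9, 0.1, 0.1, 0.1]
intermediate_pred = [0.6, 0.4, 0.4, 0.4]
loss = multi_output_hybrid_loss([final_pred, intermediate_pred], target)
```

In a real pipeline, intermediate predictions would be upsampled to the ground-truth resolution before the loss is computed, and the same functions would operate on tensors rather than Python lists.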
Yu et al. studied this question.
www.synapsesocial.com/papers/69a91cbed6127c7a504bfbae — DOI: https://doi.org/10.1016/j.bspc.2026.110023
He Yu
Jinming Guo
Xiaorui Cao
Biomedical Signal Processing and Control
James Cook University
Changchun University of Science and Technology
Changchun University