In practical underwater object detection tasks, imbalanced sample distributions and the scarcity of samples for certain classes often lead to insufficient model training and limited generalization capability. To address these challenges, this paper proposes FS2-DETR (Few-Shot Detection Transformer for Sonar Images), a transformer-based few-shot object detection network tailored for sonar imagery. Because sonar images generally contain weak, small, and blurred object features, and data scarcity in some classes can hinder effective feature learning, FS2-DETR introduces the following improvements over the baseline DETR model. (1) Feature Enhancement Compensation Mechanism: A decoder-prediction-guided feature resampling module (DPGFRM) is designed to process the multi-scale features and subsequently enhance the memory representations, thereby strengthening the exploitation of key features and improving detection performance for weak and small objects. (2) Visual Prompt Enhancement Mechanism: Discriminative visual prompts are generated to jointly enhance object queries and memory, thereby highlighting distinctive image features and enabling more effective feature capture for few-shot objects. (3) Multi-Stage Training Strategy: A progressive training strategy is adopted to strengthen the learning of class-specific layers, mitigating misclassification in few-shot scenarios and enhancing overall detection accuracy. Extensive experiments on the improved UATD sonar image dataset demonstrate that FS2-DETR achieves superior detection accuracy and robustness under few-shot conditions, outperforming existing state-of-the-art detection algorithms.
Shibo Yang
Xiaoyu Zhang
Panlong Tan
Journal of Marine Science and Engineering
Nankai University
Tianjin Haihe Hospital
www.synapsesocial.com/papers/698586388f7c464f2300a230 — DOI: https://doi.org/10.3390/jmse14030304