Accurate fruit detection is a key component of precision agriculture applications such as crop yield estimation, orchard management, and intelligent harvesting. When immature fruits visually resemble the background, or when varieties differ significantly, traditional models often generalize poorly, which reduces detection accuracy and makes predictions unstable. To address this problem, this paper proposes MSRRT-DETR, a fruit detection model that balances high accuracy, real-time performance, and strong generalization capability. To improve accuracy and robustness in complex orchard environments, MSRRT-DETR introduces three major enhancements to the RT-DETR framework: a Multi-Scale Convolutional Attention Module (MSBlock) to enhance feature representation at different scales; a Spatial and Channel Synergistic Attention Module (SCSA) to improve object focus and discriminative capability; and a Re-parameterized Feature Pyramid Network (RepGFPN) to achieve efficient multi-scale feature fusion. Experimental results show that MSRRT-DETR achieves a mAP50 of 87.3% on the self-constructed TSApple dataset, outperforming the mainstream lightweight models YOLOv8, YOLO11, and YOLO12 by 2.0–7.9 percentage points, exceeding two-stage detectors including Faster R-CNN, Mask R-CNN, and Cascade R-CNN by 5.1–8.6 percentage points, and surpassing the RT-DETR series by 1.1–2.6 percentage points. With an inference speed of 30.2 FPS, comparable to the YOLO series, MSRRT-DETR balances accuracy and real-time performance. In addition, MSRRT-DETR demonstrates cross-domain generalization on four public datasets including MinneApple, supporting its applicability across diverse scenarios and fruit varieties.
MSRRT-DETR combines high recognition accuracy, fast inference, and strong cross-domain generalization, fully meeting the requirements of fruit detection in complex agricultural scenarios. The model provides robust technical support for intelligent monitoring and automated orchard management in precision agriculture, and holds significant practical value and broad potential for application.
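For readers unfamiliar with the mAP50 metric reported above: it is mean average precision computed with an intersection-over-union (IoU) threshold of 0.5, so a predicted box counts as a correct detection only if it overlaps a ground-truth box with IoU of at least 0.5. The sketch below illustrates only that matching criterion. It is not the paper's evaluation code; the function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` corner format for boxes are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) corner tuples; this format is an
    assumption for the example, not taken from the paper.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: does a prediction match a ground-truth box
    under the IoU >= 0.5 criterion used by mAP50?"""
    return iou(pred_box, gt_box) >= threshold
```

A full mAP50 computation would additionally rank predictions by confidence, match each ground-truth box at most once, and average precision over recall levels and classes; those steps are omitted here.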
Xinyu Zhang
Sawut Mamat
Xiaohuang Liu
PLoS ONE
Zhang et al. (Fri,) studied this question.
www.synapsesocial.com/papers/69b5ff4f83145bc643d1b8ca — DOI: https://doi.org/10.1371/journal.pone.0342854