Grapes are a high-value crop, but as cultivation expands, manual harvesting has become inefficient and costly because of labor shortages and weather constraints. Automated harvesting depends on accurate, lightweight image segmentation for reliable visual perception, so improving segmentation precision, robustness, and model compactness is critical for intelligent grape harvesting. To make segmentation more robust in complex orchard environments, this study introduces a multimodal fusion and multi-scale enhancement strategy and develops a lightweight instance segmentation network. Using a multimodal grape dataset containing RGB, near-infrared (NIR), and depth information, a multi-resolution training scheme based on an image-pyramid framework was constructed. Of the three YOLOv11-based fusion strategies compared, early fusion performed best. Accordingly, the lightweight model YOLOv11-WFD was designed by integrating FasterNeXt, DySample, and WaveletPool to strengthen feature extraction, adaptive sampling, and small-object perception. Experimental results show that YOLOv11-WFD achieves an mAP@50:95 of 79.3% on the validation set with only 2.25 M parameters, combining high segmentation accuracy with a model size suited to deployment. Compared with YOLOv3-tiny, YOLOv5n, YOLOv8n, YOLOv10n, YOLOv11n, and YOLOv12n, YOLOv11-WFD improves mAP@50:95 by 25.4, 3.0, 2.7, 2.8, 2.0, and 3.1 percentage points, respectively, while reducing parameters by 80.4%, 7.8%, 23.5%, 10.7%, 20.8%, and 18.8%, respectively. Overall, YOLOv11-WFD balances accuracy, speed, and complexity, which supports the effectiveness of the multimodal fusion and lightweight integration strategy. It shows strong potential for practical, large-scale deployment in complex agricultural environments such as intelligent grape harvesting.
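The comparison above mixes two kinds of figures: mAP gains in percentage points (absolute differences) and parameter reductions in percent (relative differences). The short Python sketch below backs out the baseline values these figures imply, as a way of reading the numbers rather than as reported results. It assumes each reduction is relative to the baseline's own parameter count; the function name `implied_baseline` is hypothetical, and the outputs are subject to rounding in the quoted figures, so they may differ slightly from the values tabulated in the paper.

```python
# Figures quoted in the abstract for YOLOv11-WFD.
WFD_MAP = 79.3        # mAP@50:95 on the validation set, in percent
WFD_PARAMS_M = 2.25   # parameter count, in millions

# Per baseline: (mAP gain in percentage points, parameter reduction in percent),
# copied from the abstract.
COMPARISONS = {
    "YOLOv3-tiny": (25.4, 80.4),
    "YOLOv5n": (3.0, 7.8),
    "YOLOv8n": (2.7, 23.5),
    "YOLOv10n": (2.8, 10.7),
    "YOLOv11n": (2.0, 20.8),
    "YOLOv12n": (3.1, 18.8),
}


def implied_baseline(gain_pp, reduction_pct):
    """Back out a baseline's implied mAP (percent) and parameters (millions).

    Assumption: the gain is an absolute difference in percentage points, and
    the reduction is relative to the baseline's own parameter count.
    """
    base_map = WFD_MAP - gain_pp
    base_params = WFD_PARAMS_M / (1.0 - reduction_pct / 100.0)
    return round(base_map, 1), round(base_params, 2)


for name, (gain, reduction) in COMPARISONS.items():
    base_map, base_params = implied_baseline(gain, reduction)
    print(f"{name}: implied mAP@50:95 = {base_map}%, "
          f"implied parameters = {base_params} M")
```

Read this way, a gain of 2.0 percentage points over YOLOv11n is an absolute difference in mAP@50:95, not a 2.0% relative improvement.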
Pengyan Wang
Chengshuai Li
Linjing Wei
Agronomy
Gansu Agricultural University
Wang et al. studied this question.
www.synapsesocial.com/papers/69d893896c1944d70ce048ef — DOI: https://doi.org/10.3390/agronomy16070679