To advance automated tomato leaf disease detection in precision agriculture, this study addresses three challenges that arise in complex field environments: variable lesion scales, background interference, and deployment constraints. We propose MSP-Net, a task-driven detection framework that integrates three targeted architectural refinements. First, a Multi-Scale Perception Convolution Module (MSPCM) captures diverse disease features across early-to-late infection stages. Second, SimAM-enhanced C3k2 layers suppress background noise and focus on fine-grained lesion cues. Third, a Multi-Scale Feature Enhancement Module (MSFEM) bridges the semantic gap between shallow and deep features to improve fusion efficacy. Furthermore, we construct a lightweight variant, L-MSP-Net, through architectural migration and structured pruning for efficient edge deployment. Experimental results on the real-world Tomato-Village dataset show that MSP-Net achieves 92.0% mAP@0.5, outperforming the YOLOv11s baseline by 2.0%. L-MSP-Net attains 86.1% mAP@0.5, a 3.6% improvement over the lightweight YOLOv11n baseline with 10.5% fewer parameters, and is successfully deployed on the RK3588 edge platform. Additional cross-dataset experiments on PASCAL VOC and MS COCO evaluate how well the proposed architectural refinements transfer to generic object detection tasks.
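The SimAM attention mentioned above is parameter-free, and its gating step can be illustrated in a few lines. The following is a minimal sketch of the generally published SimAM formulation applied to a single feature-map channel, not the MSP-Net implementation: the function name `simam_channel`, the pure-Python list representation, and the regularization value `lam=1e-4` are assumptions made for illustration, and the abstract does not specify how the gate is wired into the C3k2 layers.

```python
import math


def simam_channel(channel, lam=1e-4):
    """Apply a SimAM-style gate to one 2D channel given as a list of rows.

    Hypothetical helper for illustration only; `lam` is an assumed
    regularization constant, not a value taken from the paper.
    """
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" activations in the channel
    if n < 1:
        raise ValueError("channel needs at least two activations")

    mean = sum(values) / len(values)
    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(channel, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Inverse energy: larger for activations that stand out
            # from the rest of the channel.
            inv_energy = d / (4.0 * (variance + lam)) + 0.5
            gate = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            out_row.append(v * gate)
        out.append(out_row)
    return out
```

In this sketch, an activation that deviates strongly from its channel mean (for example, a small lesion against a uniform leaf background) receives a gate closer to 1, while activations in a flat region are all scaled by the same smaller factor.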
Kang et al. studied this question.