Conventional models struggle to detect vehicle appearance components in complex scenes because of their limited small-target recognition capability and suboptimal fusion of multi-scale features. To address these issues, we propose an enhanced vehicle appearance segmentation model based on the YOLOv11-seg framework. Central to our approach is the MCALayerPlus module, designed to process targets across a wide range of scales concurrently. By extracting features at multiple scales, the model suppresses false detections arising from cluttered backgrounds. Furthermore, we incorporate an improved ShapeIoU loss function that integrates a size-sensitivity factor and a category-aware shape penalty term. This loss sharpens shape-matching precision, helps the model capture fine-grained feature representations, and accelerates convergence. Experimental results on a specialized automotive dataset demonstrate state-of-the-art performance: a mean Average Precision at an IoU threshold of 0.5 (mAP@0.5) of 94.09%, an mAP@0.5:0.95 of 77.12%, a precision of 91.31%, and a recall of 90.75%. Notably, the model remains lightweight (5.75 MB) and runs at 45.3 FPS, making it suitable for real-time deployment in intelligent transportation systems.
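To make the two reported mAP figures easier to interpret, the following minimal sketch shows how a single prediction is scored against the IoU thresholds behind mAP@0.5 and mAP@0.5:0.95. It is an illustration only, not the evaluation code used in this work. It assumes axis-aligned boxes in `(x1, y1, x2, y2)` form and the common COCO-style threshold grid from 0.50 to 0.95 in steps of 0.05; for segmentation the same logic would apply to mask IoU. The names `box_iou`, `COCO_THRESHOLDS`, and `thresholds_passed` are hypothetical.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style grid: 0.50, 0.55, ..., 0.95 (ten thresholds).
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt):
    """Thresholds at which `pred` would count as a match for `gt`."""
    iou = box_iou(pred, gt)
    return [t for t in COCO_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (2, 0, 12, 10)  # same size, shifted 2 units to the right
    print(box_iou(pred, gt))
    print(thresholds_passed(pred, gt))
```

A prediction such as the one above counts as a match at the 0.5 threshold but fails the stricter ones, which is why mAP@0.5:0.95, an average over all ten thresholds, is typically lower than mAP@0.5.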
Huang et al. studied this question.