To address the lack of dedicated datasets for infrared detection of small UAVs in air-to-air scenarios, this paper first constructs the SIM-AIR dataset, which covers complex scenarios, and then proposes YOLO-KMM, an efficient YOLOv11-based object detection model tailored to the dataset's small-target characteristics and deployment requirements. Collected by a UAV equipped with an infrared thermal imager, the SIM-AIR dataset consists of 3,993 precisely annotated images across four weather conditions: sunny, cloudy, snowy, and hazy. Of its targets, 99.7% are ultra-small objects less than 40 pixels wide, and the average target size is 11.2×6.6 pixels. The dataset includes complex scenarios such as "dark targets" in snowy weather and low signal-to-noise ratio (SNR) in haze, which simulate real-world detection challenges. To tackle the issues of sparse small-target features and strong background interference, YOLO-KMM integrates the C2KD feature enhancement module and the C3K2-MU lightweight detection head, forming a dual-optimized architecture of "feature enhancement - efficient detection": the C2KD module captures weak small-target features and suppresses noise via cross-scale fusion and attention mechanisms, while the C3K2-MU module adopts grouped convolution and depthwise separable convolution to reduce the number of parameters while preserving feature representation capability. Experiments on the SIM-AIR dataset show that YOLO-KMM achieves an mAP₅₀ of 88.2%, which is 7.8 percentage points higher than the baseline YOLOv11, with a precision of 94.0% and a recall of 74.3%. It reduces the small-target missed detection rate by 12.5% while maintaining an inference speed of 246.18 FPS with 2.3M parameters and 5.4 GFLOPs of computation. Compared with YOLOv5/8/12, the model achieves a better balance among accuracy, speed, and complexity. These results verify the practicality and difficulty of the SIM-AIR dataset and provide an efficient solution for air-to-air small-target infrared detection.
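The parameter savings attributed to grouped and depthwise separable convolution can be illustrated with a small weight-count calculation. The following is a minimal sketch, not the authors' C3K2-MU implementation: the helper names `conv_params` and `depthwise_separable_params` and the layer shape (64 input channels, 128 output channels, 3×3 kernel, 4 groups) are assumptions chosen only for illustration, and biases are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a k x k 2-D convolution split into `groups` groups (no bias)."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one k x k filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # 1 x 1 channel mixing
    return depthwise + pointwise


# Hypothetical layer shape, not taken from the paper.
C_IN, C_OUT, K, GROUPS = 64, 128, 3, 4

standard = conv_params(C_IN, C_OUT, K)
grouped = conv_params(C_IN, C_OUT, K, groups=GROUPS)
separable = depthwise_separable_params(C_IN, C_OUT, K)

print(f"standard conv:            {standard} weights")
print(f"grouped conv (g={GROUPS}):       {grouped} weights")
print(f"depthwise separable conv: {separable} weights")
```

For this assumed shape, grouping divides the weight count by the number of groups, and the depthwise separable variant is smaller still, which is the general mechanism behind the reduced parameter count described above.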
Fu et al. (Fri,) studied this question.