As wind farms expand and demand for automated operation and maintenance grows, the efficiency and accuracy of intelligent inspection systems for switching stations have become critical to the safety and stability of power facilities. However, the intelligent inspection trolleys deployed in these settings often suffer from low instrument detection accuracy and limited robustness, primarily because of environmental interference from fog, variable lighting, and image noise. This paper proposes a multi-module real-time object detection model, termed HDC-RTDETR (HSAN + DBlockC3 + CGAFusion + RT-DETR). The model follows the intelligent inspection principle that “clear visibility precedes efficient inspection”, and its primary objective is reliable instrument identification under fog, changing lighting conditions, or image noise. Specifically, building on the RT-DETR architecture, we introduce three targeted enhancements: (1) the HSAN module adaptively fuses grayscale, edge, and color features, improving robustness against composite degradations (e.g., fog, illumination variations, noise) by enhancing target responses while suppressing background clutter; (2) the DBlockC3 module captures and integrates multi-scale contextual information, refining the discrimination of fine-grained instrument details under complex lighting; and (3) the CGAFusion module strengthens hierarchical feature integration within the encoder, mitigating fog-induced blurring. Experiments on a custom dataset show that the proposed model achieves a mAP@50 of 95.566% (an improvement of 3.390 percentage points) and a precision of 90.557% (an increase of 11.20 percentage points). On an Industrial Instrument Needle Dataset, it attains a mAP@50 of 98.130% (+2.242%) and a precision of 95.130% (+4.269%).
In addition, we validated edge deployment on the Jetson AGX Orin, where the model runs inference at 16.5 FPS, which meets the near-real-time video stream processing requirements of many application scenarios. These results indicate that HDC-RTDETR offers superior detection performance and environmental adaptability in complex industrial scenarios, providing a reliable localization basis for subsequent instrument reading extraction.
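To put the reported figures in context, the short sketch below derives the values they imply: the reference scores on the custom dataset and the per-frame time budget at 16.5 FPS. It is only arithmetic on numbers quoted above. The function names are illustrative, and the reference model is assumed to be the one the gains were measured against (presumably the unmodified RT-DETR, which the abstract does not state explicitly).

```python
# Figures quoted in the abstract (custom dataset, Jetson AGX Orin).
MAP50 = 95.566             # mAP@50, percent
MAP50_GAIN_PP = 3.390      # gain, percentage points
PRECISION = 90.557         # precision, percent
PRECISION_GAIN_PP = 11.20  # gain, percentage points
FPS = 16.5                 # frames per second on the edge device


def implied_baseline(value_pct: float, gain_pp: float) -> float:
    """Reference score implied by a reported score and its absolute gain."""
    return round(value_pct - gain_pp, 3)


def relative_gain_pct(value_pct: float, gain_pp: float) -> float:
    """Gain expressed relative to the implied reference score, in percent."""
    return round(100.0 * gain_pp / (value_pct - gain_pp), 2)


def frame_latency_ms(fps: float) -> float:
    """Average time available per frame, in milliseconds, at a given frame rate."""
    return round(1000.0 / fps, 1)


if __name__ == "__main__":
    print("implied reference mAP@50:", implied_baseline(MAP50, MAP50_GAIN_PP))
    print("implied reference precision:", implied_baseline(PRECISION, PRECISION_GAIN_PP))
    print("relative mAP@50 gain (%):", relative_gain_pct(MAP50, MAP50_GAIN_PP))
    print("relative precision gain (%):", relative_gain_pct(PRECISION, PRECISION_GAIN_PP))
    print("per-frame latency (ms):", frame_latency_ms(FPS))
```

Under these assumptions the implied reference scores are about 92.18% mAP@50 and 79.36% precision, and 16.5 FPS corresponds to roughly 60.6 ms per frame, which is why the throughput is better described as near-real-time than as matching a 30 FPS video stream.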
Shang et al. (Tue,) studied this question.