Substation drawings are high-density technical artifacts that serve as the authoritative data source across the entire power-infrastructure lifecycle. Manual auditing of these drawings is labor-intensive, error-prone, and increasingly untenable as drawing complexity grows. To address this challenge, we propose KIEP (Key Information Extraction and Parsing), a two-stage computer-vision framework that automatically localizes, recognizes, and semantically interprets textual and symbolic elements in complex substation drawings. In Stage-1, we fine-tune DeepSolo for rotated text detection and recognition and YOLOv8 for multi-class symbol detection, using our newly developed SKID dataset, which comprises 347 real-world substation drawings and supports three tasks: text extraction, symbol detection, and semantic text parsing. In Stage-2, a Hungarian-based geometric matching module aligns each text instance with its governing symbol, after which a predefined symbol table resolves domain-specific semantics. Extensive experiments on SKID show that KIEP achieves 91.0% F-measure for text extraction, 81.1% mAP for symbol detection, and 80.4% Parsing F-measure for end-to-end semantic parsing. These results establish KIEP as an effective solution for automated key information extraction and parsing in substation drawings, facilitating smart substation applications.
Yang et al. studied this question.
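The Stage-2 idea described in the abstract, one-to-one matching of text instances to symbols followed by a symbol-table lookup, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the matching cost is the Euclidean distance between box centers, which the abstract does not specify, and every name here (`SYMBOL_TABLE` and its entries, `center`, `hungarian`, `match_and_parse`) is hypothetical. Boxes are taken to be axis-aligned `(x1, y1, x2, y2)` tuples for simplicity, although Stage-1 detects rotated text.

```python
import math

# Hypothetical symbol table: symbol class -> semantic role of the matched text.
SYMBOL_TABLE = {
    "transformer": "transformer_id",
    "breaker": "breaker_id",
}


def center(box):
    """Center of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def hungarian(cost):
    """Minimum-cost assignment via the Hungarian algorithm with potentials.

    Requires len(cost) <= len(cost[0]). Returns {row_index: column_index}.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently assigned to column j (0 = none)
    way = [0] * (m + 1)   # predecessor column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip assignments along the augmenting path back to the virtual column 0.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return {p[j] - 1: j - 1 for j in range(1, m + 1) if p[j] != 0}


def match_and_parse(texts, symbols):
    """Match (string, box) texts to (class, box) symbols; return parsed triples."""
    if not texts or not symbols:
        return []
    # Assumed cost: distance between the text-box center and the symbol-box center.
    cost = [[math.dist(center(tb), center(sb)) for _, sb in symbols]
            for _, tb in texts]
    if len(texts) <= len(symbols):
        pairs = list(hungarian(cost).items())
    else:
        # More texts than symbols: solve the transposed problem, then swap back.
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, s) for s, t in hungarian(transposed).items()]
    return [(texts[t][0], symbols[s][0], SYMBOL_TABLE.get(symbols[s][0], "unknown"))
            for t, s in sorted(pairs)]


texts = [("T1", (0, 0, 20, 10)), ("QF2", (100, 0, 130, 10))]
symbols = [("breaker", (105, 15, 125, 35)), ("transformer", (0, 15, 20, 35))]
parsed = match_and_parse(texts, symbols)
```

Unlike a greedy nearest-symbol rule, the assignment is solved globally, so no symbol is claimed by two text instances. A practical system would likely also reject matches whose cost exceeds some distance threshold. That refinement is omitted here.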