To address the dual challenges of production safety and skill transfer in the construction industry, this study proposes a computer-vision-based intelligent monitoring and vocational skills training system for construction sites. The system is built around a layered microservice architecture. At the sensing layer, data are collected in real time from multi-source sensors, including panoramic cameras, unmanned aerial vehicles, and wearable IMU/UWB devices. At the edge layer, a lightweight Mobile-YOLOv8 model is deployed; with dynamic pruning and an adaptive resolution strategy, the average detection accuracy for three types of violation (no helmet, missing overhead guardrails, and mechanical collision risk) rises above 96%, the false alarm rate falls to 1.8% to 5.1%, and the early-warning response time is shortened to 0.5 to 1.2 s. The platform layer builds a digital twin training scene based on BIM-Unity3D that integrates simulations of 12 types of equipment with low-latency gesture-recognition interaction. It links monitoring with training by automatically generating personalized training programs from violation records, and it closes the skill-improvement loop through a multi-dimensional quantitative evaluation model (operation standardization 40%, timeliness 30%, energy consumption 20%, and risk prediction 10%). A three-month experiment on a 12,000 m² subway construction site shows that the system's false alarm rate is only one fifth of that of traditional manual inspection. In a grouped test of 60 workers, the personalized training group achieved an operation standardization score of 93.7, an emergency response time of 4.3 s, a skill consolidation rate of 89.6%, and an accident rate reduced to 0.9%, significantly outperforming both the traditional mentoring system and general VR training.
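The four-dimension evaluation model can be read as a weighted sum of sub-scores. The following is a minimal sketch under stated assumptions, not the authors' implementation: only the weights (40%, 30%, 20%, 10%) come from the text, while the dictionary keys, the 0 to 100 scale for each sub-score, and the function name `composite_score` are hypothetical.

```python
# Weights taken from the abstract; the key names are illustrative assumptions.
WEIGHTS = {
    "standardization": 0.40,  # operation standardization
    "time": 0.30,             # time-related dimension
    "energy": 0.20,           # energy consumption
    "risk_prediction": 0.10,  # risk prediction
}


def composite_score(sub_scores: dict) -> float:
    """Return the weighted sum of the four sub-scores.

    Assumes every sub-score is already normalized to a 0-100 scale,
    where higher is better (an assumption, not stated in the source).
    """
    missing = WEIGHTS.keys() - sub_scores.keys()
    if missing:
        raise KeyError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        value = sub_scores[name]
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be in [0, 100], got {value}")
    return sum(weight * sub_scores[name] for name, weight in WEIGHTS.items())


if __name__ == "__main__":
    # Hypothetical worker record, for illustration only.
    example = {
        "standardization": 90.0,
        "time": 80.0,
        "energy": 70.0,
        "risk_prediction": 60.0,
    }
    print(round(composite_score(example), 2))
```

Because the weights sum to 1, the composite stays on the same 0 to 100 scale as its inputs, which keeps scores comparable across workers and training rounds.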
Peng et al. (Sun,) studied this question.