What question did this study set out to answer?

The aim is to improve detection of driver distraction by integrating global and local information through multimodal features.

March 2, 2026

2M2F-A: Multimodal Features Fusion with Attention Mechanism for Distracted Driver Detection

Key Points

Developed a system using visual features and skeletal information for distraction detection.
Created a key points attention map to emphasize fine-grained details around driver joints.
Evaluated the model's accuracy on the AUCDD-v1 and SFDD datasets.
Achieved an accuracy of 95.08% on the AUCDD-v1 dataset.
Achieved an accuracy of 99.84% on the SFDD dataset.
Demonstrated competitive performance against existing state-of-the-art models.

Abstract

Driver distraction is one of the main factors in road accidents, which makes early detection and alerting mechanisms important for mitigating the risk. To achieve this, it is crucial to identify both the distraction and its source. Existing distracted driver detection methods primarily analyze full images based on visual features and often overlook fine-grained details within specific regions, which makes highly similar classes harder to distinguish. Addressing this gap requires considering both global and local information to better capture the driver’s actions. Our proposed approach fuses two feature modalities: the visual features of the global appearance and the skeletal information. Additionally, we generate a key points attention map that represents the distribution of key points in the image and the regions where the local information is located, which directs the model’s attention to fine-grained details around the driver’s joints. Our model demonstrates competitive accuracy compared to state-of-the-art models, achieving 95.08% on the AUCDD-v1 dataset and 99.84% on the SFDD dataset.
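
The abstract does not spell out how the key points attention map is built or how the two modalities are fused. As a rough illustration only, the sketch below assumes one common construction: a Gaussian blob centered on each detected joint, used to weight a visual feature map before the pooled result is concatenated with the flattened joint coordinates. The function names, the `sigma` parameter, and the weighted-average pooling are hypothetical choices for this example and are not taken from the paper.

```python
import numpy as np


def keypoint_attention_map(keypoints, height, width, sigma=8.0):
    """Return a (height, width) map that peaks at 1.0 on each (x, y) key point.

    Assumption: Gaussian blobs merged with an element-wise maximum. The paper's
    actual construction is not described in the abstract.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    attention = np.zeros((height, width), dtype=np.float32)
    for x, y in keypoints:
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        attention = np.maximum(attention, blob)
    return attention


def fuse_features(visual_map, attention, skeleton):
    """Weight visual features by attention, pool them, and append skeleton features.

    visual_map: (C, H, W) feature map, attention: (H, W), skeleton: (K, 2) joints.
    Assumption: fusion by simple concatenation of the two feature vectors.
    """
    weighted = visual_map * attention[None, :, :]
    # Attention-weighted average over the spatial dimensions, one value per channel.
    pooled = weighted.sum(axis=(1, 2)) / (attention.sum() + 1e-8)
    return np.concatenate([pooled, skeleton.reshape(-1)])


if __name__ == "__main__":
    joints = [(10, 20), (30, 5)]           # hypothetical (x, y) joint positions
    att = keypoint_attention_map(joints, height=32, width=48)
    features = np.ones((4, 32, 48))        # stand-in for CNN visual features
    fused = fuse_features(features, att, np.array(joints, dtype=np.float32))
    print(att.shape, fused.shape)
```

In this sketch, regions far from any joint receive weights near zero, so the pooled visual vector is dominated by the appearance around the driver's joints, which is the intuition the abstract describes.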

Cite This Study

Boulahmar et al. studied this question.

synapsesocial.com/papers/69a52dabf1e85e5c73bf0bf7 https://doi.org/10.1142/s2301385027500750