What question did this study set out to answer?

To improve the automated detection of suspicious human activities in various challenging environments using advanced classification techniques.

April 15, 2026 | Open Access

A Novel Classification Model for Suspicious Human Activities in Diverse Environments Using Fused Feature Block and Machine Vision Techniques

Key Points

Developed a GM_CNN3D model integrating handcrafted and deep features for activity classification.
Used Gaussian Mixture Model to localize motion regions before feature integration.
Evaluated the model on five diverse datasets including real-world scenarios with varying conditions.
Achieved accuracy of up to 99.12% and an F1-score of 98.7%.
Obtained a ROC-AUC of 0.992, outperforming existing models including CNN and LSTM.
Demonstrated robust performance across different scenes with varying lighting and crowd density.
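The accuracy, F1-score, and ROC-AUC figures quoted above are standard binary-classification metrics. As a minimal sketch of how such numbers are computed, the following uses invented toy labels and scores (1 = suspicious, 0 = normal); the data, the 0.5 threshold, and the function names are assumptions for illustration and do not come from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the rank-based definition of ROC-AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not results from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold

print("accuracy:", accuracy(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
print("ROC-AUC: ", roc_auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the raw scores and is threshold-independent, which is one reason all three are typically reported together.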

Abstract

Automated detection of suspicious human activities in complex and crowded environments remains a critical challenge in modern surveillance systems because of high false-positive rates, poor contrast, and poor generalization across diverse scenes. We propose a GMCNN3D model for classifying suspicious activity, based on a Deep Fused Feature Block (DFFB) framework that integrates handcrafted spatial descriptors (PCA-HOG and Motion-HOG) with deep spatiotemporal features extracted from a 3D Convolutional Neural Network (3D-CNN). Motion regions are first localized using a Gaussian Mixture Model (GMM), after which handcrafted and deep features are concatenated in a dimensionality-normalized fusion stage, followed by a fully connected layer and softmax classification. The system is evaluated on five diverse and publicly available datasets: Violent Crowd, Hockey Fight, Kaggle Fight, Movies Fight, and Custom Annotated YouTube Clips, achieving up to 99.12% accuracy, a 98.7% F1-score, and a ROC-AUC of 0.992, outperforming state-of-the-art CNN, LSTM, and SlowFast models. All datasets include real-world scenarios with varying lighting, crowd density, and camera viewpoints, with annotations created manually where unavailable. The proposed method demonstrates robust cross-scene performance, enabling automated alarming and reduced false positives in real-time security operations.
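The fusion stage described in the abstract (concatenate handcrafted and deep features, then apply a fully connected layer and softmax) can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: the abstract does not specify what "dimensionality-normalized" means, so the sketch assumes per-block L2 normalization so that neither feature block dominates. The feature values, vector sizes, and weights are random placeholders, and all function names are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization; not confirmed by the source)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(handcrafted, deep):
    """Concatenate the two feature blocks after normalizing each one separately."""
    return l2_normalize(handcrafted) + l2_normalize(deep)


def dense(x, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax that turns logits into class probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)

# Placeholder vectors standing in for HOG-style descriptors and 3D-CNN activations.
hog_features = [rng.uniform(0, 5) for _ in range(6)]
cnn_features = [rng.uniform(-1, 1) for _ in range(4)]

fused = fuse(hog_features, cnn_features)

# Untrained random weights for a two-class head (e.g. normal vs. suspicious).
num_classes = 2
weights = [[rng.gauss(0, 0.1) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

probs = softmax(dense(fused, weights, biases))
print("fused length:", len(fused))
print("class probabilities:", probs)
```

Normalizing each block before concatenation is one common way to keep features with very different scales comparable; in a trained system the weights would be learned rather than sampled at random.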

Cite this study

Mughal et al. (2026) studied this question.

www.synapsesocial.com/papers/69df2b49e4eeef8a2a6b034b — DOI: https://doi.org/10.3390/digital6020030

Authors

Bushra Mughal

Fernando B. Duarte

Tiago Cunha Reis

Journals

Digital

Institutions

Instituto Politécnico de Lisboa

Universidade Lusófona
