What question did this study set out to answer?

The aim is to develop a resource-efficient system for detecting violence in audio recordings for public safety.

March 18, 2026 | Open Access

Acoustic Violence Detection Using Cascade Strategy for Computationally Constrained Scenarios

Key Points

Proposed a two-stage cascade system combining a lightweight Least Squares Linear Detector (LSLD) with a trimmed version of YAMNet
Implemented a percentile-based forwarding rule for segment routing
Evaluated on a dataset of real-world violent audio with background noise and reverberation
Conducted an ablation study to assess the LSLD pre-filter's effectiveness
Developed an energy consumption model for sustainable deployment
Maintained performance close to a Stage 2-only baseline in the low-false-alarm regime while substantially reducing average deep-inference workload
Assessed robustness under clean, reverberant, and 12 dB noise conditions
Validated LSLD's role as an inexpensive pre-filter
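
The percentile-based forwarding rule listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice to compute the threshold over the current batch of Stage 1 scores, and the default score of 0.0 for segments that are not forwarded are all assumptions made for illustration. The two scoring callables stand in for the LSLD and the trimmed YAMNet.

```python
import math
from typing import Callable, List, Sequence


def percentile_threshold(scores: Sequence[float], forward_fraction: float) -> float:
    """Return the Stage 1 score that the top `forward_fraction` of segments reach."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < forward_fraction <= 1.0:
        raise ValueError("forward_fraction must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Number of segments to forward; the small epsilon guards against
    # floating-point overshoot such as 0.3 * 10 == 3.0000000000000004.
    k = math.ceil(forward_fraction * len(ranked) - 1e-9)
    k = min(max(k, 1), len(ranked))
    return ranked[k - 1]


def cascade(
    segments: Sequence[object],
    stage1_score: Callable[[object], float],  # hypothetical stand-in for the LSLD
    stage2_score: Callable[[object], float],  # hypothetical stand-in for trimmed YAMNet
    forward_fraction: float,
) -> List[float]:
    """Score every segment cheaply, then run the deep stage only on top scorers."""
    s1 = [stage1_score(seg) for seg in segments]
    threshold = percentile_threshold(s1, forward_fraction)
    results = []
    for seg, score in zip(segments, s1):
        if score >= threshold:
            # Forwarded: the expensive classifier decides.
            # Tied scores at the threshold are all forwarded.
            results.append(stage2_score(seg))
        else:
            # Assumption: segments that are not forwarded count as non-violent.
            results.append(0.0)
    return results
```

Under these assumptions, `forward_fraction` plays the role of the explicit operating parameter: lowering it reduces the number of deep-stage calls, at the risk of discarding segments the second stage would have flagged.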

Abstract

Detecting violent content in audio recordings is crucial for public safety, autonomous surveillance, and content moderation, particularly when visual cues are unreliable or unavailable. A resource-aware two-stage cascade system is proposed for acoustic violence detection that combines a lightweight Least Squares Linear Detector (LSLD) as a first-stage screener with a trimmed version of YAMNet as a second-stage classifier. A percentile-based forwarding rule controls the fraction of segments routed to the deep stage, turning the accuracy–cost trade-off into an explicit operating parameter for always-on deployment. The approach is evaluated on a publicly released dataset of real-world violent audio augmented with background noise and artificial reverberation. The results in the low-false-alarm regime show that the proposed cascade preserves performance close to a Stage 2-only baseline while substantially reducing average deep-inference workload. An ablation study validates the role of the LSLD as an inexpensive pre-filter, and robustness is assessed under clean, reverberant, and 12 dB noise conditions. Finally, an analytic energy consumption model is provided, which links computational workload to daily energy demand and photovoltaic sizing on ultra-low-power hardware, supporting sustainable off-grid deployment.
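
To make the link between workload, daily energy demand, and photovoltaic sizing concrete, the sketch below shows one simple way such a model could be set up. It is an illustrative assumption, not the paper's analytic model: the formula (Stage 1 runs on every segment, Stage 2 only on the forwarded fraction, plus a constant idle draw), every parameter name, and the sizing rule based on peak sun hours and a lumped system efficiency are hypothetical.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
JOULES_PER_WH = 3_600


@dataclass
class CascadeEnergy:
    """Hypothetical per-segment energy budget for a two-stage cascade."""

    segment_seconds: float    # duration of one audio segment
    e_stage1_j: float         # energy per Stage 1 inference, in joules (assumed)
    e_stage2_j: float         # energy per Stage 2 inference, in joules (assumed)
    forward_fraction: float   # share of segments routed to Stage 2
    idle_power_w: float = 0.0 # constant background draw, in watts (assumed)

    def daily_energy_wh(self) -> float:
        """Daily energy demand in watt-hours for always-on operation."""
        segments_per_day = SECONDS_PER_DAY / self.segment_seconds
        # Stage 1 runs on every segment; Stage 2 only on the forwarded share.
        per_segment_j = self.e_stage1_j + self.forward_fraction * self.e_stage2_j
        compute_j = segments_per_day * per_segment_j
        idle_j = self.idle_power_w * SECONDS_PER_DAY
        return (compute_j + idle_j) / JOULES_PER_WH


def pv_peak_watts(daily_wh: float, peak_sun_hours: float, system_efficiency: float) -> float:
    """Panel peak power needed to cover `daily_wh` under the assumed conditions."""
    if peak_sun_hours <= 0 or not 0.0 < system_efficiency <= 1.0:
        raise ValueError("peak_sun_hours must be > 0 and efficiency in (0, 1]")
    return daily_wh / (peak_sun_hours * system_efficiency)
```

With placeholder values, comparing `CascadeEnergy(..., forward_fraction=0.1)` against a Stage 2-only configuration (`e_stage1_j=0.0`, `forward_fraction=1.0`) shows how the forwarding fraction feeds directly into the daily energy figure and, through it, into the required panel size.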

Cite This Study

Zhu-Zhou et al. studied this question.

synapsesocial.com/papers/69ba423c4e9516ffd37a251c https://doi.org/10.3390/electronics15061227