Abstract
Micro-gestures are imperceptible non-verbal behaviours characterised by low-intensity movement. Their low intensity and short duration pose challenges for traditional action recognition models. To address this, we propose micro-gesture Mamba-inspired linear attention (MGMILA), a motion-aware framework that integrates Mamba-inspired linear attention (MILA), a linear-complexity model optimized for video-based micro-gesture recognition. Additionally, we design three motion extraction module variants, motion as layer (MAL), motion as content (MAC), and motion as gate (MAG), to enhance spatiotemporal motion localization. Furthermore, we introduce human segmentation mask prediction as an auxiliary task that guides the network to attend to human-related regions, thereby improving its motion perception and recognition capability. Experiments on iMiGUE, spontaneous micro gesture (SMG), and MA-52 demonstrate state-of-the-art (SOTA) performance, validating the effectiveness of our approach.
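The abstract describes MILA as a linear-complexity attention model. As a rough illustration of what "linear complexity" means here, the sketch below shows generic kernelized linear attention in NumPy. It is not the MILA block from the paper: the feature map (ELU + 1), the function names, and the tensor sizes are all assumptions chosen for illustration. It shows only that reordering the matrix products avoids building the n-by-n attention matrix while giving the same output.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map: elu(x) + 1 (illustrative choice, not from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    # Cost grows linearly with sequence length n: the (n, n) matrix is never formed.
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v                 # (d, d_v) summary of keys and values
    z = q @ k.sum(axis=0)        # (n,) per-query normaliser
    return (q @ kv) / z[:, None]


def quadratic_attention(q, k, v):
    # Reference version that builds the full (n, n) score matrix.
    q, k = feature_map(q), feature_map(k)
    scores = q @ k.T
    return (scores / scores.sum(axis=1, keepdims=True)) @ v


# Hypothetical sizes: 16 tokens with 8 channels each.
rng = np.random.default_rng(0)
n, d = 16, 8
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))

out_linear = linear_attention(q, k, v)
out_quadratic = quadratic_attention(q, k, v)
```

Both functions compute the same result up to floating-point error; the linear form matters for video because the number of spatiotemporal tokens grows quickly with clip length and resolution.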
Xing et al. studied this question.
www.synapsesocial.com/papers/69d896a46c1944d70ce0827a — DOI: https://doi.org/10.1007/s11633-025-1587-8
Bohao Xing
Di Li
Rong Gao
Machine Intelligence Research
Lappeenranta-Lahti University of Technology
Brno University of Technology