In few-shot action recognition (FSAR), limited annotated data and large scene variations make it difficult for models to learn stable spatial semantics and reliable temporal dynamics. As a result, spatiotemporal representations tend to be weak, and models often fail to focus on discriminative motion regions or capture frame-to-frame changes accurately. Furthermore, the insufficient fusion of local details and global context renders the learned features more susceptible to background noise and scene bias. These issues become more pronounced when background clutter is severe or when different action classes share locally similar segments, leading to unreliable support–query matching and shifted similarity distributions, which ultimately result in class confusion. To address these challenges, we propose a bidirectional adaptive spatiotemporal modeling method integrated with contrastive learning for FSAR. The method constructs attention-guided bidirectional differencing features to model inter-frame variations with semantic alignment, while suppressing background noise. It introduces a local–global interactive channel attention module to strengthen both local and global dynamic representations, and integrates dynamic distance adjustment with hard negative mining during tuple-level matching. This combination imposes contrastive constraints that enhance intra-class compactness and inter-class separability, thereby mitigating interference from cross-class similar segments. Experiments under the standard 5-way 1-shot/5-shot protocol demonstrate consistent improvements across multiple datasets, and the proposed method achieves the best performance under the 5-shot setting while remaining competitive under the 1-shot setting.
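As a rough illustration of two ideas named in the abstract, the sketch below computes per-frame forward and backward feature differences and a margin loss that uses the hardest (closest) negative class prototype. It is not the authors' implementation: the attention guidance, the local–global channel attention module, the dynamic distance adjustment, and tuple-level matching are all omitted, and the function names, the Euclidean distance, the zero padding at sequence boundaries, and the `margin` parameter are assumptions made only for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def bidirectional_differences(frames):
    """Per-frame forward and backward differences (hypothetical helper).

    frames: list of T feature vectors (lists of floats).
    forward[t]  = frames[t + 1] - frames[t], zeros for the last frame.
    backward[t] = frames[t - 1] - frames[t], zeros for the first frame.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    forward, backward = [], []
    for t, current in enumerate(frames):
        if t + 1 < num_frames:
            forward.append([n - c for n, c in zip(frames[t + 1], current)])
        else:
            forward.append([0.0] * dim)  # assumption: zero-pad at the end
        if t > 0:
            backward.append([p - c for p, c in zip(frames[t - 1], current)])
        else:
            backward.append([0.0] * dim)  # assumption: zero-pad at the start
    return forward, backward


def hard_negative_margin_loss(query, prototypes, label, margin=1.0):
    """Generic margin loss against the closest wrong-class prototype.

    prototypes: dict mapping class label -> prototype vector.
    This is a standard triplet-style formulation, swapped in for
    illustration; the paper's actual contrastive loss is not given
    in the abstract.
    """
    d_pos = euclidean(query, prototypes[label])
    d_neg = min(
        euclidean(query, proto)
        for cls, proto in prototypes.items()
        if cls != label
    )
    return max(0.0, margin + d_pos - d_neg)
```

In this sketch the loss is zero once the query is closer to its own class prototype than to every other prototype by at least `margin`, which is one simple way to encourage intra-class compactness and inter-class separability.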
Huang et al. studied this question.
synapsesocial.com/papers/69e07d1d2f7e8953b7cbe2ea — DOI: https://doi.org/10.3390/electronics15081637
Jing Huang
Zijian Zhao
Electronics
Zhejiang Sci-Tech University