Multimodal sentiment analysis (MSA) is a challenging task that uses verbal, visual, and acoustic cues to infer human sentiment, and it has garnered substantial research attention in recent years. However, due to the diversity of multimodal data, current MSA methods often fail to adequately leverage the rich semantic knowledge present in the linguistic modality, while also overlooking informational redundancy within the visual and auditory modalities. In addition, intermodal heterogeneity and spurious cross-modal interactions pose major challenges for effective multimodal fusion. To address these issues, we propose an MSA approach based on dynamic linguistic enhancement and a synergistic cross-modal Transformer (LESCT). LESCT constructs a dynamic language enhancement network (LEN) for feature extraction. The proposed LEN enables visual and auditory features to dynamically capture contextual cues from multigranularity language representations via a guided attention mechanism, thereby mitigating intramodal redundancy and noise interference. On this basis, LESCT builds a new synergistic cross-modal Transformer (SCT) and a local-to-composite multimodal fusion strategy. The SCT network employs a bimodal generator to produce composite features for each pair of modalities, transferring the composite information from the bimodal features to complementary unimodal features to facilitate rich intermodal and intramodal interaction. Extensive experiments were performed on three popular MSA benchmark datasets: CMU-MOSI, CMU-MOSEI, and CH-SIMS. The overall accuracy of LESCT is 86.43% on CMU-MOSI, 86.38% on CMU-MOSEI, and 81.35% on CH-SIMS. Experimental results demonstrate that the proposed LESCT is superior to the state-of-the-art (SOTA) methods. The code is available at https://github.com/jhwvh/LESCT.
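The language-guided attention idea described above can be illustrated with a minimal sketch. This is not the authors' LEN implementation (see the linked repository for that); it is a generic single-head cross-attention in NumPy in which visual or acoustic frames act as queries and language tokens supply keys and values. The function name `language_guided_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the residual connection, and all dimensions are assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def language_guided_attention(nonverbal, language, w_q, w_k, w_v):
    """Hypothetical cross-attention: nonverbal frames attend to language tokens.

    nonverbal: (T_n, d) visual or acoustic features (queries)
    language:  (T_l, d) language features (keys and values)
    Returns the enhanced nonverbal features (T_n, d) and the
    attention weights (T_n, T_l).
    """
    q = nonverbal @ w_q
    k = language @ w_k
    v = language @ w_v
    # Scaled dot-product scores between each frame and each token.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    # Residual connection (an assumption): keep the original features
    # and add the language context gathered by attention.
    enhanced = nonverbal + weights @ v
    return enhanced, weights


# Toy inputs: 6 visual frames and 4 language tokens, feature size 8.
rng = np.random.default_rng(0)
d = 8
visual = rng.standard_normal((6, d))
language = rng.standard_normal((4, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))

enhanced, weights = language_guided_attention(visual, language, w_q, w_k, w_v)
print(enhanced.shape, weights.shape)
```

Each row of `weights` sums to 1, so every nonverbal frame receives a convex combination of projected language tokens. A multigranularity variant, as the abstract describes, would presumably repeat this against language representations taken at several levels, but that detail is not specified here.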
Linqin Cai
Daohong Liu
Lanrui Liu
IEEE Transactions on Neural Networks and Learning Systems
Chongqing University of Posts and Telecommunications
Cai et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69d894ad6c1944d70ce05a78 — DOI: https://doi.org/10.1109/tnnls.2026.3678350