Social networks such as Twitter have grown rapidly and are now flooded with sarcastic comments, in both text and images. Detecting sarcasm in multimodal data has significant social value and is attracting increasing research attention. However, most studies overlook the role of sentiment, even though sentiment information in text is closely linked to sarcasm cues. In addition, few consider how text and images align semantically. To address these issues, we propose a sentiment-guided mutual learning network (SGMLN) for multimodal sarcasm detection. SGMLN uses sentiment information to guide the fusion of text and image features, and employs mutual learning to share knowledge among classifiers. We design a sentiment-guided attention layer that injects sentiment into both modalities, producing features that capture sarcasm more effectively. Sentic-BERT extracts sentiment-aware vectors from text, using word-level sentiment as a mask. In mutual learning, a logistic distribution function measures the differences between classifiers, which improves knowledge transfer between modalities and, in turn, multimodal understanding and model performance. By introducing sentiment-aware representations and semantic alignment, SGMLN narrows the gap between text and images and makes their representations more consistent. Experiments on public datasets show that our model outperforms alternative methods.
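The mutual learning idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the logistic distribution function used to compare classifiers, so the sketch below substitutes the KL divergence commonly used in deep mutual learning; this is an assumption, not the authors' formulation. The function names, the `weight` parameter, and the two-classifier setup (one text branch, one image branch) are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q): how much q diverges from the reference distribution p."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Supervised loss for a single example with an integer class label."""
    return -math.log(probs[label] + eps)


def mutual_learning_losses(text_logits, image_logits, label, weight=1.0):
    """Return (text_loss, image_loss) for one example.

    Each classifier is trained on the ground-truth label and is also
    pulled toward the other classifier's predicted distribution.
    Assumption: KL divergence stands in for the paper's logistic
    distribution-based difference measure, which the abstract does not define.
    """
    p_text = softmax(text_logits)
    p_image = softmax(image_logits)
    # The text classifier mimics the image classifier, and vice versa.
    text_loss = cross_entropy(p_text, label) + weight * kl_divergence(p_image, p_text)
    image_loss = cross_entropy(p_image, label) + weight * kl_divergence(p_text, p_image)
    return text_loss, image_loss
```

Calling `mutual_learning_losses([2.0, 0.5], [0.1, 1.5], label=1)` returns one loss per branch. When the two branches already agree, the divergence term vanishes and each loss reduces to plain cross-entropy, so the extra term only acts where the modalities disagree.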
Wang et al. studied this question.
www.synapsesocial.com/papers/69d8970c6c1944d70ce0847d — DOI: https://doi.org/10.3390/s26082304
Yiran Wang
Xinyu Zhao
Yongtang Bao
Sensors
Shandong University of Science and Technology