What question did this study set out to answer?

The aim is to improve sarcasm detection in multimodal data by incorporating sentiment information and enhancing feature alignment.

April 10, 2026 | Open Access

SGMLN: Sentiment-Guided Mutual Learning Network for Multimodal Sarcasm Detection

Key Points

Developed SGMLN to combine text and image features using sentiment information.
Designed a sentiment-guided attention layer to inject sentiment into both text and image modalities.
Utilized Sentic-BERT for sentiment-aware feature extraction from text.
Employed mutual learning for knowledge sharing between classifiers to enhance performance.
SGMLN improves sarcasm detection accuracy compared to existing methods.
Experiments show enhanced performance through mutual learning and sentiment integration.
The model effectively bridges the semantic gap between text and image data.
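
As a rough illustration of what a sentiment-guided attention step could look like, the sketch below pools token features with attention weights that are biased toward tokens carrying strong sentiment. It is not the layer from the paper: the function name `sentiment_guided_pool`, the `strength` parameter, and the choice to add the absolute polarity to the attention score are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentiment_guided_pool(features, query, polarity, strength=1.0):
    """Attention pooling with a sentiment bias (hypothetical formulation).

    features: list of token vectors, each the same length as `query`
    query:    vector from the other modality, e.g. a pooled image feature
    polarity: per-token sentiment in [-1, 1], where 0 means neutral
    strength: assumed weight on the sentiment bias, not taken from the paper
    """
    dim = len(query)
    scores = []
    for vec, pol in zip(features, polarity):
        # Scaled dot-product similarity between the token and the query.
        similarity = sum(v * q for v, q in zip(vec, query)) / math.sqrt(dim)
        # Strongly polar tokens (positive or negative) receive a higher score.
        scores.append(similarity + strength * abs(pol))
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, features))
        for i in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, pooled = sentiment_guided_pool(
        tokens, query=[0.0, 0.0], polarity=[0.0, 0.0, -0.9]
    )
    print(weights, pooled)
```

With a zero query, the similarity term vanishes and the weights depend only on sentiment intensity, so the third token dominates the pooled vector.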

Abstract

Social networks such as Twitter have grown rapidly and are now flooded with sarcastic comments, both in text and in images. Detecting sarcasm in multimodal data has significant social value and is attracting increasing research attention. However, most studies overlook the role of sentiment, even though sentiment information in text is closely linked to clues of sarcasm. Additionally, few consider how text and images align semantically. To address these issues, we propose a sentiment-guided mutual learning network (SGMLN) for multimodal sarcasm detection. SGMLN utilizes sentiment information to inform the combination of text and image features, and employs mutual learning to facilitate knowledge sharing among classifiers. We design a sentiment-guided attention layer that injects sentiment into both modalities, producing features that capture sarcasm more effectively. Sentic-BERT extracts sentiment-aware vectors from text, using word-level sentiment as a mask. In mutual learning, a logistic distribution function measures differences between classifiers, improving knowledge transfer between modalities. This step boosts multimodal understanding and model performance. By introducing sentiment-aware representations and semantic alignment, SGMLN bridges the gap between text and images, making them more consistent. Experiments on public datasets demonstrate that our model is effective and outperforms alternatives.
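
The mutual learning idea in the abstract can be sketched as two classifiers that each minimize their own classification loss plus a term that penalizes disagreement with the other. The abstract says the paper measures that disagreement with a logistic distribution function, but its exact form is not given here, so this sketch substitutes the KL divergence used in standard deep mutual learning. The text-side and image-side classifiers, the function names, and the `alpha` weight are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label] + eps)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); a stand-in for the paper's logistic-distribution measure."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_learning_losses(text_logits, image_logits, label, alpha=1.0):
    """Per-classifier losses for one example (hypothetical two-classifier setup).

    Each classifier is trained on the label and is also pulled toward the
    other classifier's predicted distribution, weighted by `alpha`.
    """
    p_text = softmax(text_logits)
    p_image = softmax(image_logits)
    loss_text = cross_entropy(p_text, label) + alpha * kl_divergence(p_image, p_text)
    loss_image = cross_entropy(p_image, label) + alpha * kl_divergence(p_text, p_image)
    return loss_text, loss_image


if __name__ == "__main__":
    # Class 1 = sarcastic, class 0 = not sarcastic (assumed label convention).
    print(mutual_learning_losses([0.2, 1.5], [1.0, 0.3], label=1))
```

When the two classifiers already agree, the divergence term is zero and each loss reduces to plain cross-entropy; the more they disagree, the larger the extra penalty on both.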

Cite this study

Wang et al. studied this question.

www.synapsesocial.com/papers/69d8970c6c1944d70ce0847d — DOI: https://doi.org/10.3390/s26082304

Authors

Yiran Wang

Xinyu Zhao

Yongtang Bao

Journals

Sensors

Institutions

Shandong University of Science and Technology
