What question did this study set out to answer?

The research aims to analyze how multimodal large language models identify visual hate speech in Chinese-speaking communities.

February 14, 2026 | Open Access

How do multi-modal large language models understand non-English visual hate? Insights from studying hate speech in Chinese-speaking communities on Instagram

Key Points

Analyzed how multimodal large language models (MLLMs) identify visual hate speech, such as image memes, in Chinese-speaking communities on Instagram.
Evaluated two MLLMs, Gemini-1.5 and GPT-4o-mini, for hate speech detection.
Used expert annotations for comparative analysis.
Employed a zero-shot learning approach for model testing.
Conducted qualitative error analysis to understand model limitations.
Identified issues like hallucinations in model outputs.
Observed tendencies to over-label non-hateful content as hate speech.
Highlighted lack of cultural and linguistic sensitivity affecting detection.

Abstract

This study provides a critical analysis of the efficacy of Multimodal Large Language Models (MLLMs) in identifying visual hate speech on Instagram, such as image memes, specifically within the context of non-English and non-Western communities. By focusing on the unique dynamics of hate speech circulating among Chinese-speaking populations, particularly hate speech aimed at mainland Chinese individuals, this research illuminates the complexities and challenges of employing MLLMs for multimodal hate speech detection through a zero-shot learning approach. Through a comparative evaluation of two cutting-edge MLLMs, Gemini-1.5 and GPT-4o-mini, measured against expert annotations and incorporating qualitative error analysis, the study reveals factors contributing to the complexity of the task. These include hallucinations, a tendency to over-label content as hate speech, and a notable absence of linguistic and cultural sensitivity. These findings highlight the need for culturally attuned models and methodologies that enhance the effectiveness of hate speech moderation in diverse cultural contexts.
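
As a rough illustration of what measuring model output against expert annotations can involve, the sketch below compares binary model labels with expert labels and reports precision, recall, the false positive rate (one way to quantify over-labelling), and Cohen's kappa. The function name `compare_to_experts`, the binary label encoding, the choice of metrics, and the toy labels are all assumptions made for illustration; they are not the study's data, metrics, or code.

```python
from collections import Counter


def compare_to_experts(expert, model):
    """Compare binary model labels with expert labels (1 = hate, 0 = not hate).

    Hypothetical helper for illustration only; not taken from the study.
    """
    if len(expert) != len(model):
        raise ValueError("label lists must have the same length")
    if not expert:
        raise ValueError("label lists must not be empty")

    # Count each (expert_label, model_label) pair.
    counts = Counter(zip(expert, model))
    tp = counts[(1, 1)]  # both say hate
    fp = counts[(0, 1)]  # model says hate, experts do not (over-labelling)
    fn = counts[(1, 0)]  # experts say hate, model does not
    tn = counts[(0, 0)]  # both say not hate
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (fp + tn) if fp + tn else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 1.0

    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
        "kappa": kappa,
    }


# Toy labels, invented for this example (not data from the paper).
expert_labels = [1, 1, 0, 0, 0, 0, 1, 0]
model_labels = [1, 1, 1, 1, 0, 0, 0, 1]

result = compare_to_experts(expert_labels, model_labels)
print(result)
```

A high false positive rate paired with a low kappa would be consistent with a model that flags much non-hateful content as hate speech, although the qualitative error analysis described in the abstract is what explains why such errors occur.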

Authors

Jing Zeng

Qinghao Guan

Ariadna Matamoros Fernandez