Extant work has identified two discursive forms of racism: overt and covert. While both forms have received attention in scholarly work, research on covert racism has been limited. Its subtle and context-specific nature has made it difficult to systematically identify covert racism in text, especially in large corpora. In this article, we first propose a theoretically driven and generalizable process to identify and classify covert and overt racism in text. This process allows researchers to construct coding schemes and build labeled datasets. We then use the resulting dataset to train XLM-RoBERTa (XLM-R), a cross-lingual large language model (LLM) with a cutting-edge contextual understanding of text, for supervised classification. We show that XLM-R and XLM-R-Racismo, our pretrained model, outperform other state-of-the-art approaches in classifying racism in large corpora. We illustrate our approach using a corpus of tweets relating to the Ecuadorian indígena community between 2018 and 2021.
Gordillo et al. studied this question.
www.synapsesocial.com/papers/698c1bcd267fb587c655dbbc — DOI: https://doi.org/10.1177/00491241251412360
Diana Davila Gordillo
Joan C. Timoneda
Sebastián Vallejo Vera
Sociological Methods & Research
Purdue University West Lafayette
Western University
Leiden University