Abstract
This study explores the challenges and opportunities of using large language models (LLMs) for automatic annotation of offensive language in Hebrew. The analysis is based on a six-level taxonomy of offensive discourse (offensiveness, target, target presence, vulgarity, offense strength, and specific aspects), which enables systematic examination of nuanced Hebrew patterns. Several prompting strategies were tested, including few-shot prompting, role-based prompting, chain-of-thought, and LLM-as-judge, and their outputs were compared to human annotations using standard evaluation metrics and inter-annotator reliability measures. Findings reveal that salient categories such as explicit threats show strong interpretive stability, while ambiguous ones, such as discrediting attacks, require higher precision. The study also introduces a two-step classification method: the model first identifies the two most plausible categories, then selects the more accurate of the two. This approach reduces the model’s bias toward general categories and improves fine-grained classification. Overall, the study contributes by (1) offering a methodological framework for assessing LLMs’ interpretive limits compared to humans, and (2) laying the groundwork for building more refined datasets to advance Hebrew offensive language research.
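The two-step classification method described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the category names, the keyword lists, and the `toy_shortlist` and `toy_choose` functions are hypothetical stand-ins for the two LLM calls (one that shortlists two categories, one that picks between them), which the abstract does not specify.

```python
from typing import Callable, Sequence, Tuple

# Signatures for the two steps. In practice each would wrap an LLM prompt;
# here they are plain callables so the control flow can be shown on its own.
Shortlister = Callable[[str, Sequence[str]], Tuple[str, str]]
Chooser = Callable[[str, Tuple[str, str]], str]


def two_step_classify(
    text: str,
    categories: Sequence[str],
    shortlist: Shortlister,
    choose: Chooser,
) -> str:
    """Step 1: narrow to two candidate categories. Step 2: pick one of them."""
    first, second = shortlist(text, categories)
    if first == second or first not in categories or second not in categories:
        raise ValueError("shortlist must return two distinct known categories")
    label = choose(text, (first, second))
    if label not in (first, second):
        raise ValueError("final label must be one of the shortlisted candidates")
    return label


# --- Toy stand-ins (assumptions, for illustration only) ---------------------
# Hypothetical categories and keywords; the paper's label set is not given here.
KEYWORDS = {
    "threat": ("hurt", "kill"),
    "discredit": ("liar", "fraud"),
    "general_offense": ("idiot",),
}


def keyword_score(text: str, category: str) -> int:
    """Count keyword occurrences for a category (a crude proxy for a model score)."""
    lowered = text.lower()
    return sum(lowered.count(word) for word in KEYWORDS[category])


def toy_shortlist(text: str, categories: Sequence[str]) -> Tuple[str, str]:
    """Return the two highest-scoring categories."""
    ranked = sorted(categories, key=lambda c: keyword_score(text, c), reverse=True)
    return ranked[0], ranked[1]


def toy_choose(text: str, candidates: Tuple[str, str]) -> str:
    """Pick the higher-scoring of the two shortlisted categories."""
    return max(candidates, key=lambda c: keyword_score(text, c))


label = two_step_classify(
    "You are a liar and a fraud, you idiot",
    list(KEYWORDS),
    toy_shortlist,
    toy_choose,
)
print(label)
```

Restricting the second step to the two shortlisted candidates is what keeps the final decision from defaulting to a broad catch-all label; the validation checks simply reject outputs that fall outside that constraint.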
Chaya Liebeskind
Yael Yefet
Lodz Papers in Pragmatics
Jerusalem College of Technology
Liebeskind et al. studied this question.
www.synapsesocial.com/papers/69a52dbff1e85e5c73bf0cb2 — DOI: https://doi.org/10.1515/lpp-2025-0093