What does this research mean for the field?

LLM-assisted human-in-the-loop annotation improves the classification of offensive language in Hebrew by reducing the model's bias toward general categories and improving precision on fine-grained ones. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to enhance the classification of offensive language in Hebrew using large language models (LLMs) and human annotation.

March 2, 2026

Improving Hebrew offensive language classification using LLM-assisted human-in-the-loop annotation

Key Points

The study aims to improve the classification of offensive language in Hebrew by combining large language models (LLMs) with human annotation.
It develops a six-level taxonomy of offensive discourse: offensiveness, target, target presence, vulgarity, offense strength, and specific aspects.
It tests several prompting strategies: few-shot, role-based, chain-of-thought, and LLM-as-judge.
It compares LLM outputs to human annotations using standard evaluation metrics and inter-annotator reliability.
It introduces a two-step classification method: first identify the two most plausible categories, then select the more accurate one.
Salient categories, such as explicit threats, showed strong interpretive stability.
Ambiguous categories, such as discrediting attacks, required higher precision.
The two-step method reduced the model's bias toward general categories and improved fine-grained classification.
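
The comparison between LLM and human labels can be illustrated with a chance-corrected agreement score. The study does not name the reliability measure it used, so the sketch below assumes Cohen's kappa for two annotators; the function name and the toy labels are illustrative and are not taken from the paper's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical): a human annotator versus an LLM on four comments.
human = ["threat", "threat", "insult", "none"]
model = ["threat", "insult", "insult", "none"]
kappa = cohen_kappa(human, model)
```

A value near 1 indicates agreement well above chance, while a value near 0 indicates agreement no better than chance. Computing the score per category is one way to see which categories are stable and which are ambiguous.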

Abstract

This study explores the challenges and opportunities of using large language models (LLMs) for automatic annotation of offensive language in Hebrew. The analysis is based on a six-level taxonomy of offensive discourse – covering offensiveness, target, target presence, vulgarity, offense strength, and specific aspects – enabling systematic examination of nuanced Hebrew patterns. Several prompting strategies were tested, including few-shot, role-based prompting, chain-of-thought, and LLM-as-judge, and their outputs were compared to human annotations using standard evaluation metrics and inter-annotator reliability. Findings reveal that salient categories such as explicit threats show strong interpretive stability, while ambiguous ones, such as discrediting attacks, require higher precision. The study also introduces a two-step classification method: first identifying the two most plausible categories, then selecting the more accurate one. This approach reduces the model’s bias toward general categories and improves fine-grained classification. Overall, the study contributes by (1) offering a methodological framework to assess LLMs’ interpretive limits compared to humans, and (2) laying groundwork for building more refined datasets to advance Hebrew offensive language research.
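
The two-step method described in the abstract can be sketched as a small control flow: a first call shortlists two candidate categories, and a second call decides between only those two. This is a minimal sketch under stated assumptions, not the authors' implementation. The names `two_step_classify`, `toy_shortlist`, `toy_choose`, and `CUES`, the category "general insult", and the English cue words are all hypothetical; in practice each step would presumably be an LLM prompt, which is replaced here by keyword stubs so the example runs on its own.

```python
def two_step_classify(text, categories, shortlist_fn, choose_fn):
    """Step 1 shortlists two categories; step 2 picks one of those two."""
    shortlist = tuple(shortlist_fn(text, categories))
    if len(set(shortlist)) != 2 or any(c not in categories for c in shortlist):
        raise ValueError("step 1 must return two distinct known categories")
    final = choose_fn(text, shortlist)
    if final not in shortlist:
        raise ValueError("step 2 must return one of the shortlisted categories")
    return final


# Hypothetical cue words standing in for a model's judgement.
CUES = {
    "general insult": ("liar", "stupid", "fraud"),
    "discrediting attack": ("liar", "fraud", "incompetent"),
    "explicit threat": ("hurt", "kill"),
}
CATEGORIES = tuple(CUES)


def toy_shortlist(text, categories):
    """Stub for step 1: rank categories by matching cue words, keep the top two."""
    lowered = text.lower()
    scores = {c: sum(w in lowered for w in CUES[c]) for c in categories}
    ranked = sorted(categories, key=lambda c: scores[c], reverse=True)
    return ranked[0], ranked[1]


def toy_choose(text, pair):
    """Stub for step 2: compare the two candidates on cues unique to each."""
    lowered = text.lower()
    first, second = pair
    only_first = set(CUES[first]) - set(CUES[second])
    only_second = set(CUES[second]) - set(CUES[first])
    score_first = sum(w in lowered for w in only_first)
    score_second = sum(w in lowered for w in only_second)
    return first if score_first >= score_second else second


example = "He is a liar and a fraud, totally incompetent."
label = two_step_classify(example, CATEGORIES, toy_shortlist, toy_choose)
```

Restricting the second step to two candidates turns an open multi-class decision into a direct comparison, which is one plausible reading of how the method counters the pull toward general categories.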

Authors

Chaya Liebeskind

Yael Yefet

Journals

Lodz Papers in Pragmatics

Institutions

Jerusalem College of Technology
