March 25, 2026 | Open Access

Building an LLM-Powered Content Moderation Bot in the Classroom

Key Points

The study aims to explore the effectiveness and limitations of large language models for content moderation in a classroom context.
Students in a Stanford course developed Discord bots that use large language models for content moderation.
The authors interviewed 16 of the students to assess how the bots performed and what challenges arose.
LLMs demonstrated high accuracy in moderating content, often exceeding student expectations.
In disagreements between students and LLMs, the model's judgments were frequently validated upon closer analysis.
Students noted challenges including sensitivity to prompt phrasing and contextual interpretation issues.

Abstract

As online platforms seek to improve content-moderation strategies, large language models (LLMs) may be a potential tool. This study examines opportunities and limitations of LLM-powered moderation through a unique lens: student projects for a Stanford University course titled Trust and Safety. In this course, students developed Discord bots using LLMs to moderate specific types of harmful content. Interviews with 16 of the students suggest that these models demonstrate high accuracy, often exceeding students’ expectations. Notably, in cases of disagreement between the student and the model, closer analysis frequently validated the model’s judgments. However, students also observed limitations: LLMs proved unhelpfully sensitive to prompt phrasing and exhibited many contextual interpretation challenges common to human moderators and traditional machine-learning classifiers.

Cite this study

Grossman, S., Mensah, A., and Stamos, A. Building an LLM-Powered Content Moderation Bot in the Classroom. PS: Political Science & Politics.

www.synapsesocial.com/papers/69c37b41b34aaaeb1a67d86c — DOI: https://doi.org/10.1017/s1049096526101929

Authors

Shelby Grossman

Anthony Mensah

Alex Stamos

Journals

PS: Political Science & Politics

Institutions

Stanford University

Arizona State University
