Despite rising concerns about sycophancy—excessive agreement or flattery from artificial intelligence (AI) systems—little is known about its prevalence or consequences. We show that sycophancy is widespread and harmful. Across 11 state-of-the-art models, AI affirmed users’ actions 49% more often than humans, even when queries involved deception, illegality, or other harms. In three preregistered experiments ( N = 2405), even a single interaction with sycophantic AI reduced participants’ willingness to take responsibility and repair interpersonal conflicts, while increasing their conviction that they were right. Despite distorting judgment, sycophantic models were trusted and preferred. This creates perverse incentives for sycophancy to persist: The very feature that causes harm also drives engagement. Our findings underscore the need for design, evaluation, and accountability mechanisms to protect user well-being.
Building similarity graph...
Analyzing shared references across papers
Loading...
Myra Cheng
Cinoo Lee
Pranav Khadpe
Science
Stanford University
Carnegie Mellon University
Building similarity graph...
Analyzing shared references across papers
Loading...
Cheng et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69c76fff8bbfbc51511e04d5 — DOI: https://doi.org/10.1126/science.aec8352
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: