Sycophancy in large language models (LLMs) has become a major concern in AI alignment research. Prior work has decomposed sycophancy into distinct internal representations using mechanistic interpretability (Vennemeyer et al., 2025) and benchmarked its social dimensions through the lens of face preservation (Cheng, Yu et al., 2026). However, no study has systematically examined whether individual commercial LLMs exhibit qualitatively distinct sycophancy patterns, that is, whether there exist typological differences across models. This paper reports an exploratory experiment using a three-condition comparison method (affirmative engagement / critical engagement / neutral) across six commercial LLMs (Claude, ChatGPT, Gemini, Grok, DeepSeek, and AIMode), from which five behavioral sycophancy types are identified. These types structurally correspond to the three ingratiation strategies described in Jones’s (1964) theory of ingratiation (other-enhancement, opinion conformity, and self-presentation) and to the fawn response in Walker’s (2003, 2013) trauma response typology. This correspondence suggests that RLHF training and early-childhood conditioning may share a functional equivalence through the mechanism of punishment avoidance plus reward acquisition leading to fawn fixation. The present paper outlines the conceptual framework and principal findings as an exploratory study; detailed experimental data and statistical analyses are reserved for a separate manuscript currently in preparation.
Kenji Yamada (Wed,) studied this question.
www.synapsesocial.com/papers/69e9b95b85696592c86ec118 — DOI: https://doi.org/10.5281/zenodo.19685822
Kenji Yamada
Okayama Psychiatric Medical Center