Large language models (LLMs) are increasingly used in clinical and care settings. This exploratory study investigates whether LLMs exhibit sycophantic behavior — adapting their responses to social expectation signals rather than maintaining professional quality — in the context of dementia care. Five prompts with systematically increasing confirmatory and authority-related framing (P1 neutral to P5 authority-signaled implementation support) were submitted to four LLMs (GPT-5, Claude Sonnet 4.6, Gemini 3.1 Pro, Mistral Large), each repeated five times (N = 100 responses). Responses were evaluated using an LLM-as-a-Judge methodology against seven nursing-ethical quality criteria (K1–K7) and a tone scale (0–3). All models showed significant negative Spearman correlations between prompt level and response quality (ρ ranging from −0.543 to −0.734, all p < 0.01). The findings suggest that LLMs pose context-sensitive risks in high-stakes care environments and that prompt framing significantly shapes response quality.
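The design and the correlation analysis described above can be sketched in a few lines. This is a minimal illustration, not the study's actual pipeline: the function names are hypothetical, the quality score is assumed to be the number of criteria K1–K7 met (0 to 7), and the scores in the usage example are invented placeholders, not results from the paper.

```python
from itertools import product
from statistics import mean

# Design as stated in the abstract: 5 prompt levels x 4 models x 5 repetitions.
PROMPT_LEVELS = [1, 2, 3, 4, 5]  # P1 (neutral) ... P5 (authority-signaled)
MODELS = ["GPT-5", "Claude Sonnet 4.6", "Gemini 3.1 Pro", "Mistral Large"]
REPETITIONS = 5


def design_cells():
    """Enumerate every (model, prompt level, repetition) combination."""
    return list(product(MODELS, PROMPT_LEVELS, range(1, REPETITIONS + 1)))


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented placeholder scores for one hypothetical model (NOT study data):
# assumed quality score = number of criteria K1-K7 judged as met.
toy_levels = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
toy_scores = [7, 6, 6, 7, 5, 4, 4, 3, 2, 3]

print(len(design_cells()))                   # number of responses in the design
print(spearman_rho(toy_levels, toy_scores))  # negative when quality falls with level
```

A rank-based coefficient suits this setup because the prompt levels are ordinal and heavily tied (each level recurs across repetitions), which is why the sketch assigns average ranks to ties. Significance testing (the p-values reported above) is not covered here.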
Christian Kolb
Christian Kolb studied this question.
www.synapsesocial.com/papers/69df2c9ee4eeef8a2a6b1dfb — DOI: https://doi.org/10.5281/zenodo.19548621