What question did this study set out to answer?

The research aims to investigate how LLMs operate during high-stakes scenarios, focusing on ethical consistency and cooperation.

April 17, 2026 · Open Access

Crisis as catalyst: evaluating ethical consistency and cooperation in LLMs under high-stakes scenarios

Key Points

The authors used established psychological scales to evaluate LLMs in simulated emergencies.
Assessments covered two crisis scenarios: natural disaster resource allocation and crowd panic response.
Three LLMs (gpt-4o, DeepSeek-V3, and DeepSeek-R1) were analyzed on both quantitative decisions and qualitative justifications.
LLMs reproduced broad human-like preferences, such as cooperation over competition, but varied systematically in ethical trade-offs.
LLM decision distributions were “flattened” relative to human responses.
LLMs were less sensitive than humans to social variables such as expected future interaction, pointing to context-dependent value misalignment risks.

Abstract

As large language models (LLMs) integrate into critical decision-making, their alignment with human values in high-stakes scenarios remains unclear. This study systematically investigates LLM behavioral consistency, focusing on cooperative intent, resource distribution, and moral reasoning, under simulated emergencies. We employed established psychological scales in two crisis scenarios: natural disaster resource allocation and crowd panic response. We use “catalyst” metaphorically: crisis framings serve as an observational stress test that amplifies and reveals latent behavioral trade-offs in LLMs rather than improving the models. Using a standardized API framework, we evaluated three primary LLMs (gpt-4o, DeepSeek-V3, and DeepSeek-R1) across repeated trials, analyzing both quantitative decisions and qualitative justifications. Results reveal that while LLMs reproduce broad human-like preferences (e.g., cooperation over competition), they exhibit systematic variations in ethical trade-offs and “flattened” decision distributions. Models differed significantly in cooperative framing and showed attenuated sensitivity to social variables (e.g., future interaction expectations) compared to humans. These findings advance computational crisis management and AI ethics, demonstrating context-dependent value misalignment risks. We propose a novel framework for evaluating behavioral consistency in silicon-based agents during crises, offering critical methodological and ethical guidance for deploying LLMs in socially complex, high-stakes environments.
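The abstract does not say how “flattened” decision distributions were measured. As a minimal sketch of one possible reading, the snippet below scores how evenly responses spread across the points of a rating scale using normalized Shannon entropy, where values near 1 mean a nearly uniform (flat) spread and values near 0 mean responses concentrate on one option. The function name `normalized_entropy`, the 5-point scale, and both count lists are assumptions made for illustration; they are not the authors' metric or data.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a response histogram, scaled to the range [0, 1].

    `counts` holds how many responses fell on each point of a rating scale.
    This is an illustrative flatness score, not the study's own measure.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        raise ValueError("need at least two categories and one response")
    # Skip empty categories: their contribution to the entropy is zero.
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum possible entropy (uniform over all categories).
    return entropy / math.log(len(counts))


# Hypothetical counts over a 5-point scale (NOT data from the study).
peaked_counts = [2, 5, 10, 50, 33]   # responses cluster on one option
flat_counts = [18, 20, 22, 21, 19]   # responses spread almost evenly

if __name__ == "__main__":
    print("peaked:", round(normalized_entropy(peaked_counts), 3))
    print("flat:  ", round(normalized_entropy(flat_counts), 3))
```

A score like this only captures spread across categories. If “flattened” instead refers to reduced variance or weaker response to experimental conditions, a different statistic would be needed.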

Cite this study

Yang et al. (2026) studied this question.

www.synapsesocial.com/papers/69e1cfb15cdc762e9d858b40 — DOI: https://doi.org/10.1057/s41599-026-07194-z

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

Value Creation in Inter-Organizational Collaboration: An Empirical Study · 2016 · 125 citations
Reflexion: language agents with verbal reinforcement learning · 2023 · 90 citations
Designing Resilience · 2010 · 269 citations
Organising for Effective Emergency Management: Lessons from Research · 2010 · 263 citations
Differences Between Tight and Loose Cultures: A 33-Nation Study · 2011 · 3,162 citations

Authors

Aoxiang Yang

Mingyang ZUO

Rui PENG

Journals

Humanities and Social Sciences Communications

Institutions

Tsinghua University

Communication University of China
