What question did this study set out to answer?

This research investigates how natural language explanations from large language models influence human-AI performance during collaborative tasks.

March 29, 2026 · Open Access

The Persuasion Paradox: When LLM Explanations Fail to Improve Human-AI Team Performance

Key Points

The authors conducted three controlled human-subject studies covering abstract visual reasoning and deductive logical reasoning tasks.
The studies used a multi-stage reveal design and between-subjects comparisons to separate the effects of AI predictions from those of explanations.
They compared LLM explanations against predicted probabilities and selective automation in terms of task performance.
In visual reasoning tasks, explanations increased user confidence but did not improve accuracy beyond the AI prediction alone.
Explanations also substantially suppressed users' ability to recover from AI errors in visual reasoning tasks.
In contrast, in language-based logical reasoning tasks, LLM explanations yielded the highest accuracy and recovery rates, outperforming both expert-written explanations and probability-based support.

Abstract

While natural-language explanations from large language models (LLMs) are widely adopted to improve transparency and trust, their impact on objective human-AI team performance remains poorly understood. We identify a Persuasion Paradox: fluent explanations systematically increase user confidence and reliance on AI without reliably improving, and in some cases undermining, task accuracy. Across three controlled human-subject studies spanning abstract visual reasoning (RAVEN matrices) and deductive logical reasoning (LSAT problems), we disentangle the effects of AI predictions and explanations using a multi-stage reveal design and between-subjects comparisons. In visual reasoning, LLM explanations increase confidence but do not improve accuracy beyond the AI prediction alone, and substantially suppress users’ ability to recover from AI errors. Interfaces exposing model uncertainty via predicted probabilities, as well as a selective automation policy that defers uncertain cases to humans, achieve significantly higher accuracy and error recovery than explanation-based interfaces. In contrast, for language-based logical reasoning tasks, LLM explanations yield the highest accuracy and recovery rates, outperforming both expert-written explanations and probability-based support. This divergence reveals that the effectiveness of narrative explanations is strongly task-dependent and mediated by cognitive modality. Our findings demonstrate that commonly used subjective metrics such as trust, confidence, and perceived clarity are poor predictors of human-AI team performance. Rather than treating explanations as a universal solution, we argue for a shift toward interaction designs that prioritize calibrated reliance and effective error recovery over persuasive fluency.
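The abstract mentions a selective automation policy that defers uncertain cases to humans, and reports accuracy and error recovery as outcome measures. The following is a minimal sketch of how such a policy and these two measures might be computed. It is not the authors' implementation: the `Trial` structure, the 0.8 confidence threshold, the toy data, and the definition of recovery rate (share of AI-wrong trials where the final answer is correct) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    truth: str            # correct answer for the item
    ai_answer: str        # answer proposed by the AI
    ai_confidence: float  # AI's predicted probability for its own answer
    human_answer: str     # answer the human gives when deciding alone (hypothetical)


def selective_automation(trial: Trial, threshold: float = 0.8) -> str:
    """Accept the AI answer when it is confident; otherwise defer to the human.

    The threshold is an illustrative assumption, not a value from the study.
    """
    if trial.ai_confidence >= threshold:
        return trial.ai_answer
    return trial.human_answer


def accuracy(trials: List[Trial], finals: List[str]) -> float:
    """Fraction of trials whose final team answer matches the truth."""
    return sum(f == t.truth for t, f in zip(trials, finals)) / len(trials)


def recovery_rate(trials: List[Trial], finals: List[str]) -> Optional[float]:
    """Among trials where the AI was wrong, fraction with a correct final answer.

    Returns None when the AI made no errors (the rate is undefined).
    """
    ai_wrong = [(t, f) for t, f in zip(trials, finals) if t.ai_answer != t.truth]
    if not ai_wrong:
        return None
    return sum(f == t.truth for t, f in ai_wrong) / len(ai_wrong)


# Toy data, invented purely to exercise the functions above.
trials = [
    Trial(truth="A", ai_answer="A", ai_confidence=0.95, human_answer="B"),
    Trial(truth="C", ai_answer="D", ai_confidence=0.55, human_answer="C"),
    Trial(truth="B", ai_answer="A", ai_confidence=0.90, human_answer="B"),
    Trial(truth="D", ai_answer="D", ai_confidence=0.60, human_answer="D"),
]
finals = [selective_automation(t) for t in trials]
team_accuracy = accuracy(trials, finals)
team_recovery = recovery_rate(trials, finals)
```

In this toy run, the low-confidence AI error is deferred to the human and recovered, while the high-confidence AI error is accepted automatically and not recovered. This illustrates why such a policy depends on the model's confidence being well calibrated.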

Cite this study

Cohen et al. studied this question.

www.synapsesocial.com/papers/69c8c324de0f0f753b39dbbd — DOI: https://doi.org/10.5281/zenodo.19254149

Authors

Ruth Cohen

Lu Feng

Ayala Bloch

Institutions

University of Virginia

Bar-Ilan University

Ariel University
