What question did this study set out to answer?

This work aims to quantitatively measure cognitive biases in large language models using a systematic experimental framework.

June 4, 2026 · Open Access

Cognitive Biases in Large Language Models: A Systematic Quantitative Assessment and Debiasing Analysis

Key Points

Introduced the Bias Strength Index for bias quantification.
Evaluated eleven cognitive biases across eight state-of-the-art LLMs with N=100 independent trials per configuration.
Employed a Generalized Linear Mixed-Effects Model analysis to assess bias effects across testable bias-model combinations.
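Statistically significant bias effects were observed in 27 of 43 testable bias-model combinations (62.8%) after multiple-comparison correction; a more conservative variant-level test yielded only one significant result.
Framing and primacy/recency effects were near-universal across models, while susceptibility to other biases varied substantially across model families.
Three inference-time debiasing strategies (zero-shot chain-of-thought, adversarial counter-prompting, and role-based prompting) differed in effectiveness depending on the bias type.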
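
To illustrate what "significant after multiple-comparison correction" means when many bias-model combinations are tested at once, here is a minimal sketch of the Holm step-down procedure. The summary does not say which correction the authors used, so the choice of Holm is an assumption, and the p-values below are hypothetical placeholders, not results from the paper.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down correction (assumed here; the paper's method is not stated).

    Returns a list of booleans, True where the null hypothesis is rejected.
    """
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The threshold loosens as more hypotheses are rejected.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values for four bias-model combinations.
flags = holm_reject([0.001, 0.04, 0.03, 0.20])
print(flags, f"{sum(flags)} of {len(flags)} significant")

# The reported share follows directly from the counts in the text.
print(f"{27 / 43:.1%}")  # 62.8%
```

Note that two of the hypothetical p-values fall below 0.05 on their own yet are not rejected after correction, which is why the corrected count is the one worth reporting.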

Abstract

Large Language Models (LLMs) are increasingly deployed in decision-support systems across high-stakes domains, yet their susceptibility to cognitive biases—systematic deviations from rational judgment well-documented in human psychology—remains poorly understood in quantitative terms. Existing studies typically examine a narrow set of biases, test a single model family, and rely on qualitative assessments of bias presence. In this work, we present a rigorous experimental framework, inspired by the methodology of experimental physics, for the systematic quantitative measurement of cognitive biases in LLMs. We introduce the Bias Strength Index (BSI), a normalized metric with associated confidence intervals that quantifies the magnitude of bias on a continuous scale, and we decompose the total uncertainty into statistical and systematic components—the latter arising from prompt reformulation. We evaluate a comprehensive taxonomy of eleven cognitive biases (including anchoring, framing effect, confirmation bias, availability heuristic, sunk cost fallacy, bandwagon effect, status quo bias, and others) across eight state-of-the-art LLMs from seven families: GPT-4.1 Mini, Claude 3.5 Sonnet, Gemini 2.5 Flash, Llama 3.3 70B, Llama 3.1 8B, Mistral Large (mistral-large-2411), DeepSeek V3, and MiniMax M2.5. Each bias is probed through multiple semantically equivalent prompt variants, with N = 100 independent trials per configuration, yielding a dataset of over 70,000 model responses. Our results reveal that all tested models exhibit non-zero bias effects for multiple bias categories, though with markedly different profiles. 
A trial-level Generalized Linear Mixed-Effects Model (GLMM) analysis finds statistically significant bias effects in 27 of 43 testable bias–model combinations (62.8%) after multiple-comparison correction, while a more conservative variant-level test—which requires effects to generalize across prompt formulations—yields only one significant result, highlighting the dominant role of prompt-induced systematic uncertainty. Framing and primacy/recency effects are near-universal, while susceptibility to other biases varies substantially across model families. We further evaluate three debiasing strategies—zero-shot chain-of-thought, adversarial counter-prompting, and role-based prompting—applied at inference time without modifying model weights. Our findings provide a quantitative foundation for auditing cognitive biases in LLMs and highlight the bias-dependent effectiveness of prompt-based debiasing techniques.
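
The split between statistical and systematic uncertainty described above can be sketched in a few lines. The abstract does not give the BSI formula, so this example assumes a simple stand-in: the difference in the proportion of bias-consistent responses between a treatment and a control prompt, averaged over prompt variants. The function name, the input layout, and the counts are all hypothetical; the spread across variants is used as a rough proxy for the prompt-induced systematic component.

```python
import math
from statistics import mean, stdev


def bsi_with_uncertainty(variants, n_trials=100):
    """Illustrative BSI-like estimate (NOT the paper's definition).

    variants: list of (k_control, k_treatment) pairs, each the number of
    bias-consistent responses out of n_trials for one prompt variant.
    Returns (bsi, stat_err, sys_err, total_err).
    """
    effects, variances = [], []
    for k_control, k_treatment in variants:
        p_c = k_control / n_trials
        p_t = k_treatment / n_trials
        effects.append(p_t - p_c)
        # Binomial sampling variance of a difference of two proportions.
        variances.append(p_c * (1 - p_c) / n_trials + p_t * (1 - p_t) / n_trials)

    bsi = mean(effects)
    # Statistical error of the mean over independent variants.
    stat_err = math.sqrt(sum(variances)) / len(variants)
    # Systematic error: spread of the effect across prompt reformulations.
    sys_err = stdev(effects) if len(effects) > 1 else 0.0
    # Combine in quadrature, as is conventional in experimental physics.
    total_err = math.hypot(stat_err, sys_err)
    return bsi, stat_err, sys_err, total_err


# Hypothetical counts for three semantically equivalent prompt variants.
bsi, stat_err, sys_err, total_err = bsi_with_uncertainty([(50, 70), (50, 60), (50, 80)])
print(f"BSI = {bsi:.3f} ± {stat_err:.3f} (stat) ± {sys_err:.3f} (sys)")
```

With these made-up counts the systematic term is the larger of the two, mirroring the abstract's observation that an effect can be clear at the trial level yet fail to generalize across prompt formulations. The between-variant spread also contains some sampling noise, so a careful analysis would separate the two more rigorously.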
