May 29, 2024 · Open Access

LLMs achieve adult human performance on higher-order theory of mind tasks

Abstract

This paper examines the extent to which large language models (LLMs) have developed higher-order theory of mind (ToM): the human ability to reason about multiple mental and emotional states in a recursive manner (e.g. I think that you believe that she knows). This paper builds on prior work by introducing a handwritten test suite -- Multi-Order Theory of Mind Q&A -- and using it to compare the performance of five LLMs to a newly gathered adult human benchmark. We find that GPT-4 and Flan-PaLM reach adult-level and near adult-level performance on ToM tasks overall, and that GPT-4 exceeds adult performance on 6th order inferences. Our results suggest that there is an interplay between model size and finetuning for the realisation of ToM abilities, and that the best-performing LLMs have developed a generalised capacity for ToM. Given the role that higher-order ToM plays in a wide range of cooperative and competitive human behaviours, these findings have significant implications for user-facing LLM applications.

Cite this study

Street et al. (2024) studied this question.

www.synapsesocial.com/papers/68e67cc7b6db643587606e3f — DOI: https://doi.org/10.48550/arxiv.2405.18870

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

Testing theory of mind in large language models and humans · 2024 · 214 citations
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses · 2024 · 1 citation
Theory of Mind Abilities of Large Language Models in Human-Robot Interaction: An Illusion? · 2024 · 17 citations
Re-evaluating Theory of Mind evaluation in large language models · 2025 · 4 citations
ToMBench: Benchmarking Theory of Mind in Large Language Models

Authors

Winnie Street

John Oliver Siy

Geoff Keeling
