Abstract
Emerging research seeks to draw neuroscientific insights from the neural predictivity of large language models (LLMs). However, as results rapidly proliferate, there is a growing need for large-scale assessments of their robustness. Here, we analyze a wide range of models and methodological approaches across three widely used neural datasets. We find that the use of shuffled train-test splits has contributed to findings that are influential but spurious. Furthermore, how activations are extracted from LLMs can bias results in favor of specific model classes. Lastly, we find that confounding variables, particularly positional signals and word rate, perform competitively with trained LLMs and fully account for the neural predictivity of untrained LLMs on these neural datasets. Although many studies in the field avoid these pitfalls, our results indicate that some apparent alignment between LLMs and brains has emerged from non-robust methods and overlooked confounds.
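The concern about shuffled train-test splits can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 20% test fraction, and the "adjacent sample" measure are assumptions chosen for illustration. The sketch shows that when samples are shuffled before splitting, nearly every test sample sits next to a training sample in time, so temporally autocorrelated signals can leak across the split; holding out a contiguous block largely avoids this.

```python
import numpy as np


def shuffled_split(n_samples, test_frac=0.2, seed=0):
    """Randomly assign samples to train/test, ignoring temporal order."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])


def contiguous_split(n_samples, test_frac=0.2):
    """Hold out one unbroken block at the end of the recording."""
    n_test = int(round(n_samples * test_frac))
    idx = np.arange(n_samples)
    return idx[: n_samples - n_test], idx[n_samples - n_test:]


def frac_test_adjacent_to_train(train_idx, test_idx):
    """Fraction of test samples whose immediate temporal neighbor is in train.

    Hypothetical leakage proxy: a high value means most held-out samples
    are flanked by training samples recorded one step earlier or later.
    """
    train = set(train_idx.tolist())
    hits = [((t - 1) in train) or ((t + 1) in train) for t in test_idx.tolist()]
    return float(np.mean(hits))


if __name__ == "__main__":
    n = 100
    tr_s, te_s = shuffled_split(n)
    tr_c, te_c = contiguous_split(n)
    print("shuffled  :", frac_test_adjacent_to_train(tr_s, te_s))
    print("contiguous:", frac_test_adjacent_to_train(tr_c, te_c))
```

In the contiguous case only the first held-out sample borders the training data, whereas in the shuffled case most held-out samples do. Real analyses would typically also leave a gap between blocks or split by story or session, which this sketch does not attempt.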
Nima Hadidi
Ebrahim Feghhi
Bryan H Song
Nature Communications
University of California, Los Angeles
Hadidi et al. studied this question.
www.synapsesocial.com/papers/69f154e0879cb923c4945288 — DOI: https://doi.org/10.1038/s41467-026-72253-7