April 30, 2026 · Open Access

Cognitive Grounding for Perspective Integration in Multi-LLM Systems

Key Points

The study asks whether structured collaboration between multiple large language models yields benefits beyond simple answer aggregation.
It introduces the Parallel Synthesis architecture, built on four specialized cognitive roles: Analyzer, Creative, Critic, and Synthesizer.
Experiments cover the AI2 Reasoning Challenge (1172 questions) and the Massive Multitask Language Understanding benchmark (1531 questions).
The Emergent Reasoning Score (ERS) measures perspective integration and novel concept generation as separate components.
Synthesis Effectiveness ranged from SE=0.711 to 0.744, indicating reliable perspective integration.
Emergent Value remained low at EV=0.096 to 0.112, suggesting limited generation of new concepts.
A Majority Voting baseline achieved comparable or slightly higher accuracy than the Synthesizer, indicating that the architecture's contribution lies in integrated reasoning traces rather than answer selection.

Abstract

This paper investigates whether structured collaboration between multiple large language models (LLMs), each assigned a distinct cognitive role grounded in psychological theory, produces benefits beyond simple answer aggregation. We propose the Parallel Synthesis architecture, in which three cognitively specialized roles, Analyzer (hierarchical decomposition), Creative (divergent thinking), and Critic (critical evaluation), process each task independently and in parallel, and a Synthesizer integrates their outputs into a final response. To evaluate collaborative reasoning, we introduce the Emergent Reasoning Score (ERS), a composite metric that separates perspective integration (Synthesis Effectiveness) from novel concept generation (Emergent Value). Experiments on the AI2 Reasoning Challenge (ARC-Challenge) (1172 questions) and the Massive Multitask Language Understanding benchmark (MMLU) (1531 questions) show two consistent findings. First, the architecture achieves high Synthesis Effectiveness (SE=0.711–0.744), indicating reliable integration of all three cognitive perspectives. Second, Emergent Value remains low (EV=0.096–0.112), indicating that synthesis primarily recombines existing concepts rather than generating substantial novel content. A Majority Voting baseline achieves comparable or slightly higher answer accuracy than the Synthesizer on both benchmarks, showing that the architecture’s main contribution lies not in answer selection but in producing integrated reasoning traces that draw on multiple perspectives. These findings suggest that the practical value of cognitively grounded multi-agent architectures lies in reliable perspective integration, while ERS provides a reusable framework for distinguishing integration from genuinely novel reasoning in multi-agent LLM systems.
The empirical results reported here constitute a pilot validation of the proposed framework on closed-form benchmarks, intended to establish a proof of concept and motivate larger-scale evaluation.
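
The control flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the three role functions are hypothetical stubs standing in for LLM calls, the returned answers and rationales are placeholder values, and the `synthesize` function simply concatenates rationales where the paper's Synthesizer would be a further model call. The ERS formula is not given in this summary, so it is not reproduced here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

RoleOutput = Dict[str, str]

# Hypothetical stubs: in a real system each would prompt an LLM with a role-specific instruction.
def analyzer(question: str) -> RoleOutput:
    return {"answer": "B", "rationale": "decomposed the question into sub-steps"}

def creative(question: str) -> RoleOutput:
    return {"answer": "B", "rationale": "explored an alternative framing"}

def critic(question: str) -> RoleOutput:
    return {"answer": "C", "rationale": "challenged an assumption in option B"}

ROLES: Dict[str, Callable[[str], RoleOutput]] = {
    "Analyzer": analyzer,
    "Creative": creative,
    "Critic": critic,
}

def run_roles_in_parallel(question: str) -> Dict[str, RoleOutput]:
    """Run every role on the same question independently and in parallel."""
    with ThreadPoolExecutor(max_workers=len(ROLES)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in ROLES.items()}
        return {name: future.result() for name, future in futures.items()}

def majority_vote(outputs: Dict[str, RoleOutput]) -> str:
    """Baseline: pick the most frequent answer and ignore the rationales."""
    counts = Counter(out["answer"] for out in outputs.values())
    return counts.most_common(1)[0][0]

def synthesize(outputs: Dict[str, RoleOutput]) -> RoleOutput:
    """Placeholder Synthesizer: joins all rationales into one trace.

    Assumption: the answer is taken from the vote here only to keep the
    sketch runnable; the paper's Synthesizer produces its own answer.
    """
    trace = " | ".join(f"{name}: {out['rationale']}" for name, out in outputs.items())
    return {"answer": majority_vote(outputs), "trace": trace}

if __name__ == "__main__":
    outputs = run_roles_in_parallel("Which option is correct?")
    print(majority_vote(outputs))
    print(synthesize(outputs)["trace"])
```

The sketch makes the abstract's distinction concrete: the voting baseline returns only an answer, whereas the synthesis step also returns a trace that draws on all three roles.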

Authors

Lev Sukherman

Yetunde Longe-Folajimi

Marina Konkol

Journals

Computers

Institutions

Worcester Polytechnic Institute

Wentworth Institute of Technology

Moscow State Institute of International Relations
