This paper investigates whether structured collaboration between multiple large language models (LLMs), each assigned a distinct cognitive role grounded in psychological theory, produces benefits beyond simple answer aggregation. We propose the Parallel Synthesis architecture, in which three cognitively specialized roles, Analyzer (hierarchical decomposition), Creative (divergent thinking), and Critic (critical evaluation), process each task independently and in parallel, and a Synthesizer integrates their outputs into a final response. To evaluate collaborative reasoning, we introduce the Emergent Reasoning Score (ERS), a composite metric that separates perspective integration (Synthesis Effectiveness, SE) from novel concept generation (Emergent Value, EV). Experiments on the AI2 Reasoning Challenge (ARC-Challenge; 1172 questions) and the Massive Multitask Language Understanding benchmark (MMLU; 1531 questions) show two consistent findings. First, the architecture achieves high Synthesis Effectiveness (SE=0.711–0.744), indicating reliable integration of all three cognitive perspectives. Second, Emergent Value remains low (EV=0.096–0.112), indicating that synthesis primarily recombines existing concepts rather than generating substantial novel content. A Majority Voting baseline achieves comparable or slightly higher answer accuracy than the Synthesizer on both benchmarks, showing that the architecture’s main contribution lies not in answer selection but in producing integrated reasoning traces that draw on multiple perspectives. These findings suggest that the practical value of cognitively grounded multi-agent architectures lies in reliable perspective integration, while ERS provides a reusable framework for distinguishing integration from genuinely novel reasoning in multi-agent LLM systems.
The empirical results reported here constitute a pilot validation of the proposed framework on closed-form benchmarks, intended to establish a proof of concept and motivate larger-scale evaluation.
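The control flow described in the abstract (three roles answering independently and in parallel, followed by either a Synthesizer call or a Majority Voting baseline) can be illustrated with a minimal sketch. This is not the authors' implementation: the role prompts, the `Model` callable signature, and `stub_model` are hypothetical placeholders standing in for real LLM calls, and the ERS computation is omitted because its formula is not given here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical role prompts; the prompts used in the paper are not shown here.
ROLE_PROMPTS: Dict[str, str] = {
    "Analyzer": "Decompose the problem hierarchically, then answer.",
    "Creative": "Explore unconventional angles, then answer.",
    "Critic": "Critically evaluate each option, then answer.",
}

# Assumed interface: (system prompt, user text) -> answer text.
Model = Callable[[str, str], str]


def run_roles(question: str, model: Model) -> Dict[str, str]:
    """Run the three roles independently and in parallel."""
    with ThreadPoolExecutor(max_workers=len(ROLE_PROMPTS)) as pool:
        futures = {
            role: pool.submit(model, prompt, question)
            for role, prompt in ROLE_PROMPTS.items()
        }
        return {role: fut.result() for role, fut in futures.items()}


def majority_vote(answers: Dict[str, str]) -> str:
    """Baseline: most common answer; ties go to the first one seen."""
    return Counter(answers.values()).most_common(1)[0][0]


def synthesize(question: str, answers: Dict[str, str], model: Model) -> str:
    """Synthesizer: one further model call that sees all role outputs."""
    joined = "\n".join(f"{role}: {text}" for role, text in answers.items())
    return model(
        "Integrate the perspectives into one final answer.",
        f"{question}\n\n{joined}",
    )


def stub_model(system_prompt: str, user_text: str) -> str:
    """Deterministic stand-in for an LLM call, for illustration only."""
    if system_prompt.startswith("Integrate"):
        return "B"
    return "A" if "unconventional" in system_prompt else "B"


if __name__ == "__main__":
    role_answers = run_roles("Which option is correct?", stub_model)
    print(role_answers)
    print("vote:", majority_vote(role_answers))
    print("synthesis:", synthesize("Which option is correct?", role_answers, stub_model))
```

In this sketch the baseline and the Synthesizer consume the same three role outputs, which mirrors the comparison in the abstract: the vote selects an answer, while the synthesis step is where an integrated reasoning trace would be produced.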
Lev Sukherman
Yetunde Longe-Folajimi
Marina Konkol
Computers
Worcester Polytechnic Institute
Wentworth Institute of Technology
Moscow State Institute of International Relations
Sukherman et al. (Mon,) studied this question.
www.synapsesocial.com/papers/69f2f2221e5f7920c6387921 — DOI: https://doi.org/10.3390/computers15050277