What type of study is this?

This is a quantitative study.

September 29, 2025

Open Access

ReviewAgents: Bridging the Gap Between Human and AI-Generated Paper Reviews

Key points

ReviewAgents narrows the gap between AI and human reviews, enhancing accuracy and comprehensiveness.
A novel dataset, Review-CoT, with 142k comments serves to train LLM agents for structured reasoning.
Experimental results indicate that ReviewAgents outperforms existing LLMs in generating review comments.
The ReviewBench benchmark evaluates LLM-generated comments and shows that a gap remains relative to human reviewers.

Abstract

Academic paper review is a critical yet time-consuming task within the research community. With the increasing volume of academic publications, automating the review process has become a significant challenge. The primary issue lies in generating comprehensive, accurate, and reasoning-consistent review comments that align with human reviewers' judgments. In this paper, we address this challenge by proposing ReviewAgents, a framework that leverages large language models (LLMs) to generate academic paper reviews. We first introduce a novel dataset, Review-CoT, consisting of 142k review comments, designed for training LLM agents. This dataset emulates the structured reasoning process of human reviewers: summarizing the paper, referencing relevant works, identifying strengths and weaknesses, and generating a review conclusion. Building upon this, we train LLM reviewer agents capable of structured reasoning using a relevant-paper-aware training method. Furthermore, we construct ReviewAgents, a multi-role, multi-LLM agent review framework, to enhance the review comment generation process. Additionally, we propose ReviewBench, a benchmark for evaluating the review comments generated by LLMs. Our experimental results on ReviewBench demonstrate that while existing LLMs exhibit a certain degree of potential for automating the review process, there remains a gap when compared to human-generated reviews. Moreover, our ReviewAgents framework further narrows this gap, outperforming advanced LLMs in generating review comments.
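The structured reasoning process named in the abstract (summarize the paper, reference relevant works, identify strengths and weaknesses, write a conclusion) can be pictured as a staged pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `StructuredReview`, `review_paper`, and the `LLM` callable type are hypothetical, the prompts are placeholders, and neither the multi-role agent framework nor the relevant-paper-aware training method is modeled.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model: any function that maps a
# prompt string to a completion string. Not an API from the paper.
LLM = Callable[[str], str]


@dataclass
class StructuredReview:
    """One review, split into the stages listed in the abstract."""
    summary: str
    relevant_works: List[str]
    strengths: str
    weaknesses: str
    conclusion: str


def review_paper(paper_text: str, relevant_works: List[str], llm: LLM) -> StructuredReview:
    """Run the four stages in order; later stages see earlier outputs."""
    # Stage 1: summarize the paper.
    summary = llm(f"Summarize this paper:\n{paper_text}")

    # Stage 2: put the supplied relevant works into the shared context.
    context = f"Summary:\n{summary}\nRelevant works:\n" + "\n".join(relevant_works)

    # Stage 3: strengths and weaknesses, each conditioned on that context.
    strengths = llm(f"{context}\nList the paper's strengths.")
    weaknesses = llm(f"{context}\nList the paper's weaknesses.")

    # Stage 4: the conclusion is conditioned on everything produced so far.
    conclusion = llm(
        f"{context}\nStrengths:\n{strengths}\nWeaknesses:\n{weaknesses}\n"
        "Write the review conclusion."
    )
    return StructuredReview(summary, list(relevant_works), strengths, weaknesses, conclusion)


if __name__ == "__main__":
    # Stub model used only to show the call order; it does no real reasoning.
    def stub_llm(prompt: str) -> str:
        return f"<response to {len(prompt)}-character prompt>"

    print(review_paper("Paper text goes here.", ["Related work A"], stub_llm))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so the same staging could be exercised with a stub, as in the usage example, or with a real model.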

Authors

Xiaojin Gao

Jiacheng Ruan

Z. D. Zhang
