Abstract
We present a systematic analysis of module-level design choices in GraphRAG, a retrieval-augmented generation framework that integrates structured knowledge graphs into question answering. Focusing on triple extraction, community clustering, and report generation, we evaluate multiple strategies across two knowledge-intensive benchmarks. Our results show that high-quality triple extraction is critical, as the accuracy and coverage of the resulting knowledge graph can become a bottleneck for downstream reasoning. We also find that the granularity of fundamental knowledge units, as determined by community clustering, has a significant impact on downstream performance: Achieving a balance between factual detail and topical coherence within each unit is important to enable precise and comprehensive retrieval and to facilitate effective multi-hop reasoning. In addition, simple template-based reporting outperforms LLM-based summarization in both accuracy and efficiency. These findings provide practical guidance for the structure-aware design of retrieval-augmented systems.
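As a rough illustration of the three modules the abstract names, the sketch below groups a handful of triples into communities and renders each community with a fixed template instead of an LLM summary. Everything in it is an assumption made for illustration: the sample triples, the use of connected components as a stand-in for community clustering, the function names `connected_components` and `template_report`, and the report layout. None of it is taken from the paper's implementation.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; not data from the paper.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Warsaw", "capital_of", "Poland"),
    ("Kyoto", "located_in", "Japan"),
]


def connected_components(triples):
    """Group triples into communities.

    Connected components (via union-find) stand in here for a real
    community clustering algorithm; this is an assumption of the sketch.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the components of the two entities in every triple.
    for s, _, o in triples:
        parent[find(s)] = find(o)

    # Collect triples under the root of their subject entity.
    groups = defaultdict(list)
    for t in triples:
        groups[find(t[0])].append(t)
    return list(groups.values())


def template_report(community):
    """Render one community as a plain-text report from a fixed template."""
    entities = sorted({e for s, _, o in community for e in (s, o)})
    lines = ["Entities: " + ", ".join(entities), "Facts:"]
    lines += [f"- {s} {r.replace('_', ' ')} {o}." for s, r, o in community]
    return "\n".join(lines)


REPORTS = [template_report(c) for c in connected_components(TRIPLES)]

if __name__ == "__main__":
    print("\n\n".join(REPORTS))
```

In this toy setup, the community size controls how much factual detail lands in a single retrievable report, which is the granularity trade-off the abstract describes. The template step needs no model call, which is one plausible reason for an efficiency advantage over LLM-based summarization.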
Nishida et al. studied this question.
www.synapsesocial.com/papers/69fd7f0dbfa21ec5bbf0777d — DOI: https://doi.org/10.1162/tacl.a.615
Noriki Nishida
Rumana Ferdous Munne
Shanshan Liu
Transactions of the Association for Computational Linguistics
Kyoto University
RIKEN
RIKEN Nishina Center