What question did this study set out to answer?

This study asks how to close the domain knowledge gaps that general large language models show in historical text analysis, and proposes a new framework, GraphRAG, to address them.

March 3, 2026

Research on graph-retrieval augmented generation based on historical text knowledge graphs

Key Points

Developed the GraphRAG framework, combining chain-of-thought prompting, self-instruction generation, and process supervision
Created a character relationship dataset for “The First Four Histories” with minimal manual annotation
Ran experiments with multiple models, including domain-specific ones, using Chinese input
Achieved an F1 score of 0.68 in relation extraction with the domain-specific model Xunzi-Qwen1.5-14B
Integrating the DeepSeek-R1 model with GraphRAG raised its F1 score by 0.11 (0.08 → 0.19) on the C-CLUE dataset
Reduced model “hallucinations” and improved the interpretability of outputs

Abstract

This article addresses the domain knowledge gaps of general large language models in historical text analysis, in the context of computational humanities and AIGC technology. We propose the GraphRAG framework, which combines chain-of-thought prompting, self-instruction generation, and process supervision to create a character relationship dataset for “The First Four Histories” with minimal manual annotation. This dataset supports automated historical knowledge extraction and reduces labor costs. In the graph-augmented generation phase, we introduce a collaborative mechanism between knowledge graphs and retrieval-augmented generation that improves the alignment of general models with historical knowledge. Experiments show that the domain-specific model Xunzi-Qwen1.5-14B, with Simplified Chinese input and chain-of-thought prompting, achieves the best performance in relation extraction (F1 = 0.68). The DeepSeek-R1 model integrated with GraphRAG achieves an absolute F1 increase of 0.11 (0.08 → 0.19) on the open-domain C-CLUE relation extraction dataset, surpassing the F1 of Xunzi-Qwen1.5-14B on that dataset (0.12), while alleviating “hallucinations” and improving interpretability. The framework offers a low-resource solution for classical text knowledge extraction, advancing historical knowledge services and humanities research.
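To make the idea of pairing a knowledge graph with retrieval-augmented generation concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy set of (head, relation, tail) triples, a naive retrieval step based on entity names appearing in the question, and a plain-text prompt template; the names `TRIPLES`, `build_index`, `retrieve`, and `build_prompt` are hypothetical, and the triples are illustrative examples rather than entries from the paper's dataset.

```python
from collections import defaultdict

Triple = tuple[str, str, str]

# Illustrative triples only (assumption): not taken from the paper's dataset.
TRIPLES: list[Triple] = [
    ("Liu Bang", "founder_of", "Han dynasty"),
    ("Liu Bang", "spouse_of", "Empress Lü"),
    ("Xiao He", "chancellor_of", "Liu Bang"),
    ("Sima Qian", "author_of", "Records of the Grand Historian"),
]


def build_index(triples: list[Triple]) -> dict[str, list[Triple]]:
    """Index every triple under both its head and its tail entity."""
    index: dict[str, list[Triple]] = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        index[tail].append((head, relation, tail))
    return index


def retrieve(question: str, index: dict[str, list[Triple]],
             max_triples: int = 5) -> list[Triple]:
    """Return triples whose head or tail entity is named in the question.

    Assumption: exact substring matching stands in for real entity linking.
    """
    found: list[Triple] = []
    for entity, triples in index.items():
        if entity in question:
            for triple in triples:
                if triple not in found:
                    found.append(triple)
    return found[:max_triples]


def build_prompt(question: str, triples: list[Triple]) -> str:
    """Prepend retrieved graph facts to the question as grounding context."""
    if not triples:
        return f"Question: {question}\nAnswer:"
    facts = "\n".join(f"- {h} {r} {t}" for h, r, t in triples)
    return (
        "Answer using only the facts below; say so if they are insufficient.\n"
        f"Facts:\n{facts}\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    index = build_index(TRIPLES)
    question = "Who was the spouse of Liu Bang?"
    print(build_prompt(question, retrieve(question, index)))
```

The resulting prompt would then be passed to a language model. Restricting the model to retrieved facts is one plausible way such a setup can curb hallucinations and make answers traceable to specific triples.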

Authors

Fan Yang

Qi Zhang

Wenqian Xing

Journals

Digital Scholarship in the Humanities

Institutions

Nanjing Agricultural University

Shanxi University

Shanxi University of Finance and Economics
