This preprint presents MaxEntRAG, a Maximum Entropy-based retrieval framework for Graph-Augmented Retrieval-Augmented Generation. The paper studies whether RAG systems require dense, LLM-generated knowledge graphs, or whether sparse, source-anchored retrieval structures can preserve enough relational signal for effective multi-hop and domain-specific retrieval. MaxEntRAG replaces exhaustive graph extraction with entropy-driven anchor selection. High-information lexical anchors are linked directly back to source spans, creating a compact transitive retrieval structure without using an LLM for indexing or graph construction. The method is designed to reduce graph density, indexing cost, and query latency while preserving source-grounded evidence paths. The paper introduces and studies the Density Paradox: the observation that increasing graph density can improve retrieval only up to a point, after which additional semantic edges introduce topological noise and reduce retrieval precision. Experiments are reported on GraphRAG-Bench, HotpotQA, and MuSiQue, with comparisons against representative graph-based retrieval baselines. This upload is a preprint version of the manuscript. It has been submitted to a conference but has not yet been peer reviewed.
Gavara Haranadh studied this question.
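
To make the idea of entropy-driven anchor selection more concrete, the following is a minimal illustrative sketch, not the paper's implementation. The abstract does not specify the scoring function, so the sketch assumes a simple stand-in: each term is scored by its surprisal, -log2(df/N), over the source spans, and only terms that occur in at least two spans are kept so that an anchor can link spans together. The names `select_anchors`, `retrieve`, `STOPWORDS`, `top_k`, `min_df`, and `hops` are all hypothetical, and the toy corpus is invented for illustration.

```python
import math
import re
from collections import defaultdict

# Hypothetical stopword list; an assumption of this sketch, not part of the paper.
STOPWORDS = {"the", "in", "is", "after"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens, dropping stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def select_anchors(spans, top_k=3, min_df=2):
    """Pick high-information terms and link each one back to its source spans.

    Assumed scoring: surprisal -log2(df / N), where df is the number of spans
    containing the term and N is the total number of spans.
    """
    n = len(spans)
    postings = defaultdict(set)
    for span_id, text in enumerate(spans):
        for tok in tokenize(text):
            postings[tok].add(span_id)
    scored = [
        (-math.log2(len(ids) / n), term)
        for term, ids in postings.items()
        if min_df <= len(ids) < n  # must link spans, but not appear in all of them
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # highest score first, then A-Z
    return {term: sorted(postings[term]) for _, term in scored[:top_k]}


def retrieve(query, spans, anchors, hops=2):
    """Follow anchor -> span -> anchor links transitively for a fixed number of hops."""
    frontier = {t for t in tokenize(query) if t in anchors}
    seen_terms, hit = set(), set()
    for _ in range(hops):
        new_spans = set()
        for term in frontier - seen_terms:
            new_spans.update(anchors[term])
        seen_terms |= frontier
        new_spans -= hit
        hit |= new_spans
        # Anchors found in newly reached spans become the next frontier.
        frontier = {t for sid in new_spans for t in tokenize(spans[sid]) if t in anchors}
    return sorted(hit)


# Toy corpus, invented for illustration only.
spans = [
    "Marie Curie discovered polonium",
    "Polonium is named after Poland",
    "Poland joined the European Union in 2004",
    "The Eiffel Tower stands in Paris",
    "Paris hosted the 1900 Olympics",
]
anchors = select_anchors(spans)
print(anchors)
print(retrieve("Which country is polonium named after?", spans, anchors, hops=2))
```

In this toy setting the index holds three anchors rather than a dense entity graph, and a two-hop query reaches the span about Poland through the shared anchor in the intermediate span, while the unrelated Paris spans are never visited. Every retrieved item is a source span, which is the sense in which the evidence path stays source-grounded.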