What type of study is this?

This is a quantitative study.

October 2, 2025

Open Access

CORE-RAG: Lossless Compression for Retrieval-Augmented LLMs via Reinforcement Learning

Key Points

CORE achieves a high compression ratio of 3% while preventing performance degradation.
The approach shows an improvement of 3.3 points in average exact match score compared to no compression.
CORE uses reinforcement learning to optimize context compression without predefined labels.
Extensive experiments across four datasets validate CORE's effectiveness in enhancing task performance.

Abstract

Retrieval-Augmented Generation (RAG) has emerged as a promising approach to enhance the timeliness of knowledge updates and the factual accuracy of responses in large language models. However, incorporating a large number of retrieved documents significantly increases input length, leading to higher computational costs. Existing approaches to document compression tailored for RAG often degrade task performance, as they typically rely on predefined heuristics in the absence of clear compression guidelines. These heuristics fail to ensure that the compressed content effectively supports downstream tasks. To address these limitations, we propose CORE, a novel method for lossless context compression in RAG. CORE is optimized end-to-end and does not depend on predefined compression labels, which are often impractical to obtain. Instead, it leverages downstream task performance as a feedback signal, iteratively refining the compression policy to enhance task effectiveness. Extensive experiments across four datasets demonstrate the effectiveness of CORE. With a high compression ratio of 3%, CORE not only prevents performance degradation compared to including full documents (i.e., without compression) but also improves the average Exact Match (EM) score by 3.3 points. The code for CORE will be released soon.
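The two quantities the abstract reports, Exact Match and compression ratio, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: that EM uses the SQuAD-style answer normalization common in open-domain QA, and that a "compression ratio of 3%" means the compressed context is 3% of the original length. The function names (`normalize_answer`, `exact_match`, `compression_ratio`, `task_reward`) are hypothetical and are not taken from the CORE code, which has not been released.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the paper may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))


def compression_ratio(compressed_tokens: int, original_tokens: int) -> float:
    """Compressed length as a fraction of the original length (assumed definition)."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return compressed_tokens / original_tokens


def task_reward(prediction: str, gold_answers: list[str]) -> float:
    """Hypothetical reward: score a compressed context by the answer it leads to.

    This only illustrates using downstream task performance as feedback;
    it is not the reward defined in the paper.
    """
    return exact_match(prediction, gold_answers)
```

Under these assumptions, a compressor that shrinks 1,000 retrieved tokens to 30 has a ratio of 0.03, and it earns a reward of 1.0 only when the answer generated from those 30 tokens matches a gold answer after normalization. No hand-written compression labels are involved, which is the property the abstract emphasizes.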

Authors

Ziqiang Cui

Yuanchi Weng

Xing Tang