What question did this study set out to answer?

The research aims to enhance the efficiency of range queries in log-structured merge-trees using an innovative caching approach.

April 10, 2026 · Open Access

Improving Range Scan Performance in LSM-trees with Group Caching

Key Points

Introduced Group Cache using key-value groups as caching units
Developed a size-aware policy to prioritize small, high-utility KV groups
Conducted theoretical analysis and extensive experiments in RocksDB
Achieved up to 3× faster query performance under the same memory budget
Reduced memory usage by 75% while maintaining similar query performance
Demonstrated superior performance compared to traditional caching methods

Abstract

Log-structured merge-trees (LSM-trees) are widely used in modern key-value stores, but their multi-level structure reduces lookup efficiency, especially for range scans. Existing caching solutions, like block caches or full query caches, are memory-inefficient because they fail to exploit a critical asymmetry: eliminating an I/O from upper LSM-tree levels requires caching far fewer key-value pairs (KVs) than from lower levels. To address this, we introduce Group Cache, which uses KV Groups, the minimal set of KVs within a block for a specific query, as its fundamental caching unit. By employing a size-aware policy that prioritizes small, high-utility KV Groups, Group Cache maximizes I/O savings per unit of memory. We also address practical challenges like compaction management, intra-group hotness difference and scalability. Our theoretical analysis and extensive experiments in RocksDB demonstrate that Group Cache significantly outperforms traditional caching methods, achieving up to 3× faster query performance with the same memory budget, or achieving similar performance while using 75% less space.
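
To make the size-aware idea concrete, here is a minimal sketch of a cache that evicts the group with the lowest hits-per-KV ratio first. It is not the authors' implementation: the class names, the use of KV count as the size measure, the assumption that each hit saves exactly one block read, and the greedy eviction loop are all illustrative assumptions, and the sketch ignores compaction, intra-group hotness, and concurrency.

```python
from dataclasses import dataclass


@dataclass
class KVGroup:
    """Hypothetical cache entry: the KVs one block contributes to one scan."""
    kvs: list          # list of (key, value) pairs
    hits: int = 1      # assumption: each hit saves one block read

    @property
    def size(self) -> int:
        # Assumption: size is measured in number of KVs, not bytes.
        return len(self.kvs)

    @property
    def density(self) -> float:
        # Saved I/Os per unit of memory; small, hot groups score highest.
        return self.hits / self.size


class SizeAwareGroupCache:
    """Illustrative size-aware cache; capacity is a total KV count."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.groups = {}

    def get(self, group_id):
        group = self.groups.get(group_id)
        if group is None:
            return None
        group.hits += 1
        return group.kvs

    def put(self, group_id, kvs):
        size = len(kvs)
        # Skip empty groups, groups that can never fit, and duplicates.
        if size == 0 or size > self.capacity or group_id in self.groups:
            return
        # Greedily evict the lowest-density groups until the new one fits.
        while self.used + size > self.capacity:
            victim = min(self.groups, key=lambda g: self.groups[g].density)
            self.used -= self.groups.pop(victim).size
        self.groups[group_id] = KVGroup(list(kvs))
        self.used += size
```

With a capacity of 10 KVs, a 2-KV group and an 8-KV group that have each been hit equally often have densities in a 4:1 ratio, so inserting a third group evicts the 8-KV one. This mirrors the asymmetry the abstract describes, where a small group buys the same I/O saving for far less memory.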

Cite this study

Wang et al. studied this question.

www.synapsesocial.com/papers/69d893c96c1944d70ce04baa — DOI: https://doi.org/10.1145/3786661

Also consider

Synapse has enriched 5 closely related papers on similar questions. Consider them for comparative context:

Structural Designs Meet Optimality: Exploring Optimized LSM-tree Structures in a Colossal Configuration Space · 2024 · 10 citations
Grafite: Taming Adversarial Queries with Optimal Range Filters · 2024 · 9 citations
Mnemosyne: Dynamic Workload-Aware BF Tuning via Accurate Statistics in LSM trees · 2025 · 3 citations
Large deviations for sums of partly dependent random variables · 2004 · 121 citations
GeckoFTL · 2016 · 23 citations

Authors

Hengrui Wang

Jiaoyi Zhang

Jiansheng Qiu

Journals

Proceedings of the ACM on Management of Data

Institutions

Tsinghua University

East China Normal University
