Similarity search in metric spaces is a fundamental problem in data management with many applications. While numerous indices have been proposed to support similarity search, most existing approaches rely on pivot-based strategies that suffer from critical limitations. Traditional single-pivot methods offer limited pruning power, while multi-pivot techniques often become inefficient as data evolves, since pivot updates incur substantial computational overhead. In this paper, we propose LM-Tree (short for Learned M-Tree), a hybrid learned index that combines pivots and learning models with M-Tree to address the above problems. LM-Tree's key innovation lies in its self-adaptive node architecture, where each node dynamically selects an appropriate number of pivots and incorporates lightweight learning models to enhance pruning efficiency. This design enables each node to maintain a relatively large number of child nodes (or objects for leaf nodes) while consistently delivering strong pruning performance, thereby enabling LM-Tree to use a small number of nodes to index objects in metric spaces. Furthermore, we develop an efficient maintenance algorithm that handles dynamic updates, including pivot adjustments, model reforms, node splits, and merges with low overhead. Extensive experiments on both real and synthetic datasets demonstrate that LM-Tree significantly outperforms state-of-the-art methods.
Building similarity graph...
Analyzing shared references across papers
Loading...
Wang et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69d894526c1944d70ce0536e — DOI: https://doi.org/10.1145/3786665
Yaqi Wang
Bin Wang
Rui Zhu
Proceedings of the ACM on Management of Data
Northeastern University
Shenyang Aerospace University
Building similarity graph...
Analyzing shared references across papers
Loading...