Enumerable filters are the gold-standard for full-featured filters: they support insertions, deletions, merging; they can support associated values, such as counts; and like traditional filters they support queries with a small probability of false positives. When designing a system that uses filters, high-performance enumerable filters are required to simplify system design and improve overall performance, compared to systems that must work around the limitations of traditional filters that are not full featured. The vector quotient filter (VQF) is the state of the art enumerable filter. For limited-functionality filters, blocked Bloom filters (BBF) and the prefix filter (PF) are state of the art and offer tradeoffs. Both support insertions and queries but not deletions or merging. BBFs are faster for insertions and queries but use more space than PFs and VQFs. For small-space filters, the state-of-the-art suggests a tradeoff: PFs are faster than VQFs but VQFs are enumerable, which makes them a favorable candidate to be integrated in applications. Given these tradeoffs, we are left with the following question: Do we need to give up features in order to achieve the highest performance? In this paper, we present the breadcrumb filter (BCF), a full-featured enumerable filter. For insertions, the BCF is up to 34% faster than VQF, 3.2x faster than cuckoo filter, 19.6% faster than PF. For queries, the BCF achieves competitive performance, outperforming the VQF while matching the cuckoo filter. At the same time, it achieves higher space efficiency than VQF, prefix, cuckoo filter and BBF. We conclude that the choice of filter is now simplified: if additional features beyond insertions, such as deletions and counting, are required, or if minimizing space is crucial, the BCF offers the best performance. On the other hand, if only insertions are needed and space is not constrained, the BBF is the right choice.
Building similarity graph...
Analyzing shared references across papers
Loading...
Krapivin et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69d894ce6c1944d70ce05b0d — DOI: https://doi.org/10.1145/3786629
Andrew Krapivin
Aaditya Rangarajan
Alex Conway
Proceedings of the ACM on Management of Data
Cornell University
New York University
University of Utah
Building similarity graph...
Analyzing shared references across papers
Loading...