This paper introduces CorrBound , a cardinality estimator that exploits the correlations between the join columns across and within relations. The state-of-the-art estimators do not observe such correlations and can therefore yield poor estimates. By explicitly accounting for correlations, CorrBound can significantly improve the accuracy of cardinality estimation. CorrBound supports multi-join queries (acyclic or cyclic) with conjunctions of equality and range predicates and group-by clauses. Its estimate is the optimal solution of a linear program whose constraints encode data statistics and Shannon inequalities. It uses a new information inequality that captures the intra- and inter-relation correlations of join columns and uses the generalized inner product of degree vectors of join columns. This inequality also captures the ℓ p -norm inequality used by the state-of-the-art estimator LpBound. The inequality comes with a high-dimensionality challenge: managing the cardinality tensor defined by the generalized inner products of many large degree vectors. To keep the estimation time feasible, CorrBound uses two compression techniques that retain the accuracy of the inner products: sketches of degree vectors and low-rank decomposition of the cardinality tensor. We experimentally evaluate CorrBound against traditional, pessimistic, and machine learning-based estimators on the JOBlight and STATS, and subgraph matching benchmarks. Our main finding is that CorrBound can be more accurate than the state of the art while maintaining a low estimation time.
Building similarity graph...
Analyzing shared references across papers
Loading...
Mayer et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69d894326c1944d70ce051fb — DOI: https://doi.org/10.1145/3786633
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:
Christoph Mayer
Haozhe Zhang
Mahmoud Abo Khamis
Proceedings of the ACM on Management of Data
University of Washington
University of Zurich
Seattle University
Building similarity graph...
Analyzing shared references across papers
Loading...