What question did this study set out to answer?

The aim is to enhance SQL analytics by enabling operations on compressed data directly without decompression.

March 21, 2026

GPU Acceleration of SQL Analytics on Compressed Data

Key Points

Aimed to speed up SQL analytics by operating directly on compressed data, avoiding decompression.
Developed new methods for running SQL queries directly on compressed data.
Used light-weight compression schemes such as Run-Length Encoding (RLE), index encoding, bit-width reduction, and dictionary encoding.
Processed multiple RLE columns together without decompression.
Leveraged PyTorch tensor operations for device portability.
Achieved speedups of an order of magnitude compared to state-of-the-art commercial CPU-only analytics systems.
Demonstrated effectiveness on real-world queries over a production dataset.
Showed capability to handle datasets too large for GPU memory when uncompressed.
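
To make the idea of querying RLE data without decompression concrete, here is a minimal sketch in plain Python. It is an illustration only, not the paper's implementation (which is described as built on PyTorch tensor operations). The helper names `rle_encode`, `rle_filtered_sum`, and `rle_filtered_count` are hypothetical, and the column is assumed to hold integers stored as `(value, run_length)` pairs.

```python
from typing import List, Tuple

# Assumed representation: one (value, run_length) pair per run.
Run = Tuple[int, int]


def rle_encode(values: List[int]) -> List[Run]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs: List[Run] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_filtered_sum(runs: List[Run], threshold: int) -> int:
    """SELECT SUM(x) WHERE x > threshold, evaluated once per run."""
    # A run of n copies of v contributes v * n, so no row is materialized.
    return sum(v * n for v, n in runs if v > threshold)


def rle_filtered_count(runs: List[Run], threshold: int) -> int:
    """SELECT COUNT(*) WHERE x > threshold, evaluated once per run."""
    return sum(n for v, n in runs if v > threshold)


column = [5, 5, 5, 2, 2, 9, 9, 9, 9]
runs = rle_encode(column)
total = rle_filtered_sum(runs, threshold=4)
count = rle_filtered_count(runs, threshold=4)
```

The work done is proportional to the number of runs rather than the number of rows, which is why operating on the compressed form can pay off when runs are long.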

Abstract

GPUs are uniquely suited to accelerate (SQL) analytics workloads when datasets fit in the GPU High Bandwidth Memory (HBM). Unfortunately, GPU HBMs remain typically small when compared with lower-bandwidth CPU main memory. Current solutions to accelerate queries on large datasets include multi-GPU execution, processing smaller data batches, and hybrid execution with a connected device (e.g., CPUs). Unfortunately, these approaches are exposed to the limitations of lower main memory and host-to-device interconnect bandwidths, introduce additional I/O overheads, or incur higher costs. This is a substantial problem when trying to scale adoption of GPUs on larger datasets. Data compression can alleviate this bottleneck, but to avoid paying for costly decompression/decoding, an ideal solution must include computation primitives to operate directly on data in compressed form. This is the focus of our paper: a set of new methods for running queries directly on light-weight compressed data using schemes such as Run-Length Encoding (RLE), index encoding, bit-width reductions, and dictionary encoding. Our novelty includes operating on multiple RLE columns without decompression, handling heterogeneous column encodings, and leveraging PyTorch tensor operations for portability across devices. Experimental evaluations show speedups of an order of magnitude compared to state-of-the-art commercial CPU-only analytics systems, for real-world queries on a production dataset that would not fit into GPU memory uncompressed. This work paves the road for GPU adoption in a much broader set of use cases, and it is complementary to most other scale-out or fallback mechanisms.
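
The abstract mentions operating on multiple RLE columns without decompression. One way to picture this, as a sketch under stated assumptions rather than the authors' algorithm, is to walk two RLE columns in lockstep and process each overlapping segment of runs once. The function name `rle_pair_filtered_sum` is hypothetical, both columns are assumed to describe the same number of rows, and plain Python is used here instead of the tensor operations the paper relies on.

```python
from typing import List, Tuple

# Assumed representation: one (value, run_length) pair per run.
KeyRun = Tuple[str, int]
ValRun = Tuple[int, int]


def rle_pair_filtered_sum(
    key_runs: List[KeyRun], val_runs: List[ValRun], key: str
) -> int:
    """SELECT SUM(val) WHERE key_col = key over two aligned RLE columns."""
    total = 0
    i = j = 0
    # Rows still unconsumed in the current run of each column.
    rem_k = key_runs[0][1] if key_runs else 0
    rem_v = val_runs[0][1] if val_runs else 0
    while i < len(key_runs) and j < len(val_runs):
        # Rows on which both current runs are constant.
        overlap = min(rem_k, rem_v)
        if key_runs[i][0] == key:
            total += val_runs[j][0] * overlap
        rem_k -= overlap
        rem_v -= overlap
        if rem_k == 0:
            i += 1
            if i < len(key_runs):
                rem_k = key_runs[i][1]
        if rem_v == 0:
            j += 1
            if j < len(val_runs):
                rem_v = val_runs[j][1]
    return total


# Rows: keys a a a a b b b a a, values 10 10 20 20 20 20 20 30 30
key_runs = [("a", 4), ("b", 3), ("a", 2)]
val_runs = [(10, 2), (20, 5), (30, 2)]
result = rle_pair_filtered_sum(key_runs, val_runs, "a")
```

The loop takes at most one step per run boundary in either column, so its cost is bounded by the combined number of runs, not by the row count.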

Authors

Zezhou Huang

Krystian Sakowski

Hans Lehnert

Journals

Proceedings of the VLDB Endowment

Institutions

Microsoft Research (United Kingdom)
