Research article: KV-Cache Compression Benchmarks — Quantization vs Eviction vs Pruning
Oleh Ivchenko studied this question.