Published on December 4
Unifying KV Cache Compression for Large Language Models with LeanKV
cs.LG
cs.DC
Released Date: December 4, 2024
Authors: Yanqi Zhang¹, Yuwei Hu¹, Runyuan Zhao¹, John C. S. Lui, Haibo Chen²
Aff.: ¹Huawei; ²Shanghai Jiao Tong University
