Published on December 4

Unifying KV Cache Compression for Large Language Models with LeanKV

cs.LG
cs.DC

Release Date: December 4, 2024

Authors: Yanqi Zhang¹, Yuwei Hu¹, Runyuan Zhao¹, John C. S. Lui, Haibo Chen²

Affiliations: ¹Huawei; ²Shanghai Jiao Tong University

arXiv: http://arxiv.org/pdf/2412.03131v1