notesum.ai

Published on December 4

ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression

cs.LG
cs.AI
cs.PF

Release Date: December 4, 2024

Authors: Guangda Liu¹, Chengwei Li, Jieru Zhao, Chenqi Zhang, Minyi Guo

Aff.: ¹Shanghai Jiao Tong University

arXiv: http://arxiv.org/pdf/2412.03213v1