notesum.ai

Published on November 28

Marconi: Prefix Caching for the Era of Hybrid LLMs

cs.DC
cs.AI
cs.LG

Release Date: November 28, 2024

Authors: Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Ravi Netravali, Yida Wang

arXiv: http://arxiv.org/pdf/2411.19379v1