Marconi: Prefix Caching for the Era of Hybrid LLMs
cs.DC
cs.AI
cs.LG
Released Date: November 28, 2024
Authors: Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Ravi Netravali, Yida Wang

| | Attention | SSM |
|---|---|---|
| Computational Complexity | | |
| Inference-Time Memory | | |