notesum.ai

Published on November 26

Star Attention: Efficient LLM Inference over Long Sequences

cs.CL
cs.AI
cs.LG

Released Date: November 26, 2024

Authors: Shantanu Acharya¹, Fei Jia¹, Boris Ginsburg¹

Aff.: ¹NVIDIA

arXiv: http://arxiv.org/abs/2411.17116v1