Published on November 26

Categories: cs.CL, cs.AI, cs.LG

Released Date: November 26, 2024

Authors: Shantanu Acharya¹, Fei Jia¹, Boris Ginsburg¹

Aff.: ¹NVIDIA

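Model	Seq. Len. (K)	Block Size (K)	Ring-Attn Acc. (%)	Star-Attn $\Delta$ Acc.	Star-Attn $\Delta$ Speedup
Llama-3-8B-Instruct, 1048K (Gradient.ai, 2024)	16	4	86.12	+2.47%	1.1x
Llama-3-8B-Instruct, 1048K (Gradient.ai, 2024)	32	8	82.52	+1.54%	1.2x
Llama-3-8B-Instruct, 1048K (Gradient.ai, 2024)	64	16	79.05	+1.28%	1.8x
Llama-3-8B-Instruct, 1048K (Gradient.ai, 2024)	128	32	77.39	+1.23%	2.7x
Llama-3.1-70B-Instruct, 128K (Meta-AI, 2024)	16	4	95.09	-2.85%	1.7x
Llama-3.1-70B-Instruct, 128K (Meta-AI, 2024)	32	8	94.61	-2.70%	2.0x
Llama-3.1-70B-Instruct, 128K (Meta-AI, 2024)	64	16	88.54	-1.63%	4.7x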