Communication Compression for Tensor Parallel LLM Inference
Categories: cs.LG, cs.AI, cs.CL
Published: November 14, 2024
Authors: Jan Hansen-Palmus¹, Michael Truong-Le, Oliver Hausdörfer², Alok Verma¹
Affiliations: ¹Recogni; ²Technical University of Munich
| Model | Sub-variant | Value Dtype | Block Size | Bits | FP16 Perplexity | Perplexity Increase |
|---|---|---|---|---|---|---|
| Llama 3.1 | 8B | FP4 | 8 | 4.6 | 7.22 | 3.22% |
| Llama 3.1 | 70B | FP5 | 32 | 5.2 | 3.86 | 1.68% |
| Gemma 2 | 2B | FP5 | 32 | 5.2 | 14.27 | 1.39% |
| Gemma 2 | 9B | FP4 | 32 | 4.2 | 10.40 | 1.83% |
| Mistral | 7B | FP4 | 32 | 4.2 | 5.23 | 1.18% |
| Mistral | 22B | FP4 | 8 | 4.6 | 4.02 | 1.62% |
| Mistral | 123B | FP5 | 32 | 5.2 | 2.65 | 0.48% |
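The "Bits" column is consistent with block-wise quantization in which each block of values shares one scale field, so the per-value cost is the value width plus the amortized shared scale. A minimal sketch of that accounting, assuming a 5-bit shared scale (the scale width is an assumption, not stated in the table):

```python
def effective_bits(value_bits: int, block_size: int, scale_bits: int = 5) -> float:
    """Per-value storage cost under block quantization: each value stores
    its own mantissa/exponent bits, and one scale field of `scale_bits`
    is shared across `block_size` values (5-bit scale is an assumption)."""
    return value_bits + scale_bits / block_size

# Matches the "Bits" column above when rounded to one decimal:
print(round(effective_bits(4, 8), 1))   # FP4, block size 8  -> 4.6
print(round(effective_bits(5, 32), 1))  # FP5, block size 32 -> 5.2
print(round(effective_bits(4, 32), 1))  # FP4, block size 32 -> 4.2
```

Under this reading, larger blocks amortize the shared scale more aggressively (4.2 vs. 4.6 bits for FP4), trading coarser scaling granularity for a smaller payload per communicated activation.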