The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
Categories: cs.CL, cs.AI
Release Date: December 5, 2024
Authors: Fredrik Carlsson¹, Fangyu Liu², Daniel Ward¹, Murathan Kurfali¹, Joakim Nivre³
Affiliations: ¹RISE Research Institutes of Sweden; ²Google DeepMind; ³Uppsala University

| Model | Context PPL | Pref. @128 | Pref. @256 | TTR @128 | TTR @256 |
|---|---|---|---|---|---|
| Original Texts | – | – | – | 73.5 | 73.8 |
| **Strong Baselines** | | | | | |
| TinyLlama (1.1 B), Top-P | 245 | 31.8 | 21.1 | 38.8 | 28.2 |
| DeepSeek (7 B), Top-P | 34 | 50.0 | 35.6 | 58.2 | 49.7 |
| Llama 3.1 (8 B), Top-P | 36 | 50.5 | 38.5 | 62.1 | 57.0 |
| **Original Models** | | | | | |
| TinyLlama (1.1 B) | 245 | 12.0 | 4.9 | 25.1 | 17.0 |
| DeepSeek (7 B) | 34 | 37.7 | 17.1 | 45.6 | 32.2 |
| Llama 3.1 (8 B) | 36 | 35.0 | 25.6 | 48.5 | 34.5 |
| Llama 3.1 (70 B) | 29 | 48.7 | 34.4 | 56.4 | 50.6 |
| **Hyperfitted Models** | | | | | |
| TinyLlama (1.1 B) | 467 | 44.6 | 34.3 | 64.5 | 60.0 |
| DeepSeek (7 B) | 545 | 49.4 | 45.2 | 62.3 | 60.5 |
| Llama 3.1 (8 B) | 389 | 50.1 | 42.9 | 64.5 | 62.6 |
| Llama 3.1 (70 B) | 255 | 55.9 | 52.4 | 62.0 | 61.6 |
| **Hyperfitted Models + Citation Blocking** | | | | | |
| TinyLlama (1.1 B) | 467 | 45.2 | 35.0 | 64.8 | 60.3 |
| DeepSeek (7 B) | 545 | 47.5 | 44.1 | 62.5 | 60.6 |
| Llama 3.1 (8 B) | 389 | 47.6 | 41.2 | 64.4 | 63.3 |
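As a reading aid for the TTR columns: type-token ratio measures lexical diversity as the fraction of distinct tokens among all generated tokens. A minimal sketch is below; note that whitespace tokenization and the 0–100 scale are assumptions for illustration, as the paper may tokenize and normalize differently.

```python
def type_token_ratio(tokens):
    """Type-token ratio: distinct tokens / total tokens, scaled to 0-100."""
    if not tokens:
        return 0.0
    return 100.0 * len(set(tokens)) / len(tokens)

# "the" appears twice, so 5 distinct tokens out of 6 total.
sample = "the cat sat on the mat".split()
print(round(type_token_ratio(sample), 1))  # → 83.3
```

Higher TTR indicates less repetition; the table shows hyperfitted models closing much of the gap to the original texts' TTR of roughly 73–74.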