CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
cs.CL
cs.AI
Released Date: December 10, 2024
Authors: Dongfang Li¹, Zetian Sun¹, Xinshuo Hu¹, Baotian Hu¹, Min Zhang¹
Aff.: ¹Harbin Institute of Technology (Shenzhen)

| Dataset | Method | DistilGPT2 EM | DistilGPT2 F1 | GPT2-Large EM | GPT2-Large F1 | GPT2-XL EM | GPT2-XL F1 | Llama-2 EM | Llama-2 F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| StreamingQA | Uniform | 01.62 | 03.76 | 04.74 | 07.00 | 05.11 | 07.48 | 12.43 | 13.54 |
| StreamingQA | Salient Spans | 01.44 | 04.67 | 04.86 | 08.54 | 05.40 | 09.42 | 13.33 | 18.97 |
| StreamingQA | CaMeLS | 01.62 | 05.79 | 05.35 | 10.60 | 06.55 | 11.67 | - | - |
| StreamingQA | MAC | 05.59 | 10.18 | 07.25 | 13.31 | 08.99 | 15.38 | 14.29 | 21.79 |
| StreamingQA | CMT (ours) | 06.43 | 12.32 | 07.32 | 13.43 | 09.61 | 16.48 | 18.36 | 25.98 |
| SQuAD | Uniform | 01.24 | 02.54 | 03.64 | 04.97 | 06.10 | 06.78 | 13.25 | 17.01 |
| SQuAD | Salient Spans | 01.03 | 02.47 | 04.03 | 06.48 | 04.55 | 06.74 | 13.74 | 18.66 |
| SQuAD | CaMeLS | 01.47 | 03.08 | 04.97 | 08.63 | 06.70 | 10.15 | - | - |
| SQuAD | MAC | 02.01 | 06.85 | 06.43 | 11.42 | 07.10 | 12.55 | 15.07 | 21.14 |
| SQuAD | CMT (ours) | 03.12 | 07.59 | 07.15 | 12.45 | 09.81 | 12.85 | 19.54 | 25.50 |
| ArchivalQA | Uniform | 04.86 | 04.08 | 07.66 | 08.71 | 08.61 | 10.78 | 18.53 | 21.35 |
| ArchivalQA | Salient Spans | 04.52 | 03.76 | 09.75 | 11.19 | 11.81 | 14.11 | 18.97 | 22.75 |
| ArchivalQA | CaMeLS | 04.62 | 06.19 | 09.92 | 12.41 | 13.87 | 15.74 | - | - |
| ArchivalQA | MAC | 07.55 | 10.58 | 11.84 | 15.26 | 14.01 | 17.12 | 20.12 | 23.90 |
| ArchivalQA | CMT (ours) | 08.15 | 11.03 | 12.28 | 16.12 | 14.55 | 18.01 | 21.73 | 25.40 |
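
The EM columns above report exact-match accuracy as a percentage. This page does not say which scoring script was used, so the following is only a minimal sketch. It assumes the common SQuAD-style normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). The names `normalize_answer`, `exact_match`, and `em_score` are illustrative and do not come from the paper.

```python
import re
import string
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the source does not confirm this.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: Sequence[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(ref) for ref in references)


def em_score(predictions: Sequence[str],
             references: Sequence[Sequence[str]]) -> float:
    """Percentage of predictions that exactly match one of their references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not data from the paper.
    preds = ["The Eiffel Tower.", "Paris"]
    refs = [["Eiffel Tower"], ["London"]]
    print(em_score(preds, refs))
```

With the toy inputs, the first prediction matches after normalization and the second does not, so one of two answers counts as correct.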