notesum.ai
Published on October 21
Does your LLM truly unlearn? An embarrassingly simple approach to recover unlearned knowledge
cs.CL
cs.AI
Released Date: October 21, 2024
Authors: Zhiwei Zhang¹, Fali Wang¹, Xiaomin Li², Zongyu Wu¹, Xianfeng Tang³, Hui Liu³, Qi He³, Wenpeng Yin¹, Suhang Wang¹
Affiliations: ¹The Pennsylvania State University; ²Harvard University; ³Amazon

| Method | NEWS M1 | NEWS M2 | NEWS M3 (→ 0) | NEWS M4 | BOOKS M1 | BOOKS M2 | BOOKS M3 (→ 0) | BOOKS M4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Target | 58.4 | 63.9 | -99.8 | 55.2 | 99.8 | 59.4 | -57.5 | 66.9 |
| Target + Quan. (8 bit) | 40.8 | 66.4 | -99.8 | 54.1 | 99.0 | 45.1 | -57.3 | 65.7 |
| Target + Quan. (4 bit) | 34.2 | 54.4 | -99.8 | 48.2 | 85.3 | 36.8 | -60.1 | 50.5 |
| Retrain | 20.8 | 33.1 | 0.0 | 55.0 | 14.3 | 28.9 | 0.0 | 74.5 |
| Retrain + Quan. (4 bit) | 18.5 | 36.0 | -2.2 | 46.5 | 13.6 | 24.1 | -3.2 | 62.0 |
| GA | 0.0 | 0.0 | 40.4 | 0.0 | 0.0 | 0.0 | -24.9 | 0.0 |
| GA + Quan. (8 bit) | 0.0 | 0.0 | 39.5 | 0.0 | 0.0 | 0.0 | -25.0 | 0.0 |
| GA + Quan. (4 bit) | 0.0 | 0.0 | 24.5 | 0.0 | 0.0 | 0.0 | -30.1 | 0.0 |
| GA_GDR | 0.0 | 28.9 | 87.1 | 34.2 | 0.0 | 2.9 | -56.5 | 44.2 |
| GA_GDR + Quan. (8 bit) | 0.0 | 26.9 | 93.5 | 33.6 | 0.8 | 3.7 | -52.4 | 43.7 |
| GA_GDR + Quan. (4 bit) | 25.0 | 50.1 | -99.1 | 47.7 | 17.9 | 33.7 | -35.2 | 51.9 |
| GA_KLR | 14.1 | 27.1 | -91.6 | 23.1 | 13.0 | 15.1 | -40.8 | 33.7 |
| GA_KLR + Quan. (8 bit) | 15.3 | 29.0 | -91.7 | 24.5 | 12.4 | 10.1 | -37.9 | 35.1 |
| GA_KLR + Quan. (4 bit) | 33.8 | 50.9 | -99.8 | 45.8 | 75.6 | 34.6 | -60.0 | 51.3 |
| NPO | 0.0 | 0.0 | 14.5 | 0.0 | 0.0 | 0.0 | -22.6 | 0.0 |
| NPO + Quan. (8 bit) | 0.0 | 0.0 | 15.0 | 0.0 | 0.0 | 0.0 | -22.8 | 0.0 |
| NPO + Quan. (4 bit) | 16.2 | 25.4 | -71.6 | 27.9 | 7.0 | 5.3 | -46.9 | 17.8 |
| NPO_GDR | 0.3 | 46.1 | 107.2 | 38.6 | 0.4 | 13.4 | -42.6 | 58.6 |
| NPO_GDR + Quan. (8 bit) | 0.1 | 44.2 | 106.3 | 37.0 | 0.9 | 14.0 | -60.2 | 50.5 |
| NPO_GDR + Quan. (4 bit) | 33.2 | 51.4 | -99.8 | 48.2 | 66.0 | 31.9 | -60.8 | 53.2 |
| NPO_KLR | 16.6 | 36.6 | -94.0 | 33.3 | 12.4 | 13.7 | -40.7 | 35.1 |
| NPO_KLR + Quan. (8 bit) | 17.0 | 37.2 | -93.7 | 29.5 | 11.7 | 11.2 | -37.2 | 23.4 |
| NPO_KLR + Quan. (4 bit) | 34.1 | 53.7 | -99.8 | 48.8 | 70.9 | 34.2 | -60.1 | 50.4 |
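
The comparison the table invites, each unlearned model against its 4-bit quantized counterpart, can be made explicit with a little arithmetic. The sketch below copies the NEWS M1 values from the rows above and computes how much M1 rises after 4-bit quantization. The names `news_m1`, `TARGET_4BIT_M1`, and `m1_increase` are hypothetical, introduced only for illustration. Reading a higher M1 as recovered knowledge is an assumption based on the paper's title, since the table itself does not define the metrics.

```python
# NEWS M1 scores copied from the table: (full precision, 4-bit quantized).
news_m1 = {
    "GA": (0.0, 0.0),
    "GA_GDR": (0.0, 25.0),
    "GA_KLR": (14.1, 33.8),
    "NPO": (0.0, 16.2),
    "NPO_GDR": (0.3, 33.2),
    "NPO_KLR": (16.6, 34.1),
}

# "Target + Quan. (4 bit)" row, NEWS M1, used as a reference point.
TARGET_4BIT_M1 = 34.2


def m1_increase(scores):
    """Return the M1 increase after 4-bit quantization for each method."""
    return {name: round(q4 - full, 1) for name, (full, q4) in scores.items()}


increase = m1_increase(news_m1)
for name, delta in increase.items():
    q4 = news_m1[name][1]
    print(f"{name:8s} +{delta:.1f}  (4-bit M1 = {q4:.1f}, target 4-bit = {TARGET_4BIT_M1})")
```

On these numbers, GA is the only method whose NEWS M1 does not move under 4-bit quantization, while the GDR and KLR variants land within about 10 points of the quantized target model.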