ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization
cs.LG
cs.AI
Released Date: November 12, 2024
Authors: Weibo Zhao, Yubin Shi, Xinyu Lyu, Wanchen Sui, Shen Li, Yong Li
Affiliation: Alibaba Group (all authors)
| Method | #W | #A | WikiText2 PPL | C4 PPL | PTB PPL | ARC-e Acc. | ARC-c Acc. | MMLU Acc. | HellaSwag Acc. | PIQA Acc. | Avg. Acc. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Qwen1.5-7B | 16 | 16 | 7.95 | 13.57 | 11.94 | 87.48 | 76.61 | 49.78 | 61.49 | 66.76 | 68.42 |
| LLM.int4() | 4 | 8 | 23.73 | 38.14 | 25.28 | 17.46 | 14.58 | 13.27 | 0.18 | 12.08 | 11.51 |
| SmoothQuant | 4 | 8 | 15.73 | 28.18 | 21.24 | 33.51 | 26.44 | 24.00 | 1.88 | 31.94 | 21.46 |
| SmoothQuant+ | 4 | 8 | 46.27 | 51.28 | 40.81 | 28.04 | 22.03 | 9.16 | 1.17 | 29.60 | 18.00 |
| LoRC | 4 | 8 | 16.78 | 27.13 | 19.93 | 76.19 | 66.44 | 21.56 | 43.77 | 45.43 | 50.68 |
| L2QER | 4 | 8 | 9.20 | 15.63 | 13.81 | 85.36 | 71.53 | 46.39 | 29.59 | 64.64 | 59.50 |
| ASER (w/o A.S.) | 4 | 8 | 9.19 | 15.59 | 13.69 | 81.83 | 68.81 | 43.89 | 45.44 | 61.10 | 60.21 |
| ASER (w/ A.S.) | 4 | 8 | 8.72 | 14.84 | 13.10 | 84.66 | 72.54 | 49.55 | 52.18 | 67.57 | 65.30 |
| LLM.int4() | 4 | 6 | 34.32 | 56.00 | 39.67 | 50.09 | 39.66 | 20.89 | 16.11 | 35.69 | 32.49 |
| SmoothQuant | 4 | 6 | 26.81 | 42.67 | 35.62 | 41.98 | 35.59 | 16.08 | 7.31 | 32.59 | 26.71 |
| SmoothQuant+ | 4 | 6 | 131.45 | 176.46 | 130.24 | 35.98 | 30.51 | 17.36 | 18.77 | 33.35 | 27.19 |
| LoRC | 4 | 6 | 23.03 | 39.63 | 29.68 | 45.68 | 37.63 | 8.90 | 26.29 | 30.85 | 29.87 |
| L2QER | 4 | 6 | 16.51 | 27.01 | 22.46 | 63.14 | 47.80 | 21.85 | 30.24 | 51.52 | 42.91 |
| ASER (w/o A.S.) | 4 | 6 | 15.86 | 26.38 | 21.25 | 67.37 | 54.92 | 40.34 | 30.56 | 54.57 | 49.52 |
| ASER (w/ A.S.) | 4 | 6 | 11.03 | 17.57 | 16.13 | 75.31 | 62.37 | 32.07 | 34.49 | 43.63 | 49.57 |
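
The table does not say how the Avg. column is derived. The values are consistent with an unweighted mean of the five accuracy columns (ARC-e through PIQA). This is an inference from the numbers, not something the page states. A minimal sketch that checks this for three rows copied from the table; the helper name `mean_accuracy` and the row labels are illustrative only:

```python
# Per-task accuracies (ARC-e, ARC-c, MMLU, Hella, PIQA) and the reported Avg.,
# copied from three rows of the table above.
rows = {
    "Qwen1.5-7B (W16A16)": ([87.48, 76.61, 49.78, 61.49, 66.76], 68.42),
    "ASER w/ A.S. (W4A8)": ([84.66, 72.54, 49.55, 52.18, 67.57], 65.30),
    "ASER w/ A.S. (W4A6)": ([75.31, 62.37, 32.07, 34.49, 43.63], 49.57),
}


def mean_accuracy(scores):
    """Unweighted mean of the per-task accuracies, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


for name, (scores, reported) in rows.items():
    computed = mean_accuracy(scores)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this reading, the perplexity columns do not enter the average, so a method can rank differently on Avg. than on perplexity. For example, at W4A6 the two ASER variants have nearly the same Avg. (49.52 vs. 49.57) while their WikiText2 perplexities differ substantially (15.86 vs. 11.03).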