The Super Weight in Large Language Models
Categories: cs.CL, cs.AI
Release Date: November 11, 2024
Authors: Mengxia Yu¹, De Wang², Qi Shan², Colorado Reed², Alvin Wan²
Affiliations: ¹University of Notre Dame; ²Apple

| Llama-7B | ARC-c | ARC-e | HellaSwag | LAMBADA | PIQA | SciQ | Winogrande | Avg | C4 (PPL) | Wiki-2 (PPL) |
|---|---|---|---|---|---|---|---|---|---|---|
| Original | 41.81 | 75.29 | 56.93 | 73.51 | 78.67 | 94.60 | 70.01 | 70.11 | 7.08 | 5.67 |
| Prune SW | 19.80 | 39.60 | 30.68 | 0.52 | 59.90 | 39.40 | 56.12 | 35.14 | 763.65 | 1211.11 |
| Prune Non-SW | 41.47 | 74.83 | 56.35 | 69.88 | 78.51 | 94.40 | 69.14 | 69.22 | 7.57 | 6.08 |
| Prune SW, +SA | 26.60 | 54.63 | 56.93 | 12.79 | 67.95 | 61.70 | 70.01 | 50.09 | 476.23 | 720.57 |
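
The table's arithmetic can be checked with a short script. The sketch below copies the numbers from the table and rests on two assumptions that the page does not state: that the average column is the unweighted mean of the seven accuracy columns, and that the C4 and Wiki-2 columns are perplexities (lower is better). The names `ROWS`, `mean_accuracy`, and `ppl_ratio` are hypothetical helpers introduced only for this illustration.

```python
# Values copied from the table. "acc" holds the seven accuracy columns in
# table order; "avg", "c4", and "wiki2" hold the last three columns.
ROWS = {
    "Original": {
        "acc": [41.81, 75.29, 56.93, 73.51, 78.67, 94.60, 70.01],
        "avg": 70.11, "c4": 7.08, "wiki2": 5.67,
    },
    "Prune SW": {
        "acc": [19.80, 39.60, 30.68, 0.52, 59.90, 39.40, 56.12],
        "avg": 35.14, "c4": 763.65, "wiki2": 1211.11,
    },
    "Prune Non-SW": {
        "acc": [41.47, 74.83, 56.35, 69.88, 78.51, 94.40, 69.14],
        "avg": 69.22, "c4": 7.57, "wiki2": 6.08,
    },
    "Prune SW, +SA": {
        "acc": [26.60, 54.63, 56.93, 12.79, 67.95, 61.70, 70.01],
        "avg": 50.09, "c4": 476.23, "wiki2": 720.57,
    },
}


def mean_accuracy(acc):
    """Unweighted mean of the accuracy columns (assumed definition of the average)."""
    return sum(acc) / len(acc)


def ppl_ratio(row, baseline, key):
    """How many times larger a row's value is than the baseline's (assumed perplexity)."""
    return row[key] / baseline[key]


def main():
    baseline = ROWS["Original"]
    for name, row in ROWS.items():
        recomputed = mean_accuracy(row["acc"])
        print(
            f"{name:14s} "
            f"avg reported={row['avg']:.2f} recomputed={recomputed:.2f}  "
            f"C4 x{ppl_ratio(row, baseline, 'c4'):.1f}  "
            f"Wiki-2 x{ppl_ratio(row, baseline, 'wiki2'):.1f}"
        )


if __name__ == "__main__":
    main()
```

Under those assumptions, the recomputed means agree with the reported averages to within 0.01, and the ratios put the gap between the "Prune SW" and "Prune Non-SW" rows on a common scale relative to the original model.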