LLM-BIP: Structured Pruning for Large Language Models with Block-Wise Forward Importance Propagation
cs.CL
cs.AI
Released Date: December 9, 2024

| Pruning Ratio | Method | WikiText2 (PPL) | PTB (PPL) | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Ratio = 0% | Vicuna-7B | 16.11 | 61.37 | 76.57 | 77.75 | 70.64 | 67.40 | 65.11 | 41.21 | 40.80 | 62.78 |
| Ratio = 20% w/o tune | Random | 36.02 | 106.90 | 61.47 | 70.89 | 54.67 | 56.27 | 55.60 | 31.74 | 34.60 | 52.18 |
| | Magnitude | 2189.45 | 2549.75 | 55.90 | 56.15 | 32.37 | 51.85 | 30.01 | 28.41 | 28.20 | 40.41 |
| | LLM-Pruner | 25.74 | 92.88 | 61.70 | 75.30 | 63.75 | 56.20 | 63.22 | 36.60 | 37.00 | 56.25 |
| | Importance Propagation | 88.97 | 202.86 | 63.02 | 62.13 | 36.17 | 57.46 | 50.29 | 28.84 | 19.40 | 45.33 |
| | Wanda | 47.53 | 144.41 | 55.75 | 71.16 | 44.39 | 57.06 | 64.60 | 34.47 | 26.60 | 50.58 |
| | LLM-BIP | 27.08 | 92.73 | 72.54 | 75.84 | 71.13 | 64.16 | 65.53 | 42.24 | 41.60 | 61.86 |
| Ratio = 20% w/ tune | LLM-Pruner | 19.69 | 78.25 | 63.33 | 76.17 | 65.13 | 60.22 | 62.84 | 37.12 | 39.20 | 57.71 |
| | Importance Propagation | 88.97 | 202.86 | 65.47 | 69.31 | 54.87 | 59.51 | 53.24 | 30.89 | 33.20 | 52.36 |
| | Wanda | 47.53 | 144.41 | 68.20 | 75.46 | 66.08 | 64.01 | 65.24 | 37.80 | 40.00 | 59.54 |
| | LLM-BIP | 18.15 | 60.20 | 72.02 | 76.61 | 70.29 | 64.80 | 65.87 | 39.68 | 40.20 | 61.35 |
| Ratio = 50% w/o tune | LLM-Pruner | 143.85 | 427.77 | 53.76 | 59.79 | 34.86 | 50.28 | 33.29 | 27.30 | 34.60 | 41.98 |
| | Wanda | 331.16 | 518.72 | 56.61 | 57.94 | 35.34 | 49.88 | 37.46 | 26.54 | 26.80 | 41.51 |
| | LLM-BIP | 80.38 | 189.82 | 51.74 | 61.10 | 36.28 | 49.88 | 41.03 | 24.43 | 29.60 | 42.01 |
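
The Average column appears to be the unweighted mean of the seven accuracy benchmarks (BoolQ through OBQA), leaving out the WikiText2 and PTB columns, which read as perplexities. The page does not state this, so treat it as an assumption; the sketch below simply checks it against a few rows copied from the table. The names `ACCURACY_TASKS`, `rows`, and `average_accuracy` are illustrative and do not come from the paper.

```python
# Assumption: "Average" = mean of the seven accuracy columns, rounded to 2 decimals.
ACCURACY_TASKS = ["BoolQ", "PIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA"]

# Accuracy values copied from the table, in ACCURACY_TASKS order.
rows = {
    "Vicuna-7B (0%)": [76.57, 77.75, 70.64, 67.40, 65.11, 41.21, 40.80],
    "LLM-Pruner (20%, w/o tune)": [61.70, 75.30, 63.75, 56.20, 63.22, 36.60, 37.00],
    "LLM-BIP (20%, w/o tune)": [72.54, 75.84, 71.13, 64.16, 65.53, 42.24, 41.60],
}


def average_accuracy(scores):
    """Return the unweighted mean of the per-task accuracies, rounded to 2 decimals."""
    if len(scores) != len(ACCURACY_TASKS):
        raise ValueError("expected one score per accuracy task")
    return round(sum(scores) / len(scores), 2)


if __name__ == "__main__":
    for name, scores in rows.items():
        print(f"{name}: {average_accuracy(scores):.2f}")
```

For these three rows the computed means match the reported Average values (62.78, 56.25, and 61.86), which supports the reading that perplexity is excluded from the average.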