AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment
cs.AI
Released Date: November 15, 2024
Authors: Yonggan Fu¹, Zhongzhi Yu¹, Junwei Li¹, Jiayi Qian¹, Yongan Zhang¹, Xiangchi Yuan¹, Dachuan Shi¹, Roman Yakunin¹, Yingyan Celine Lin
Affiliation: ¹Georgia Institute of Technology
| Ratio | Method | MMLU | Average | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA |
|---|---|---|---|---|---|---|---|---|---|---|
| 80% | LLM-Pruner [7] | 29.63 | 56.95 | 58.53 | 76.39 | 65.80 | 60.38 | 64.60 | 34.56 | 38.40 |
| | FLAP [8] | 40.21 | 60.98 | 73.64 | 74.81 | 68.27 | 65.43 | 66.20 | 37.88 | 40.60 |
| | Shortened LLaMA [9] | 26.45 | 58.72 | 62.17 | 76.01 | 68.22 | 58.88 | 68.98 | 38.40 | 38.40 |
| | AmoebaLLM (Ours) | 40.70 | 62.29 | 72.70 | 76.80 | 70.60 | 67.60 | 68.30 | 38.80 | 41.20 |
| | AmoebaLLM† (Ours) | 42.40 | 62.37 | 72.50 | 76.30 | 70.80 | 66.90 | 70.30 | 40.20 | 39.60 |
| 65% | LLM-Pruner [7] | 23.15 | 54.09 | 60.73 | 74.97 | 58.57 | 57.85 | 55.68 | 33.02 | 37.80 |
| | FLAP [8] | 33.28 | 56.12 | 65.75 | 70.08 | 60.57 | 61.33 | 62.25 | 33.87 | 39.00 |
| | Shortened LLaMA [9] | 24.89 | 52.57 | 62.32 | 72.03 | 55.10 | 52.41 | 59.47 | 30.63 | 36.00 |
| | AmoebaLLM (Ours) | 36.00 | 56.96 | 72.10 | 70.70 | 59.70 | 63.20 | 62.20 | 34.00 | 36.80 |
| | AmoebaLLM† (Ours) | 36.20 | 57.26 | 70.50 | 70.90 | 61.50 | 62.70 | 63.50 | 34.50 | 37.20 |
| 50% | LLM-Pruner [7] | 22.90 | 47.52 | 61.83 | 67.79 | 43.31 | 51.22 | 46.13 | 28.16 | 34.20 |
| | FLAP [8] | 27.67 | 51.12 | 59.45 | 67.30 | 51.33 | 56.75 | 55.43 | 31.57 | 36.00 |
| | Shortened LLaMA [9] | 24.76 | 47.35 | 62.23 | 66.00 | 43.60 | 51.54 | 50.63 | 26.45 | 31.00 |
| | AmoebaLLM (Ours) | 30.60 | 52.19 | 65.70 | 66.10 | 51.30 | 60.10 | 56.60 | 31.50 | 34.00 |
| | AmoebaLLM† (Ours) | 32.20 | 52.63 | 64.70 | 66.70 | 53.00 | 60.30 | 58.00 | 30.10 | 35.60 |
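
The table does not say how the Average column is aggregated. The reported values are consistent with an unweighted mean over the seven tasks from BoolQ through OBQA, with MMLU left out. The sketch below checks that reading on three rows copied from the table; the aggregation rule is an assumption inferred from the numbers, and the helper names `mean_score` and `matches_reported` are hypothetical.

```python
# Three rows copied from the table:
# label -> (reported Average, [BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA])
rows = {
    "80% LLM-Pruner": (56.95, [58.53, 76.39, 65.80, 60.38, 64.60, 34.56, 38.40]),
    "80% AmoebaLLM": (62.29, [72.70, 76.80, 70.60, 67.60, 68.30, 38.80, 41.20]),
    "50% AmoebaLLM†": (52.63, [64.70, 66.70, 53.00, 60.30, 58.00, 30.10, 35.60]),
}


def mean_score(scores):
    """Unweighted mean of the per-task scores (assumed aggregation rule)."""
    return sum(scores) / len(scores)


def matches_reported(reported, scores, tol=0.006):
    """True if the recomputed mean agrees with the reported value.

    The tolerance allows for the table's rounding to two decimals.
    """
    return abs(mean_score(scores) - reported) < tol


for label, (reported, scores) in rows.items():
    recomputed = mean_score(scores)
    status = "ok" if matches_reported(reported, scores) else "mismatch"
    print(f"{label}: reported {reported:.2f}, recomputed {recomputed:.2f} ({status})")
```

Because MMLU sits outside this mean, a method can lead on Average while trailing on MMLU, so the two columns are best read separately.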