MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
Categories: cs.CL, cs.AI, cs.CV, cs.LG
Released Date: November 6, 2024
Authors: Masakazu Yoshimura¹, Teruaki Hayashi¹, Yota Maeda¹
Aff.: ¹Sony Group Corporation, Japan

| Model | Method | #Params (K) | Natural | Specialized | Structured | Avg. |
|---|---|---|---|---|---|---|
| ViT-S | Scratch | 21,704 | 10.66 | 56.12 | 24.83 | 26.20 |
| | Full | 21,704 | 51.79 | 72.79 | 45.27 | 53.47 |
| | Linear Probing | 9 | 60.87 | 78.13 | 30.57 | 51.74 |
| | FacT-TK | 16 | 72.87 | 82.34 | 54.10 | 66.96 |
| | LoRA | 628 | 73.60 | 82.22 | 57.61 | 68.68 |
| | Adaptformer | 333 | 73.63 | 83.15 | 57.80 | 68.97 |
| | Adapter+ | 122 | 74.68 | 83.57 | 58.82 | 69.87 |
| Vim-S | Scratch | 25,450 | 8.33 | 49.87 | 28.16 | 25.42 |
| | Full | 25,450 | 59.35 | 68.74 | 34.39 | 47.08 |
| | Linear Probing | 9 | 62.50 | 77.25 | 31.97 | 52.75 |
| | CLS-token-tuning | 9 | 62.50 | 77.20 | 32.20 | 52.84 |
| | Pos-embed-tuning | 84 | 64.25 | 74.60 | 39.77 | 56.12 |
| | Bias-tuning | 37 | 68.94 | 79.16 | 45.05 | 61.03 |
| | D-tuning | 45 | 67.14 | 78.56 | 40.10 | 58.16 |
| | A-tuning | 598 | 72.01 | 80.72 | 49.59 | 64.40 |
| | Conv1d-tuning | 156 | 74.33 | 82.45 | 57.83 | 69.09 |
| | Prompt-tuning (w/o proj) | 12 | 69.92 | 79.20 | 47.75 | 62.54 |
| | Prompt-tuning | 307 | 63.19 | 77.88 | 35.82 | 54.76 |
| | Affix-tuning (w/o proj) | 230 | 63.90 | 77.66 | 34.21 | 54.30 |
| | Affix-tuning | 117,000 | 75.84 | 83.29 | 58.94 | 70.29 |
| | Additional-scan | 672 | 74.63 | 82.68 | 56.40 | 68.65 |
| | ParallelAdapter | 663 | 76.10 | 83.97 | 59.97 | 70.96 |
| | LoRA(embed) | 45 | 64.66 | 77.53 | 43.83 | 58.60 |
| | LoRA(x_proj) | 2,540 | 74.41 | 81.92 | 54.88 | 67.77 |
| | LoRA(dt_proj) | 2,442 | 75.35 | 83.05 | 57.12 | 69.30 |
| | LoRA(out_proj) | 2,663 | 76.42 | 83.96 | 60.08 | 71.12 |
| | LoRA(in_proj) | 1,483 | 76.58 | 84.08 | 60.16 | 71.25 |
| | LoRA_p(d) | 2,442 | 73.25 | 80.91 | 50.93 | 65.46 |
| | LoRA_p(C) | 2,417 | 72.78 | 81.57 | 51.35 | 65.61 |
| | LoRA_p(B) | 2,417 | 72.95 | 81.66 | 52.26 | 66.07 |
| | LoRA_p(Z) | 1,778 | 76.15 | 84.26 | 59.72 | 70.94 |
| | LoRA_p(X) | 1,778 | 76.64 | 83.89 | 60.84 | 71.52 |
| | All (w/ proj) | 119,765 | 74.67 | 82.96 | 53.92 | 67.68 |
| | Hybrid (w/ proj) | 117,236 | 77.00 | 84.41 | 61.55 | 72.05 |
| | Hybrid (w/o proj) | 1,044 | 76.85 | 84.42 | 61.06 | 71.80 |
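
The table does not say how the Avg. column is derived from the three group scores. A minimal sketch, assuming the groups follow the usual VTAB-1k split of 7 Natural, 4 Specialized, and 8 Structured datasets (an assumption, not stated in the table): under that weighting, a per-dataset mean reproduces the reported Avg. for the sample rows below to within rounding. The names `GROUP_SIZES` and `weighted_avg` are hypothetical helpers introduced only for this illustration.

```python
# Assumed number of datasets per group (standard VTAB-1k split);
# the table itself does not state these counts.
GROUP_SIZES = {"natural": 7, "specialized": 4, "structured": 8}


def weighted_avg(natural, specialized, structured):
    """Per-dataset mean reconstructed from the three group means."""
    total = (
        GROUP_SIZES["natural"] * natural
        + GROUP_SIZES["specialized"] * specialized
        + GROUP_SIZES["structured"] * structured
    )
    return total / sum(GROUP_SIZES.values())


# (Natural, Specialized, Structured, reported Avg.) copied from the table.
sample_rows = {
    "ViT-S / Scratch": (10.66, 56.12, 24.83, 26.20),
    "ViT-S / Full": (51.79, 72.79, 45.27, 53.47),
    "Vim-S / Linear Probing": (62.50, 77.25, 31.97, 52.75),
    "Vim-S / Hybrid (w/o proj)": (76.85, 84.42, 61.06, 71.80),
}

for name, (nat, spec, struct, reported) in sample_rows.items():
    recomputed = weighted_avg(nat, spec, struct)
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

Because the group scores are themselves rounded to two decimals, a recomputed value can differ from the reported one in the last digit.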