Large Language Models Are Overparameterized Text Encoders
Categories: cs.MA, cs.AI
Released Date: October 18, 2024
Authors: Thennal D K¹, Tim Fischer², Chris Biemann²
Aff.: ¹IIIT Kottayam; ²University of Hamburg

| | LLaMA-3-8B (Large) | LLaMA-3-8B (Small) | Mistral-7B (Large) | Mistral-7B (Small) | Qwen2-7B (Large) | Qwen2-7B (Small) | Phi3-4B (Large) | Phi3-4B (Small) |
|---|---|---|---|---|---|---|---|---|
| Layers | 25 (-7) | 5 (-27) | 22 (-10) | 8 (-24) | 25 (-3) | 10 (-18) | 25 (-7) | 8 (-24) |
| Params (B) | 5.9 (78%) | 1.18 (16%) | 4.92 (69%) | 1.79 (25%) | 6.35 (89%) | 2.54 (36%) | 2.91 (78%) | 0.93 (25%) |
| Score | 63.5 (-1.5) | 58.1 (-6.9) | 63.1 (-0.1) | 59.0 (-4.2) | 64.5 (+0.3) | 60.9 (-3.3) | 61.7 (-0.1) | 55.5 (-6.3) |
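
The parenthesized values in the table can be cross-checked with a little arithmetic. The sketch below assumes that the Layers parentheses give the number of layers removed from the full model and that the Params percentages are relative to the full model's parameter count; neither reading is stated explicitly in the table. The names `layers`, `reported_pct`, and `layer_summary` are illustrative and not taken from the paper. Under those assumptions, it recovers each full model's layer count and compares the share of layers kept with the reported parameter share.

```python
# (layers kept, layers removed) for the Large and Small variants, copied from the table.
layers = {
    "LLaMA-3-8B": [(25, 7), (5, 27)],
    "Mistral-7B": [(22, 10), (8, 24)],
    "Qwen2-7B": [(25, 3), (10, 18)],
    "Phi3-4B": [(25, 7), (8, 24)],
}

# Percentages from the Params row, in the same Large/Small order.
reported_pct = {
    "LLaMA-3-8B": [78, 16],
    "Mistral-7B": [69, 25],
    "Qwen2-7B": [89, 36],
    "Phi3-4B": [78, 25],
}


def layer_summary(kept, removed):
    """Return (full-model layer count, percent of layers kept, rounded)."""
    total = kept + removed
    return total, round(100 * kept / total)


summary = {
    model: [layer_summary(kept, removed) for kept, removed in variants]
    for model, variants in layers.items()
}

for model, variants in summary.items():
    for name, (total, pct), reported in zip(("Large", "Small"), variants, reported_pct[model]):
        print(f"{model} {name}: {total} layers in full model, "
              f"{pct}% of layers kept, {reported}% of params reported")
```

For every column, the rounded share of layers kept equals the percentage in the Params row, so under the stated assumptions the two rows describe the same reduction in two different units.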