A surprisal oracle for when every layer counts
cs.CL
Released Date: December 4, 2024
Authors: Xudong Hong¹, Sharid Loáiciga², Asad Sayeed²
Affiliations: ¹Dept. of Language Science and Technology and Dept. of Computer Science, Saarland University; ²Dept. of Philosophy, Linguistics, and Theory of Science, University of Gothenburg

| Model | BLiMP suppl. | BLiMP filtered | EWOK | GLUE |
|---|---|---|---|---|
| ELC-BERT (original) | 67.9 | 80.5 | - | 75.3 |
| BabyLlama | 59.5 | 69.8 | 50.7 | 63.3 |
| LTG-BERT | 60.8 | 60.6 | 48.9 | 60.3 |
| ELC-BERT B32 | 50.1 | 47.9 | 65.2 | 63.4 |
| ELC-BERT B512 | 47.8 | 49.1 | 64.9 | 61.0 |
| ELC-BERT ACLM-D7 | 47.8 | 51.3 | 70.0 | 64.8 |
| ELC-BERT ACLM-D32 | 51.1 | 50.7 | 69.8 | 65.7 |
| ELC-BERT ACLM-D64 | 51.1 | 51.1 | 71.0 | 64.8 |
| ELC-BERT ACLM-D128 | 50.0 | 51.8 | 72.1 | 63.5 |