Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies
cs.CL
cs.AI
Published: October 30, 2024
Authors: Suchir Salhan¹, Richard Diehl Martinez, Zébulon Goriely, Paula Buttery²
Affiliations: ¹Department of Computer Science & Technology, University of Cambridge, U.K.; ²ALTA Institute, University of Cambridge, U.K.
|        | Model         | English  | Japanese | Chinese | French | German |
|--------|---------------|----------|----------|---------|--------|--------|
| Non-CL | SSLM (wiki)   | 64.60%   | 55.42%   | 48.01%  | 70.68% | 59.63% |
|        | Mao-BabyBERTa | 75.48% * | 61.21%   | 51.32%  | 80.00% | 68.78% |
| CL     | Growing       | 71.13%   | 79.30%   | 56.22%  | 76.21% | 71.13% |
|        | Inwards       | 71.05%   | 81.32%   | 54.26%  | 79.01% | 69.34% |
|        | MMM (upos)    | 74.22%   | 87.31%   | 58.79%  | 75.93% | 73.25% |
|        | MMM (sem)     | 77.35%   | 55.01%   |         |        |        |