Synthetic continued pretraining
Venue: ICLR
Release date: September 26, 2024
Authors: Anonymous
OpenReview: https://openreview.net/pdf/10719f17a858944be2528f67e3968e6936a72cb7.pdf
| Study | Domain | Model Parameter Count | Total Unique CPT Tokens |
|---|---|---|---|
| Minerva (Lewkowycz et al., 2022) | STEM | 8B, 62B, 540B | 26B–38.5B |
| MediTron (Chen et al., 2023) | Medicine | 7B, 70B | 46.7B |
| Code Llama (Rozière et al., 2024) | Code | 7B, 13B, 34B | 520B–620B |
| Llemma (Azerbayev et al., 2024) | Math | 7B, 34B | 50B–55B |
| DeepSeekMath (Shao et al., 2024) | Math | 7B | 500B |
| SaulLM-7B (Colombo et al., 2024b) | Law | 7B | 30B |
| SaulLM-{54, 141}B (Colombo et al., 2024a) | Law | 54B, 141B | 520B |
| HEAL (Yuan et al., 2024a) | Medicine | 13B | 14.9B |
| Our setting | Articles & Books | 8B | 1.3M |