Loss-to-Loss Prediction: Scaling Laws for All Datasets
Categories: cs.AI, cs.CL, stat.ML
Released Date: November 19, 2024
Authors: David Brandfonbrener¹, Nikhil Anand¹, Nikhil Vyas², Eran Malach¹, Sham Kakade³
Affiliations: ¹Kempner Institute, Harvard University; ²SEAS, Harvard University; ³Kempner Institute and SEAS, Harvard University

| Target Loss | General Train-to-Test | Test-to-Test | FLOPs-to-loss | Scaling law | Identity |
|---|---|---|---|---|---|
| Hellaswag | 1.6% | 1.2% | 1.7% | 2.1% | 9.2% |
| ARC-Easy | 10.2% | 17.6% | 14.3% | 16.8% | 24.8% |
| MMLU-Humanities | 2.8% | 23.1% | 4.4% | 4.7% | 11.0% |
| MMLU-STEM | 6.4% | 6.4% | 5.9% | 7.6% | 11.5% |