Published at November 21
Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training
cs.CL
Released Date: November 21, 2024
Authors: Zheheng Luo¹, Xin Zhang², Xiao Liu², Haoling Li³, Yeyun Gong², Chen Qi², Peng Cheng²
Affiliations: ¹The University of Manchester; ²Microsoft; ³Tsinghua University

Math benchmarks:

| Model | GSM8K | MATH | Minerva | MATHQA | ASDiv | SVAMP |
|---|---|---|---|---|---|---|
| CodeLlama | 12.4 | 6.0 | 5.20 | 14.10 | 50.5 | 44.5 |
| CodeLlama-CPT | 28.9 | 11.1 | 9.80 | 24.0 | 61.1 | 56.4 |
| CodeLlama-Velocitune | 28.4 | 11.7 | 11.4 | 25.1 | 60.9 | 56.1 |

Math benchmarks (continued) and code benchmarks:

| Model | MMLU-STEM (Math) | SAT (Math) | Math Avg. | HumanEval (Code) | MBPP (Code) | Code Avg. |
|---|---|---|---|---|---|---|
| CodeLlama | 20.9 | 18.8 | 21.6 | 30.50 | 43.20 | 36.80 |
| CodeLlama-CPT | 36.0 | 46.9 | 34.3 | 26.20 | 44.80 | 35.50 |
| CodeLlama-Velocitune | 37.3 | 56.2 | 35.9 (+1.6%) | 34.10 | 44.40 | 39.3 (+3.8%) |
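
The averages and parenthesized gains in the table can be sanity-checked with a few lines of Python. This is only a sketch under two assumptions that the page does not state: that "Math Avg." and "Code Avg." are unweighted means over the eight math and two code benchmarks, and that the gains are absolute differences relative to the CodeLlama-CPT row. The names `MATH_SCORES`, `CODE_SCORES`, `BASELINE`, `mean`, and `summarize` are illustrative and do not come from the paper.

```python
from decimal import Decimal

# Scores copied from the table, kept as strings so Decimal parses them exactly.
# Math order: GSM8K, MATH, Minerva, MATHQA, ASDiv, SVAMP, MMLU-STEM, SAT.
MATH_SCORES = {
    "CodeLlama": ["12.4", "6.0", "5.20", "14.10", "50.5", "44.5", "20.9", "18.8"],
    "CodeLlama-CPT": ["28.9", "11.1", "9.80", "24.0", "61.1", "56.4", "36.0", "46.9"],
    "CodeLlama-Velocitune": ["28.4", "11.7", "11.4", "25.1", "60.9", "56.1", "37.3", "56.2"],
}
# Code order: HumanEval, MBPP.
CODE_SCORES = {
    "CodeLlama": ["30.50", "43.20"],
    "CodeLlama-CPT": ["26.20", "44.80"],
    "CodeLlama-Velocitune": ["34.10", "44.40"],
}

# Assumption: the parenthesized gains are measured against this row.
BASELINE = "CodeLlama-CPT"


def mean(scores):
    """Unweighted mean of a list of score strings, computed exactly."""
    values = [Decimal(s) for s in scores]
    return sum(values) / len(values)


def summarize(table, baseline=BASELINE):
    """Map each model to (mean score, absolute gain over the baseline mean)."""
    means = {model: mean(scores) for model, scores in table.items()}
    return {model: (avg, avg - means[baseline]) for model, avg in means.items()}


if __name__ == "__main__":
    for name, table in (("Math", MATH_SCORES), ("Code", CODE_SCORES)):
        for model, (avg, gain) in summarize(table).items():
            print(f"{name:4} {model:22} avg={avg:.2f} gain_vs_CPT={gain:+.2f}")
```

Under these assumptions the unrounded Velocitune means are 35.8875 (math) and 39.25 (code), against 34.275 and 35.5 for CodeLlama-CPT. The resulting gaps of about 1.61 and 3.75 agree with the reported +1.6 and +3.8 after rounding, which suggests the "%" figures are percentage-point differences rather than relative improvements.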