Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Published on October 18
cs.AI
90-02, 90B40, 90C27
G.1.6; G.2.1; I.2.8
Released Date: October 18, 2024
Authors: Xiaochuan Li (1), Zichun Yu (2), Chenyan Xiong (2)
Aff.: (1) School of Software, Tsinghua University; (2) Language Technologies Institute, Carnegie Mellon University

In-domain benchmarks: Alpaca Eval 2.0 (LC-WR, WR) and MT-Bench (score). Out-of-domain benchmarks (accuracy): MMLU, GPQA, ARC-C, GSM8K, HellaSwag.

8B Setting: Student=Llama3-8B

| Methods | Alpaca Eval 2.0 LC-WR | Alpaca Eval 2.0 WR | MT-Bench Score | MMLU | GPQA | ARC-C | GSM8K | HellaSwag |
|---|---|---|---|---|---|---|---|---|
| No fine-tuning | 2.09% | 3.39% | 5.597 | 62.15 | 24.33 | 57.85 | 51.25 | 81.96 |
| Self-Instruct | 50% | 50% | 6.490 | 62.42 | 31.92 | 59.98 | 58.76 | 80.93 |
| Self-Instruct∗ | 54.95% | 56.39% | 5.918 | 63.41 | 30.13 | 60.58 | 50.42 | 81.42 |
| Self-Reward∗, Iteration 1 | 51.87% | 55.38% | 6.713 | 62.46 | 28.19 | 59.84 | 53.60 | 81.04 |
| Self-Reward∗, Iteration 2 | 53.49% | 57.32% | 6.798 | 62.02 | 29.08 | 60.64 | 56.37 | 81.13 |
| LLM2LLM, Iteration 1 | 51.49% | 53.12% | 6.531 | 62.18 | 29.12 | 57.49 | 55.28 | 80.49 |
| LLM2LLM, Iteration 2 | 52.63% | 55.02% | 6.519 | 62.46 | 30.04 | 59.65 | 57.75 | 80.57 |
| Montessori-Instruct, Iteration 1 | 54.92% | 58.59% | 6.903 | 62.93 | 29.91 | 62.97 | 58.76 | 81.22 |
| Montessori-Instruct, Iteration 2 | 56.82% | 60.23% | 7.092 | 63.44 | 31.19 | 59.98 | 60.05 | 81.98 |

1.1B Setting: Student=Tinyllama-1.1B

| Methods | Alpaca Eval 2.0 LC-WR | Alpaca Eval 2.0 WR | MT-Bench Score | MMLU | GPQA | ARC-C | GSM8K | HellaSwag |
|---|---|---|---|---|---|---|---|---|
| No fine-tuning | 17.89% | 17.56% | 1.020 | 26.16 | 23.88 | 37.12 | 1.97 | 62.61 |
| Self-Instruct | 50% | 50% | 2.154 | 26.21 | 24.78 | 37.97 | 1.82 | 62.47 |
| Self-Instruct∗ | 54.02% | 55.02% | 1.928 | 26.64 | 24.33 | 38.82 | 2.20 | 63.17 |
| Self-Reward∗, Iteration 1 | 47.62% | 48.34% | 1.804 | 26.34 | 23.92 | 37.64 | 1.76 | 62.27 |
| Self-Reward∗, Iteration 2 | 46.48% | 46.95% | 1.717 | 26.09 | 24.62 | 38.03 | 1.76 | 62.79 |
| LLM2LLM, Iteration 1 | 52.03% | 52.75% | 2.243 | 25.87 | 24.51 | 36.86 | 2.24 | 62.15 |
| LLM2LLM, Iteration 2 | 51.64% | 53.52% | 2.192 | 25.62 | 24.84 | 36.74 | 2.31 | 62.08 |
| Montessori-Instruct, Iteration 1 | 53.25% | 51.77% | 2.485 | 26.23 | 23.92 | 37.97 | 2.35 | 62.59 |
| Montessori-Instruct, Iteration 2 | 54.52% | 54.97% | 2.504 | 26.35 | 24.88 | 38.11 | 2.91 | 63.55 |