Flaming-hot Initiation with Regular Execution Sampling for Large Language Models
cs.AI
q-bio.QM
Released Date: October 28, 2024
Authors: Weizhe Chen¹, Zhicheng Zhang², Guanlin Liu³, Renjie Zheng³, Wenlei Shi³, Chen Dun³, Zheng Wu³, Xing Jin³, Lin Yan³
Affiliations: ¹University of Southern California; ²Carnegie Mellon University; ³ByteDance

| Dataset | Model | Regular Pass% | Regular #EA | FIRE Pass% | FIRE #EA |
|---|---|---|---|---|---|
| GSM8K | DeepSeek | 97.57 | 2.26 | 98.71 | 2.76 |
| GSM8K | Gemma-2 | 86.81 | 3.87 | 87.57 | 4.01 |
| GSM8K | Qwen2 | 95.90 | 2.58 | 98.25 | 3.17 |
| GSM8K | Qwen2-RL | 96.90 | 2.63 | 97.90 | 3.26 |
| MATH | DeepSeek | 76.16 | 5.63 | 78.16 | 7.89 |
| MATH | Gemma-2 | 49.20 | 9.24 | 51.48 | 10.39 |
| MATH | Qwen2 | 76.60 | 7.44 | 79.08 | 9.03 |
| MATH | Qwen2.5-72B | 79.30 | 2.39 | 80.40 | 2.60 |