Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search
Published on November 18
Categories: cs.CL, cs.AI
Released Date: November 18, 2024
Authors: Jinhao Jiang¹, Zhipeng Chen¹, Yingqian Min¹, Jie Chen¹, Xiaoxue Cheng¹, Jiapeng Wang¹, Yiru Tang¹, Haoxiang Sun², Jia Deng¹, Wayne Xin Zhao¹, Zheng Liu³, Dong Yan⁴, Jian Xie⁴, Zhongyuan Wang³, Ji-Rong Wen¹
Affiliations: ¹Gaoling School of Artificial Intelligence, Renmin University of China; ²School of Information, Renmin University of China; ³BAAI; ⁴Baichuan AI

| Method | MATH-OAI Acc (%) | MATH-OAI Gain (%) | GSM-Hard Acc (%) | GSM-Hard Gain (%) | OlympiadBench Acc (%) | OlympiadBench Gain (%) | College Math Acc (%) | College Math Gain (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| baseline | 48.2 | - | 38.4 | - | 17.9 | - | 34.1 | - |
| w/ CoT | 58.3 | +21.0 | 38.5 | +0.3 | 19.2 | +7.3 | 39.0 | +14.7 |
| w/ BoN | 69.0 | +43.2 | 38.8 | +1.0 | 30.3 | +69.3 | 43.0 | +26.0 |
| w/ T-Search | 70.8 | +46.9 | 41.2 | +7.3 | 34.3 | +91.6 | 44.8 | +31.4 |
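
The Gain (%) columns are not defined in this excerpt. The reported values are consistent with a relative improvement over the baseline accuracy, i.e. (method − baseline) / baseline × 100, and the minimal Python sketch below recomputes the MATH-OAI gains under that assumption. The function name `relative_gain` and the variable names are hypothetical, introduced only for illustration.

```python
def relative_gain(baseline_acc: float, method_acc: float) -> float:
    """Relative improvement over the baseline, in percent.

    Assumption: this is how the Gain (%) columns are derived; the
    excerpt does not state the formula explicitly.
    """
    return (method_acc - baseline_acc) / baseline_acc * 100.0


# Accuracies (%) copied from the MATH-OAI column of the table.
baseline = 48.2
math_oai = {"w/ CoT": 58.3, "w/ BoN": 69.0, "w/ T-Search": 70.8}

# Round to one decimal place, matching the precision used in the table.
gains = {name: round(relative_gain(baseline, acc), 1)
         for name, acc in math_oai.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain}")
```

Because the accuracies in the table are themselves rounded to one decimal place, gains recomputed this way can differ from the reported ones by a few tenths of a point (for example in the College Math column).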