AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning
cs.CL
Release Date: November 25, 2024
Authors: Amy Xin¹, Jinxin Liu¹, Zijun Yao¹, Zhicheng Li¹, Shulin Cao², Lei Hou¹, Juanzi Li¹
Affiliations: ¹Tsinghua University; ²Zhipu AI

| Method | HotpotQA Overall | HotpotQA Bridge | HotpotQA Comp. | 2Wikimultihop Overall | 2Wikimultihop Bridge | 2Wikimultihop Infer. | 2Wikimultihop Comp. | 2Wikimultihop B.C. | Musique Overall | Musique 2hop | Musique 3hop | Musique 4hop |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Without Retrieval (Closebook) | | | | | | | | | | | | |
| Standard Prompting | 50.02 | 45.22 | 72.50 | 41.09 | 14.44 | 32.24 | 71.63 | 63.13 | 19.31 | 21.84 | 16.03 | 17.81 |
| CoT | 58.38 | 55.48 | 71.91 | 58.32 | 38.24 | 54.83 | 77.27 | 77.17 | 29.03 | 36.26 | 24.07 | 17.38 |
| With Retrieval (Wikipedia) | | | | | | | | | | | | |
| Standard RAG | 60.31 | 61.88 | 52.97 | 47.94 | 34.96 | 54.32 | 56.96 | 57.27 | 22.07 | 28.51 | 15.81 | 14.77 |
| IRCoT | 60.20 | 58.00 | 69.40 | 63.80 | 46.20 | 45.50 | 91.60 | 79.00 | 34.20 | 44.20 | 26.30 | 20.10 |
| SearChain | 59.04 | 69.73 | 51.12 | 63.10 | 48.85 | 50.38 | 81.41 | 84.21 | 31.68 | 38.89 | 28.38 | 17.28 |
| ProbTree | 65.91 | 65.50 | 67.81 | 69.32 | 52.45 | 64.08 | 88.17 | 90.00 | 34.92 | 42.52 | 30.38 | 21.53 |
| AtomR (ours) | 71.27 | 68.96 | 82.07 | 78.72 | 59.07 | 80.04 | 97.48 | 93.33 | 36.11 | 45.63 | 29.94 | 20.15 |