Dynamic Strategy Planning for Efficient Question Answering with Large Language Models
Categories: cs.CL, cs.AI, cs.LG
Release date: October 30, 2024
Authors: Tanmay Parekh¹, Pradyot Prakash², Alexander Radovic², Akshay Shekher², Denis Savenkov²
Affiliations: ¹University of California, Los Angeles; ²Meta AI

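| Technique | HotpotQA EM | HotpotQA F1 | HotpotQA # T | HotpotQA # R | 2WikiMultihopQA EM | 2WikiMultihopQA F1 | 2WikiMultihopQA # T | 2WikiMultihopQA # R | Musique EM | Musique F1 | Musique # T | Musique # R |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |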
| Fixed-base Direct | 23.8 | 32.3 | 95 | 0 | 32.1 | 37.4 | 65 | 0 | 2.3 | 9.3 | 99 | 0 |
| Fixed-base Reason | 27.2 | 37.5 | 124 | 0 | 19.7 | 27.4 | 65 | 0 | 7.2 | 16.7 | 129 | 0 |
| Fixed-base Plan | 24.1 | 33.8 | 203 | 0 | 25.4 | 31.5 | 197 | 0 | 5.8 | 13.4 | 203 | 0 |
| Fixed-base Retrieval | 36.1 | 47.9 | 185 | 1 | 31.6 | 40.4 | 101 | 1 | 9.6 | 18.0 | 187 | 1 |
| Fixed-sft Direct | 24.1 | 34.3 | 9 | 0 | 32.6 | 38.4 | 10 | 0 | 2.4 | 9.0 | 17 | 0 |
| Fixed-sft Reason | 27.6 | 37.9 | 53 | 0 | 29.3 | 35.6 | 77 | 0 | 7.6 | 16.4 | 63 | 0 |
| Fixed-sft Plan | 26.3 | 36.0 | 105 | 0 | 26.9 | 34.7 | 116 | 0 | 6.6 | 15.0 | 117 | 0 |
| Fixed-sft Retrieval [ref] | 36.8 | 48.6 | 53 | 1 | 32.8 | 40.0 | 56 | 1 | 9.3 | 18.4 | 88 | 1 |
| Classifier | 32.6 | 43.9 | 34 | 0.59 | 36.0 | 43.1 | 28 | 0.45 | 8.0 | 17.5 | 82 | 0.90 |
| Ensemble | 35.9 | 47.5 | 220 | 1 | 35.7 | 42.8 | 260 | 1 | 8.8 | 18.1 | 279 | 1 |
| DyPlan-base (ours) | 36.1 | 47.6 | 42 | 0.76 | 37.8 | 46.0 | 28 | 0.48 | 10.1 | 19.8 | 65 | 0.98 |
| DyPlan-verify (ours) | 36.7 | 48.5 | 53 | 0.79 | 40.5 | 49.6 | 45 | 0.65 | 10.8 | 20.4 | 77 | 0.99 |
| ReAct | 20.5 | 27.5 | 255 | 3.91 | 27.9 | 32.3 | 226 | 3.01 | 4.4 | 8.5 | 290 | 5.10 |
| DRAGIN | 38.9 | 50.2 | 724 | 2.23 | 32.7 | 41.8 | 272 | 1.67 | 11.9 | 22.0 | 993 | 3.03 |
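
The rows above are easier to compare once each is expressed relative to a common baseline. The sketch below does that arithmetic for the HotpotQA columns. It assumes that the row marked `[ref]` is the intended comparison baseline, which the table suggests but does not state. The names `HOTPOTQA`, `relative_change`, and `compare_to_reference` are hypothetical helpers introduced for illustration. The numbers are copied from the table.

```python
# Values copied from the HotpotQA columns of the table above.
# Tuple order follows the table: (EM, F1, # T, # R).
HOTPOTQA = {
    "Fixed-sft Retrieval [ref]": (36.8, 48.6, 53, 1.0),
    "DyPlan-base (ours)": (36.1, 47.6, 42, 0.76),
    "DyPlan-verify (ours)": (36.7, 48.5, 53, 0.79),
}
METRICS = ("EM", "F1", "# T", "# R")


def relative_change(value, reference):
    """Return the percent change of `value` relative to `reference`."""
    return (value - reference) / reference * 100.0


def compare_to_reference(rows, reference_name):
    """Map each non-reference row to its per-metric percent change.

    Assumption: `reference_name` is the baseline row (here, the one
    marked [ref] in the table).
    """
    ref = rows[reference_name]
    return {
        name: {
            metric: relative_change(value, ref_value)
            for metric, value, ref_value in zip(METRICS, values, ref)
        }
        for name, values in rows.items()
        if name != reference_name
    }


if __name__ == "__main__":
    result = compare_to_reference(HOTPOTQA, "Fixed-sft Retrieval [ref]")
    for name, deltas in result.items():
        formatted = {metric: f"{delta:+.1f}%" for metric, delta in deltas.items()}
        print(name, formatted)
```

The same helper can be applied to the 2WikiMultihopQA and Musique columns by building an analogous dictionary from those columns.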