Published on November 13
One STEP at a time: Language Agents are Stepwise Planners
Categories: cs.CL, cs.AI, cs.LG
Release Date: November 13, 2024
Authors: Minh Nguyen¹, Ehsan Shareghi¹
Affiliation: ¹Department of Data Science & AI, Monash University

Column groups: RL Agents (DRRN, KGA2C, CALM); Generative Language Agents (SayCan, ReAct, Reflexion, CLIN*, STEP).

| Task | Type | DRRN | KGA2C | CALM | SayCan | ReAct | Reflexion | CLIN* | STEP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | S | 6.6 | 6.0 | 1.0 | 26.4 | 7.2 | 5.9 | 9.0 | 63.0 |
| | S | 5.5 | 11.0 | 1.0 | 8.0 | 6.1 | 28.6 | 83.3 | 62.7 |
| | S | 15.0 | 18.0 | 10.0 | 22.9 | 26.7 | 64.9 | 100.0 | 100.0 |
| | S | 21.7 | 16.0 | 10.0 | 20.9 | 53.3 | 16.4 | 69.3 | 100.0 |
| | S | 15.8 | 17.0 | 3.0 | 47.8 | 51.0 | 70.4 | 55.3 | 61.0 |
| | S | 26.7 | 19.0 | 6.0 | 39.3 | 58.9 | 70.7 | 100.0 | 100.0 |
| | S | 50.0 | 43.0 | 6.0 | 80.0 | 60.0 | 100.0 | 100.0 | 100.0 |
| | S | 50.0 | 32.0 | 10.0 | 67.5 | 67.5 | 84.4 | 100.0 | 100.0 |
| | S | 8.0 | 10.0 | 0.0 | 16.0 | 8.0 | 8.0 | 28.0 | 32.2 |
| Boil | L | 3.5 | 0.0 | 0.0 | 33.1 | 3.5 | 4.2 | 4.0 | 21.5 |
| Freeze | L | 0.0 | 4.0 | 0.0 | 3.9 | 7.8 | 7.8 | 32.3 | 50.9 |
| GrowPlant | L | 8.0 | 6.0 | 2.0 | 9.9 | 9.1 | 7.3 | 30.3 | 71.5 |
| GrowFruit | L | 14.3 | 11.0 | 4.0 | 13.9 | 18.6 | 13.0 | 19.3 | 14.0 |
| | L | 21.0 | 5.0 | 4.0 | 20.9 | 27.7 | 2.6 | 59.3 | 46.5 |
| Force | L | 10.0 | 4.0 | 0.0 | 21.9 | 40.5 | 50.6 | 73.3 | 80.0 |
| Friction | L | 10.0 | 4.0 | 3.0 | 32.3 | 44.0 | 100.0 | 56.7 | 73.3 |
| | L | 16.8 | 11.0 | 2.0 | 67.5 | 25.7 | 50.9 | 69.8 | 84.2 |
| | L | 17.0 | 11.0 | 2.0 | 59.5 | 16.8 | 23.7 | 39.0 | 51.8 |
| | S | 22.1 | 19.1 | 5.2 | 36.5 | 37.6 | 49.9 | 71.7 | 79.9 |
| | L | 11.2 | 6.2 | 1.9 | 29.2 | 21.5 | 29.2 | 42.7 | 54.9 |
| | All | 16.7 | 12.7 | 3.6 | 32.9 | 29.6 | 39.4 | 57.2 | 67.4 |
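
The last three rows carry no task name in this copy of the table. They appear to be per-type averages (S, L, and all tasks), which is an inference from the numbers and not something the source states. A minimal sketch that checks this reading for the STEP column follows; the variable names are illustrative, and the scores are copied from the rows above.

```python
# STEP column scores copied from the table, grouped by the Type column.
# Assumption: the final three rows are plain unweighted means over tasks.
step_short = [63.0, 62.7, 100.0, 100.0, 61.0, 100.0, 100.0, 100.0, 32.2]
step_long = [21.5, 50.9, 71.5, 14.0, 46.5, 80.0, 73.3, 84.2, 51.8]


def mean(scores):
    """Unweighted arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


avg_short = round(mean(step_short), 1)
avg_long = round(mean(step_long), 1)
avg_all = round(mean(step_short + step_long), 1)

print(avg_short, avg_long, avg_all)  # prints: 79.9 54.9 67.4
```

The three printed values match the STEP entries in the final S, L, and All rows, which is consistent with those rows being averages over the nine S tasks, the nine L tasks, and all eighteen tasks.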