CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning
cs.AI
Released Date: November 25, 2024
Authors: Duo Wu, Jinghe Wang, Yuan Meng, Yanning Zhang, Le Sun, Zhi Wang
Affiliation: Tsinghua University (all authors)
![[Uncaptioned image]](https://arxiv.org/html/2411.16313v1/x1.png)
| Metrics | GPT-3.5 Zero-Shot | GPT-3.5 Few-Shot | GPT-3.5 HuggingGPT | GPT-4 Zero-Shot | GPT-4 Few-Shot | GPT-4 HuggingGPT | Llama2-7B IFT | Llama2-7B RLTF | Llama2-7B CATP-LLM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **Sequential Planning** | | | | | | | | | |
| Task Performance | 0.6758 | 0.6988 | 0.6645 | 0.6856 | 0.6982 | 0.6661 | 0.6482 | 0.6938 | 0.6611 |
| Costs: Exec. Prices ($) | 0.1638 | 0.1441 | 0.1199 | 0.1607 | 0.1463 | 0.1240 | 0.1838 | 0.1746 | 0.0794 |
| Costs: Exec. Time (s) | 2.469 | 2.116 | 1.759 | 2.356 | 2.166 | 1.847 | 2.812 | 2.514 | 1.038 |
| Quality of Plan (QoP) | 0.1935 | 0.2224 | 0.2265 | 0.2011 | 0.2201 | 0.2237 | 0.1620 | 0.1929 | 0.2605 |
| % of Valid Plans | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| **Non-Sequential Planning** | | | | | | | | | |
| Task Performance | 0.2927 | 0.2630 | 0.2978 | 0.4389 | 0.4408 | 0.4346 | 0.4031 | 0.4270 | 0.5659 |
| Costs: Exec. Prices ($) | 0.1440 | 0.1231 | 0.1157 | 0.2233 | 0.1628 | 0.1607 | 0.1963 | 0.2011 | 0.1211 |
| Costs: Exec. Time (s) | 2.216 | 1.916 | 1.447 | 3.333 | 2.187 | 2.204 | 2.279 | 2.169 | 0.7221 |
| Quality of Plan (QoP) | 0.0193 | 0.0229 | 0.0469 | 0.0225 | 0.0768 | 0.0756 | 0.0284 | 0.0361 | 0.1762 |
| % of Valid Plans | 41.67% | 75% | 66.67% | 100% | 100% | 100% | 100% | 100% | 100% |
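
To compare columns, it can help to express one method's numbers relative to a baseline. The following is a minimal sketch, not part of the paper: it transcribes the non-sequential planning rows from the table and computes the fractional change between two methods. The names `METHODS`, `NON_SEQ`, and `relative_change` are hypothetical, and grouping IFT, RLTF, and CATP-LLM under Llama2-7B is an assumption read from the table header.

```python
# Values transcribed from the non-sequential planning rows of the table.
# The column labels below are an assumed reading of the two header rows.
METHODS = [
    "GPT-3.5 Zero-Shot", "GPT-3.5 Few-Shot", "GPT-3.5 HuggingGPT",
    "GPT-4 Zero-Shot", "GPT-4 Few-Shot", "GPT-4 HuggingGPT",
    "Llama2-7B IFT", "Llama2-7B RLTF", "Llama2-7B CATP-LLM",
]

NON_SEQ = {
    "task_performance": [0.2927, 0.2630, 0.2978, 0.4389, 0.4408,
                         0.4346, 0.4031, 0.4270, 0.5659],
    "exec_price_usd": [0.1440, 0.1231, 0.1157, 0.2233, 0.1628,
                       0.1607, 0.1963, 0.2011, 0.1211],
    "exec_time_s": [2.216, 1.916, 1.447, 3.333, 2.187,
                    2.204, 2.279, 2.169, 0.7221],
    "qop": [0.0193, 0.0229, 0.0469, 0.0225, 0.0768,
            0.0756, 0.0284, 0.0361, 0.1762],
}


def relative_change(metric: str, method: str, baseline: str) -> float:
    """Return (method - baseline) / baseline for one metric, as a fraction."""
    values = dict(zip(METHODS, NON_SEQ[metric]))
    return (values[method] - values[baseline]) / values[baseline]


if __name__ == "__main__":
    # Print each metric for CATP-LLM relative to the GPT-4 few-shot baseline.
    for metric in NON_SEQ:
        change = relative_change(metric, "Llama2-7B CATP-LLM", "GPT-4 Few-Shot")
        print(f"{metric}: {change:+.1%}")
```

A negative value for the cost metrics means the method is cheaper or faster than the baseline; a positive value for task performance or QoP means it scores higher. The choice of GPT-4 Few-Shot as the baseline is arbitrary and can be swapped for any other column label.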