notesum.ai

Published on November 25

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

cs.AI

Release Date: November 25, 2024

Authors: Duo Wu¹, Jinghe Wang¹, Yuan Meng¹, Yanning Zhang¹, Le Sun¹, Zhi Wang¹

Aff.: ¹Tsinghua University

arXiv: http://arxiv.org/abs/2411.16313v1

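Sequential Planning

| Metric | GPT-3.5 Zero-Shot | GPT-3.5 Few-Shot | GPT-3.5 HuggingGPT | GPT-4 Zero-Shot | GPT-4 Few-Shot | GPT-4 HuggingGPT | Llama2-7B IFT | Llama2-7B RLTF | Llama2-7B CATP-LLM |
|---|---|---|---|---|---|---|---|---|---|
| Task Performance $\uparrow$ | 0.6758 | 0.6988 | 0.6645 | 0.6856 | 0.6982 | 0.6661 | 0.6482 | 0.6938 | 0.6611 |
| Cost: Exec. Price (USD) $\downarrow$ | 0.1638 | 0.1441 | 0.1199 | 0.1607 | 0.1463 | 0.1240 | 0.1838 | 0.1746 | 0.0794 |
| Cost: Exec. Time (s) $\downarrow$ | 2.469 | 2.116 | 1.759 | 2.356 | 2.166 | 1.847 | 2.812 | 2.514 | 1.038 |
| Quality of Plan (QoP) $\uparrow$ | 0.1935 | 0.2224 | 0.2265 | 0.2011 | 0.2201 | 0.2237 | 0.1620 | 0.1929 | 0.2605 |
| % of Valid Plans $\uparrow$ | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |

Non-Sequential Planning

| Metric | GPT-3.5 Zero-Shot | GPT-3.5 Few-Shot | GPT-3.5 HuggingGPT | GPT-4 Zero-Shot | GPT-4 Few-Shot | GPT-4 HuggingGPT | Llama2-7B IFT | Llama2-7B RLTF | Llama2-7B CATP-LLM |
|---|---|---|---|---|---|---|---|---|---|
| Task Performance $\uparrow$ | 0.2927 | 0.2630 | 0.2978 | 0.4389 | 0.4408 | 0.4346 | 0.4031 | 0.4270 | 0.5659 |
| Cost: Exec. Price (USD) $\downarrow$ | 0.1440 | 0.1231 | 0.1157 | 0.2233 | 0.1628 | 0.1607 | 0.1963 | 0.2011 | 0.1211 |
| Cost: Exec. Time (s) $\downarrow$ | 2.216 | 1.916 | 1.447 | 3.333 | 2.187 | 2.204 | 2.279 | 2.169 | 0.7221 |
| Quality of Plan (QoP) $\uparrow$ | 0.0193 | 0.0229 | 0.0469 | 0.0225 | 0.0768 | 0.0756 | 0.0284 | 0.0361 | 0.1762 |
| % of Valid Plans $\uparrow$ | 41.67% | 75% | 66.67% | 100% | 100% | 100% | 100% | 100% | 100% |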
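
The cost rows of the table can be compared directly with a little arithmetic. The sketch below is only an illustration and is not taken from the paper: it copies a few non-sequential planning figures from the table and computes how much lower the CATP-LLM execution price and time are relative to each listed baseline. The names `NON_SEQUENTIAL_COSTS`, `relative_reduction`, and `compare_to` are hypothetical helpers introduced here.

```python
# Non-sequential planning figures copied from the table above.
# Each value is (execution price in USD, execution time in seconds).
NON_SEQUENTIAL_COSTS = {
    "GPT-3.5 HuggingGPT": (0.1157, 1.447),
    "GPT-4 HuggingGPT": (0.1607, 2.204),
    "Llama2-7B RLTF": (0.2011, 2.169),
    "Llama2-7B CATP-LLM": (0.1211, 0.7221),
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return the fraction by which `value` is lower than `baseline`.

    A negative result means `value` is higher than `baseline`.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - value) / baseline


def compare_to(reference: str, costs: dict) -> dict:
    """Map each other method to (price reduction, time reduction)
    achieved by `reference` relative to that method."""
    ref_price, ref_time = costs[reference]
    return {
        name: (
            relative_reduction(price, ref_price),
            relative_reduction(time, ref_time),
        )
        for name, (price, time) in costs.items()
        if name != reference
    }


if __name__ == "__main__":
    results = compare_to("Llama2-7B CATP-LLM", NON_SEQUENTIAL_COSTS)
    for name, (price_cut, time_cut) in results.items():
        print(f"{name}: price {price_cut:+.1%}, time {time_cut:+.1%}")
```

A negative price figure, as for GPT-3.5 HuggingGPT, indicates that the baseline is cheaper than CATP-LLM on that metric. These ratios cover cost only and should be read alongside the task performance and valid-plan rows.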