PTR: Precision-Driven Tool Recommendation for Large Language Models
cs.CL
cs.AI
Released Date: November 14, 2024
Authors: Hang Gao¹, Yongfeng Zhang¹
Aff.: ¹Rutgers University, New Brunswick, NJ, USA
| Methods | Framework | ToolLens Recall@K | ToolLens NDCG@K | ToolLens TRACC | MetaTool Recall@K | MetaTool NDCG@K | MetaTool TRACC | RecTools Recall@K | RecTools NDCG@K | RecTools TRACC |
|---|---|---|---|---|---|---|---|---|---|---|
| Random | N/A | 0.036 | 0.061 | 0.034 | 0.133 | 0.202 | 0.133 | 0.137 | 0.271 | 0.097 |
| | +PTR+open-mistral-7b | 0.185 | 0.225 | 0.145 | 0.608 | 0.785 | 0.505 | 0.457 | 0.756 | 0.235 |
| | +PTR+GPT-3.5-turbo | 0.213 | 0.282 | 0.172 | 0.645 | 0.823 | 0.543 | 0.475 | 0.784 | 0.288 |
| | +PTR+GPT-4o | 0.227 | 0.303 | 0.187 | 0.663 | 0.843 | 0.562 | 0.492 | 0.802 | 0.305 |
| BM25 | N/A | 0.131 | 0.194 | 0.125 | 0.429 | 0.603 | 0.429 | 0.486 | 0.596 | 0.382 |
| | +PTR+open-mistral-7b | 0.206 | 0.254 | 0.162 | 0.659 | 0.834 | 0.554 | 0.524 | 0.795 | 0.355 |
| | +PTR+GPT-3.5-turbo | 0.247 | 0.313 | 0.193 | 0.694 | 0.874 | 0.593 | 0.541 | 0.815 | 0.408 |
| | +PTR+GPT-4o | 0.261 | 0.331 | 0.208 | 0.712 | 0.892 | 0.612 | 0.545 | 0.810 | 0.414 |
| Contriever | N/A | 0.130 | 0.190 | 0.121 | 0.439 | 0.672 | 0.439 | 0.367 | 0.786 | 0.304 |
| | +PTR+open-mistral-7b | 0.208 | 0.256 | 0.164 | 0.662 | 0.837 | 0.557 | 0.512 | 0.773 | 0.342 |
| | +PTR+GPT-3.5-turbo | 0.250 | 0.316 | 0.196 | 0.697 | 0.877 | 0.596 | 0.528 | 0.792 | 0.396 |
| | +PTR+GPT-4o | 0.264 | 0.334 | 0.211 | 0.715 | 0.895 | 0.615 | 0.559 | 0.834 | 0.426 |
| SBERT | N/A | 0.251 | 0.349 | 0.209 | 0.495 | 0.725 | 0.495 | 0.496 | 0.772 | 0.434 |
| | +PTR+open-mistral-7b | 0.272 | 0.362 | 0.226 | 0.682 | 0.862 | 0.582 | 0.538 | 0.821 | 0.452 |
| | +PTR+GPT-3.5-turbo | 0.308 | 0.403 | 0.252 | 0.723 | 0.902 | 0.623 | 0.555 | 0.840 | 0.484 |
| | +PTR+GPT-4o | 0.322 | 0.422 | 0.268 | 0.741 | 0.921 | 0.642 | 0.572 | 0.859 | 0.501 |
| TAS-B | N/A | 0.279 | 0.381 | 0.263 | 0.657 | 0.897 | 0.657 | 0.509 | 0.841 | 0.454 |
| | +PTR+open-mistral-7b | 0.298 | 0.398 | 0.278 | 0.702 | 0.882 | 0.602 | 0.552 | 0.854 | 0.472 |
| | +PTR+GPT-3.5-turbo | 0.335 | 0.438 | 0.305 | 0.741 | 0.922 | 0.642 | 0.567 | 0.872 | 0.505 |
| | +PTR+GPT-4o | 0.352 | 0.456 | 0.321 | 0.759 | 0.941 | 0.661 | 0.583 | 0.890 | 0.522 |
| SimCSE | N/A | 0.293 | 0.386 | 0.279 | 0.675 | 0.849 | 0.675 | 0.563 | 0.808 | 0.523 |
| | +PTR+open-mistral-7b | 0.312 | 0.407 | 0.291 | 0.716 | 0.897 | 0.631 | 0.578 | 0.861 | 0.542 |
| | +PTR+GPT-3.5-turbo | 0.350 | 0.448 | 0.319 | 0.756 | 0.937 | 0.671 | 0.594 | 0.879 | 0.575 |
| | +PTR+GPT-4o | | | | | | | | | |
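
For readers unfamiliar with the ranking metrics in the table, the following is a minimal sketch of how Recall@K and NDCG@K are commonly computed for a single query, assuming binary relevance (a tool is either in the ground-truth set or not). The function names and the example tool names are hypothetical, and the paper's exact evaluation protocol may differ. TRACC is specific to the paper and is not sketched here.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of ground-truth tools found in the top-k recommendations.

    Assumes `ranked` contains no duplicate tool names.
    """
    if not relevant:
        return 0.0
    hits = sum(1 for tool in ranked[:k] if tool in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    # Each relevant tool at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, tool in enumerate(ranked[:k])
        if tool in relevant
    )
    # The ideal ranking places all relevant tools first (at most k of them).
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: four recommended tools, two of which are correct.
ranked_tools = ["search", "calendar", "weather", "maps"]
ground_truth = {"search", "weather"}

print(recall_at_k(ranked_tools, ground_truth, k=3))
print(ndcg_at_k(ranked_tools, ground_truth, k=3))
```

Dataset-level scores such as those in the table are typically the mean of these per-query values over all test queries.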