Reducing Tool Hallucination via Reliability Alignment
cs.CL
Released Date: December 5, 2024
Authors: Hongshen Xu¹, Su Zhu², Zihan Wang¹, Hang Zheng¹, Da Ma¹, Ruisheng Cao¹, Shuai Fan², Lu Chen¹, Kai Yu¹
Affiliations: ¹Shanghai Jiao Tong University; ²AISpeech Co., Ltd.

| Method | I1 Instruction (Utility) | I1 Instruction (Ratio) | I1 Instruction Missing Parameter (Utility) | I1 Instruction Missing Parameter (Ratio) | I1 Instruction Unmatched Tools (Utility) | I1 Instruction Unmatched Tools (Ratio) | Overall (Utility) | Overall (Ratio) |
|---|---|---|---|---|---|---|---|---|
| gpt-3.5-turbo-16k | 2.8 | 30 | 2.4 | 38.5 | 11.8 | 60.7 | 6.2 | 44 |
| Toolllama3.1 | 3.7 | 31.6 | 5.1 | 44.6 | 11.7 | 61.5 | 7.0 | 46.3 |
| + Reliable SFT | 4.4 | 33.8 | 4.2 | 43.2 | 20.0 | 99.7 | 10.3 | 58.4 |
| + Reliable DPO | 6.6 | 41.4 | 6.3 | 50.8 | 15.6 | 78.8 | 10.0 | 58.2 |
| + Reliable SFT & DPO | 4.0 | 32.7 | 7.1 | 51.5 | 20.0 | 99.6 | 10.8 | 60.3 |
| + Reliable SFT DPO | 5.5 | 33.7 | 11.2 | 66.6 | 20.0 | 99.9 | 12.4 | 66.3 |