Published on November 12
Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders
cs.CL
cs.AI
Released Date: November 12, 2024
Authors: Xiaofeng Zhu¹, Jaya Krishna Mandivarapu²
Aff.: ¹Microsoft Corporation / WA, USA; ²Microsoft Corporation / GA, USA

Evaluation results on M365:

| Models | ROUGE-L | METEOR | Groundedness | GPT-Similarity | BERTScore |
|---|---|---|---|---|---|
| TrustfulLLM + HC + Phi-3.5-mini-instruct | 0.55 | 0.51 | 5.00 | 4.68 | 0.93 |
| TrustfulLLM + Phi-3.5-mini-instruct | 0.50 | 0.50 | 3.98 | 4.30 | 0.90 |
| HC + Phi-3.5-mini-instruct | 0.46 | 0.48 | 5.00 | 4.52 | 0.91 |
| RAG + Phi-3.5-mini-instruct | 0.41 | 0.45 | 3.72 | 3.49 | 0.89 |
| RAG + Mistral-NeMo-Minitron-8B-Instruct | 0.38 | 0.46 | 3.77 | 3.76 | 0.88 |
| RAG + Llama-3.1-8B-Instruct | 0.40 | 0.46 | 3.74 | 3.34 | 0.89 |
| RAG + GPT-3.5 Turbo | 0.45 | 0.48 | 3.81 | 3.58 | 0.90 |
| RAG + GPT-4o | 0.42 | 0.48 | 3.77 | 3.52 | 0.91 |
| Phi-3.5-mini-instruct | 0.17 | 0.26 | 3.33 | 3.60 | 0.84 |
| Mistral-NeMo-Minitron-8B-Instruct | 0.16 | 0.24 | 3.50 | 4.05 | 0.82 |
| Llama-3.1-8B-Instruct | 0.19 | 0.26 | 3.44 | 3.82 | 0.84 |
| GPT-3.5 Turbo | 0.23 | 0.31 | 3.70 | 4.10 | 0.85 |
| GPT-4o | 0.16 | 0.25 | 3.64 | 3.97 | 0.83 |
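
For readers unfamiliar with the ROUGE-L column, the following is a minimal sketch of how a sentence-level ROUGE-L score can be computed from the longest common subsequence (LCS) of reference and candidate tokens. It is an illustration only, not the evaluation code behind the table: the function names `lcs_length` and `rouge_l_f1` are hypothetical, the lowercase whitespace tokenization is an assumption, and the balanced F1 weighting is one common variant that may differ from the weighting used for the reported numbers.

```python
def lcs_length(a, b):
    """Return the length of the longest common subsequence of two token lists."""
    # Rolling-row dynamic programming: prev[j] holds the LCS length of the
    # tokens of `a` processed so far against the first j tokens of `b`.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """Sentence-level ROUGE-L as a balanced F1 score (assumed variant)."""
    # Assumption: lowercase + whitespace tokenization, no stemming.
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens on the LCS
    recall = lcs / len(ref)      # share of reference tokens on the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical example strings, not taken from the evaluation data.
    print(rouge_l_f1("open the file menu and select save",
                     "select save from the file menu"))
```

Because the score rewards in-order token overlap with a reference answer, a higher ROUGE-L in the table indicates responses whose wording follows the reference more closely; it does not by itself measure groundedness, which the table reports as a separate column.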