Published on November 12

Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders

Categories: cs.CL, cs.AI

Release Date: November 12, 2024

Authors: Xiaofeng Zhu¹, Jaya Krishna Mandivarapu²

Affiliations: ¹Microsoft Corporation / WA, USA; ²Microsoft Corporation / GA, USA

arXiv: http://arxiv.org/abs/2411.07870v1

M365
Models	ROUGE-L	METEOR	Groundedness	GPT-Similarity	BERTScore
TrustfulLLM + HC + Phi-3.5-mini-instruct	0.55	0.51	5.00	4.68	0.93
TrustfulLLM + Phi-3.5-mini-instruct	0.50	0.50	3.98	4.30	0.90
HC + Phi-3.5-mini-instruct	0.46	0.48	5.00	4.52	0.91
RAG + Phi-3.5-mini-instruct	0.41	0.45	3.72	3.49	0.89
RAG + Mistral-NeMo-Minitron-8B-Instruct	0.38	0.46	3.77	3.76	0.88
RAG + Llama-3.1-8B-Instruct	0.40	0.46	3.74	3.34	0.89
RAG + GPT-3.5 Turbo	0.45	0.48	3.81	3.58	0.90
RAG + GPT-4o	0.42	0.48	3.77	3.52	0.91
Phi-3.5-mini-instruct	0.17	0.26	3.33	3.60	0.84
Mistral-NeMo-Minitron-8B-Instruct	0.16	0.24	3.50	4.05	0.82
Llama-3.1-8B-Instruct	0.19	0.26	3.44	3.82	0.84
GPT-3.5 Turbo	0.23	0.31	3.70	4.10	0.85
GPT-4o	0.16	0.25	3.64	3.97	0.83
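
To make the ROUGE-L column easier to read, here is a minimal sketch of how a ROUGE-L F1 score is commonly computed from the longest common subsequence (LCS) between a reference answer and a generated answer. The source does not say which implementation, tokenizer, or F-measure weighting produced the numbers above, so the lowercase whitespace tokenization, the balanced F1, and the function names `lcs_length` and `rouge_l_f1` are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the subsequence that ended just before both tokens.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the best result seen so far.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 with lowercase whitespace tokenization (an assumption)."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge_l_f1("the cat sat on the mat", "the cat lay on the mat")
    print(round(score, 4))
```

In this toy pair, five of the six tokens on each side fall in the common subsequence, so precision and recall are both 5/6. Scores in the table would be averages of such per-example values over the evaluation set, assuming a comparable definition was used.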