Epistemic Integrity in Large Language Models
Subjects: cs.CL, cs.AI, cs.HC
Release Date: November 10, 2024
Authors: Bijean Ghafouri1, Shahrad Mohammadzadeh2, James Zhou3, Pratheeksha Nair2, Jacob-Junqi Tian4, Mayank Goel5, Reihaneh Rabbany2, Jean-François Godbout6, Kellin Pelrine2
Affiliations: 1University of Southern California; 2McGill University; 3UC Berkeley; 4Vector Institute; 5IIIT Hyderabad; 6Université de Montréal

| Model | Anthropic | Pei | Llama3-8b | GM | CMV |
|---|---|---|---|---|---|
| Base Pei | 1.91 | 0.83 | 1.56 | 1.92 | 2.31 |
| Fine-tuned Pei | 2.6 | 2.08 | 1.29 | 1.54 | 4.26 |
| Fine-tuned Llama-3.2-1B-Instruct | 1.85 | 2.14 | 2.05 | 2.06 | 1.79 |
| Prompted GPT | 1.07 | 1.42 | 1.90 | 1.16 | 0.75 |
| Fine-tuned GPT | 1.04 | 1.24 | 1.36 | 0.99 | 1.16 |
| Fine-tuned GPT (Rounded) | 0.99 | 1.05 | 1.42 | 0.98 | 0.94 |