A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs
Categories: cs.CL, cs.CV
Released Date: November 26, 2024
Authors: Lehan He¹, Zeren Chen², Zhelun Shi², Tianyu Yu³, Jing Shao¹, Lu Sheng²
Affiliations: ¹Shanghai AI Laboratory; ²School of Software, Beihang University; ³Tsinghua University

| Model | Feedback | Object-HalBench Resp. | Object-HalBench Ment. | MMHal-Bench Score | MMHal-Bench Hall. | AMBER Acc. | AMBER F1 | LLaVA-Bench Overall | MMStar Overall |
|---|---|---|---|---|---|---|---|---|---|
| LLaVA-RLHF-13B [35] | Human | 38.1 | 18.9 | 2.02 | 62.5 | 79.7 | 83.9 | 61.5 | 34.2 |
| RLHF-V-13B [45] | Human | 12.2 | 7.5 | 2.45 | 51.0 | 72.6 | 75.0 | 51.4 | 33.2 |
| Silkie-10B [14] | GPT-4V | 27.1 | 13.4 | 3.19 | 32.3 | 82.2 | 87.6 | 73.2 | 33.6 |
| POVID-7B [51] | Rule | 48.1 | 24.4 | 2.08 | 56.2 | 82.9 | 87.4 | 62.2 | 34.3 |
| MFPO-7B [10] | Rule | 10.6 | 5.1 | 2.89 | 45.0 | – | – | – | – |
| AMP-MEG-7B [48] | Rule | 37.8 | 22.5 | 3.17 | 35.0 | 78.3 | 83.6 | 54.6 | 27.5 |
| RLAIF-V-7B [46] | LLaVA-NeXT-34B | 8.5 | 4.3 | 3.06 | 29.2 | 76.8 | 84.5 | 64.9 | 31.8 |
| HSA-DPO-13B [41] | LLaVA-NeXT-34B | 5.3 | 3.2 | 2.61 | 48.0 | – | – | – | – |
| LLaVA-1.5-7B [20] | – | 53.6 | 25.2 | 2.36 | 51.0 | 73.5 | 77.6 | 59.7 | 30.3 |
| + TPO-7B | LLaVA-1.5-7B | 5.8 | 3.0 | 2.67 | 44.8 | 81.7 | 86.7 | 73.7 | 32.8 |
| + TPO-7B | LLaVA-NeXT-34B | 4.0 | 2.2 | 3.01 | 31.2 | 82.3 | 87.6 | 69.2 | 33.2 |
| + TPO-7B-LoRA | LLaVA-NeXT-34B | 4.7 | 2.8 | 2.68 | 42.7 | 80.0 | 85.9 | 70.7 | 32.9 |
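
To make the gap between the LLaVA-1.5-7B baseline and TPO-7B (LLaVA-NeXT-34B feedback) easier to read, the relative change can be worked out directly from the two table rows. The sketch below is only an illustration: the dictionary keys and the helper `relative_reduction` are hypothetical names, and it assumes the three selected columns are hallucination rates for which lower is better.

```python
# Values copied from the table rows for LLaVA-1.5-7B and
# "+ TPO-7B" with LLaVA-NeXT-34B feedback. Keys are illustrative labels.
baseline = {"ObjHal Resp.": 53.6, "ObjHal Ment.": 25.2, "MMHal Hall.": 51.0}
tpo = {"ObjHal Resp.": 4.0, "ObjHal Ment.": 2.2, "MMHal Hall.": 31.2}


def relative_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after` (positive means `after` is lower)."""
    return 100.0 * (before - after) / before


# Relative reduction per metric, assuming lower is better for all three.
reductions = {
    name: relative_reduction(baseline[name], tpo[name]) for name in baseline
}

for name, pct in reductions.items():
    print(f"{name}: {baseline[name]} -> {tpo[name]} ({pct:.1f}% lower)")
```

Under these assumptions, the Object-HalBench response-level rate falls by roughly 92.5% and the mention-level rate by roughly 91.3%, while the MMHal-Bench hallucination rate falls by roughly 38.8%.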