V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization
Subjects: cs.CV, cs.AI
Released Date: November 5, 2024
Authors: Yuxi Xie, Guanzhen Li, Xiao Xu, Min-Yen Kan
Affiliation: National University of Singapore

| Approach | F1R | F1P | F1A | F1 | Yes Ratio |
|---|---|---|---|---|---|
| SFT | | | | | |
| HA-DPO | | | | | |
| Synthetic Augmented Data | | | | | |
| DPO | | | | | |
| V-DPO | ↑0.94 | | | | |
| RLHF-V | | | | | |
| DPO | | | | | |
| V-DPO | ↑1.24 | | | | |
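The table above compares V-DPO against SFT, DPO, and related preference-optimization baselines. As background only, the vanilla DPO objective that V-DPO extends can be sketched as below; this is a minimal scalar illustration of the standard DPO loss (Rafailov et al.), not the paper's vision-guided variant, and the function name, inputs, and `beta` value are assumptions for illustration.

```python
import math


def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is a sequence log-probability: the policy's log-prob of the
    chosen/rejected response, and the frozen reference model's log-probs of
    the same responses. beta controls deviation from the reference policy.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response than the reference model does, minus the same for the rejected.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Loss is -log sigmoid(margin): minimized as the policy widens the gap
    # between chosen and rejected relative to the reference.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference, the margin is zero and the loss equals log 2; increasing the chosen response's log-probability lowers the loss. V-DPO augments this preference signal with vision-guided supervision to target hallucination, but those details are in the paper, not this sketch.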