Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models
cs.CV
Released Date: December 6, 2024
Authors: Zehao Wang¹, Xinpeng Liu, Xiaoqian Wu, Yudonglin Zhang, Zhou Fang, Yifan Fang, Junfu Pu, Cewu Lu, Yong-Lu Li
Aff.: ¹Shanghai Jiao Tong University

| Model | YN acc (w/o Pepper Salt) | YN prec (w/o Pepper Salt) | YN recall (w/o Pepper Salt) | YN acc (w/ Pepper Salt) | YN prec (w/ Pepper Salt) | YN recall (w/ Pepper Salt) | YN Err. Cons. | MC acc (w/o Pepper Salt) | MC acc (w/ Pepper Salt) | MC Err. Cons. |
|---|---|---|---|---|---|---|---|---|---|---|
| Qwen2-VL-7B | 74.69 | 57.43 | 95.87 | 63.20 | 47.68 | 95.57 | 56.86 | 65.31 | 51.94 | 48.21 |
| MiniCPM-Llama3-V2.5 | 79.14 | 63.33 | 90.33 | 67.40 | 51.25 | 64.79 | 26.12 | 60.77 | 40.50 | 37.20 |
| Qwen-VL-Chat | 79.24 | 65.09 | 82.68 | 66.64 | 50.28 | 80.43 | 38.47 | 54.57 | 33.98 | 43.38 |
| LLaVA V1.5 | 58.21 | 44.49 | 97.49 | 51.29 | 40.66 | 97.30 | 73.85 | 51.00 | 49.97 | 68.37 |
| InstructBLIP | 73.82 | 57.25 | 87.79 | 71.04 | 54.35 | 87.40 | 74.16 | 6.25 | 6.34 | 82.33 |
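
The YN columns above report accuracy, precision, and recall for yes/no questions. As a minimal sketch of how such numbers are conventionally computed, assuming "yes" is the positive class and that answers have already been normalized to the strings "yes" and "no", the snippet below derives the three metrics from a toy set of predictions. The function name `yes_no_metrics` and the toy data are hypothetical and are not taken from the paper, whose exact evaluation protocol is not shown here.

```python
from typing import Dict, Sequence


def yes_no_metrics(preds: Sequence[str], labels: Sequence[str]) -> Dict[str, float]:
    """Return accuracy, precision, and recall in percent, with "yes" as the positive class."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    pairs = list(zip(preds, labels))
    tp = sum(p == "yes" and y == "yes" for p, y in pairs)  # true positives
    fp = sum(p == "yes" and y == "no" for p, y in pairs)   # false positives
    fn = sum(p == "no" and y == "yes" for p, y in pairs)   # false negatives
    correct = sum(p == y for p, y in pairs)

    return {
        "acc": 100.0 * correct / len(pairs),
        # Guard against division by zero when the model never answers "yes"
        # or when there are no positive labels.
        "prec": 100.0 * tp / (tp + fp) if tp + fp else 0.0,
        "recall": 100.0 * tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy data, made up for illustration only (not from the benchmark).
labels = ["yes", "no", "no", "yes", "no"]
preds = ["yes", "yes", "no", "yes", "yes"]
metrics = yes_no_metrics(preds, labels)
print(metrics)
```

In this toy case the model answers "yes" to four of the five questions, so recall is perfect while precision is only 50%. Under these standard definitions, the same pattern of high recall with much lower precision (for example, the LLaVA V1.5 row) is what one would expect from a model that answers "yes" too readily.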