Published on November 15
Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs
cs.CV
cs.AI
Released Date: November 15, 2024
Authors: Xiaofeng Zhang1, Yihao Quan2, Chaochen Gu3, Chen Shen3, Xiaosong Yuan3, Shaotian Yan3, Hao Cheng1, Kaijie Wu1, Jieping Ye3
Aff.: 1Shanghai Jiao Tong University; 2Beijing Jiaotong University; 3Alibaba Group

| Method | POPE [24] F1 score | CHAIR [31] C_S | CHAIR [31] C_I | CHAIR [31] Recall | CHAIR [31] Avg. Len |
|---|---|---|---|---|---|
| Greedy Search | 85.7 | 47.0 | 13.8 | 76.6 | 94.2 |
| Beam Search | 84.9 | 51.0 | 15.2 | 75.2 | 102.2 |
| DoLa [8] | 80.2 | 57.0 | 15.2 | 78.2 | 97.5 |
| ITI [23] | 83.7 | 48.2 | 13.9 | 78.3 | 98.6 |
| VCD [20] | 83.2 | 51.0 | 14.9 | 77.2 | 101.9 |
| AGLA [2] | 84.6 | 43.0 | 14.1 | 78.9 | 98.8 |
| OPERA [17] | 85.2 | 47.0 | 14.6 | 78.5 | 95.3 |
| DOPRA [36] | 85.6 | 46.3 | 13.8 | 78.2 | 96.1 |
| HALC [7] | 83.9 | 50.2 | 12.4 | 78.4 | 97.2 |
| FastV [5] | 81.3 | 39.4 | 11.3 | 69.5 | 90.0 |
| Less is more [43] | 86.0 | 40.2 | 12.3 | 75.7 | 79.7 |
| CCA-LLaVA [38] | 85.5 | 43.0 | 11.5 | 80.4 | 96.6 |
| EAH | 85.7 | 36.4 | 9.9 | 74.9 | 97.7 |