notesum.ai

Published on November 15

Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination

cs.CV
cs.AI

Release Date: November 15, 2024

Authors: Haojie Zheng¹, Tianyang Xu², Hanchi Sun³, Shu Pu⁴, Ruoxi Chen³, Lichao Sun³

Affiliations: ¹University of Pennsylvania; ²Columbia University; ³Lehigh University; ⁴Independent Researcher

arXiv: http://arxiv.org/abs/2411.12591v1