notesum.ai

Published on November 21

Looking Beyond Text: Reducing Language Bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance

cs.CV
cs.CL

Release Date: November 21, 2024

Authors: Haozhe Zhao1, Shuzheng Si2, Liang Chen1, Yichi Zhang1, Maosong Sun2, Mingjia Zhang3, Baobao Chang1

Affiliations: 1Peking University; 2Tsinghua University; 3University of Illinois Urbana-Champaign

arXiv: http://arxiv.org/abs/2411.14279v1