notesum.ai

Published on November 26

Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey

cs.CL
cs.CV

Release Date: November 26, 2024

Authors: Jiayi Kuang¹, Jingyou Xie¹, Haohao Luo¹, Ronghao Li¹, Zhe Xu¹, Xianfeng Cheng¹, Yinghui Li², Xika Lin³, Ying Shen¹

Affiliations: ¹Sun Yat-sen University, China; ²Tsinghua University, China; ³Department of Computer Science, Worcester Polytechnic Institute, USA

arXiv: http://arxiv.org/abs/2411.17558v1