Published on November 26
Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey
Categories: cs.CL, cs.CV
Released Date: November 26, 2024
Authors: Jiayi Kuang¹, Jingyou Xie¹, Haohao Luo¹, Ronghao Li¹, Zhe Xu¹, Xianfeng Cheng¹, Yinghui Li², Xika Lin³, Ying Shen¹
Aff.: ¹Sun Yat-sen University, China; ²Tsinghua University, China; ³Department of Computer Science, Worcester Polytechnic Institute, USA

| First comprehensive overview of the field |
| Definition and classification of the task |
| In-depth analysis of the question/answer pairs |