notesum.ai

Published on November 22

mR$^2$AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA

cs.AI
cs.CL

Release Date: November 22, 2024

Authors: Tao Zhang1, Ziqi Zhang2, Zongyang Ma1, Yuxin Chen3, Zhongang Qi4, Chunfeng Yuan2, Bing Li2, Junfu Pu3, Yuxuan Zhao5, Zehua Xie5, Jin Ma5, Ying Shan3, Weiming Hu6

Affiliations: 1State Key Laboratory of Multimodal Artificial Intelligence Systems, CASIA; PCG ARC Lab; School of Artificial Intelligence, University of Chinese Academy of Sciences; 2State Key Laboratory of Multimodal Artificial Intelligence Systems, CASIA; PeopleAI Inc; 3PCG ARC Lab; Tencent; 4Huawei Noah's Ark Lab; 5Tencent; 6State Key Laboratory of Multimodal Artificial Intelligence Systems, CASIA; School of Artificial Intelligence, University of Chinese Academy of Sciences; School of Information Science and Technology, ShanghaiTech University

arXiv: http://arxiv.org/abs/2411.15041v1