notesum.ai

Published on December 6

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

cs.CL
cs.CV

Released Date: December 6, 2024

Authors: Jarvis Guo¹, Tuney Zheng¹, Yuelin Bai¹, Bo Li², Yubo Wang³, King Zhu¹, Yizhi Li⁴, Graham Neubig⁵, Wenhu Chen³, Xiang Yue⁵

Affiliations: ¹M-A-P; ²Nanyang Technological University; ³University of Waterloo; ⁴The University of Manchester; ⁵Carnegie Mellon University

arXiv: http://arxiv.org/pdf/2412.05237v1