TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
cs.CV
cs.AI
cs.CL
Released Date: October 30, 2024
Authors: Ziyao Shangguan¹, Chuhan Li¹, Yuxuan Ding¹, Yanan Zheng¹, Yilun Zhao¹, Tesca Fitzgerald¹, Arman Cohan²
Affiliations: ¹Yale University; ²Allen Institute for AI