notesum.ai

Published on November 9

An Empirical Analysis on Spatial Reasoning Capabilities of Large Multimodal Models

cs.CV
cs.AI

Release Date: November 9, 2024

Authors: Fatemeh Shiri (1), Xiao-Yu Guo (2), Mona Golestan Far (1), Xin Yu (3), Gholamreza Haffari (1), Yuan-Fang Li (1)

Affiliations: (1) Department of Data Science & AI, Monash University; (2) Australian Institute for Machine Learning, University of Adelaide; (3) School of Electrical Engineering and Computer Science, University of Queensland

arXiv: http://arxiv.org/abs/2411.06048v1