notesum.ai

Published on November 27

Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents

cs.CV

Release Date: November 27, 2024

Authors: Joongwon Chae¹, Zhenyu Wang¹, Peiwu Qin¹

Affiliation: ¹Institute of Biopharmaceutical and Health Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong, China

arXiv: http://arxiv.org/abs/2411.18270v1