notesum.ai

Published on November 26

DOGE: Towards Versatile Visual Document Grounding and Referring

cs.CV
cs.AI

Release Date: November 26, 2024

Authors: Yinan Zhou¹, Yuxin Chen², Haokun Lin³, Shuyu Yang¹, Li Zhu¹, Zhongang Qi², Chen Ma⁴, Ying Shan²

Affiliations: ¹Xi'an Jiaotong University; ²ARC Lab, Tencent PCG; ³Institute of Automation, CAS; ⁴City University of Hong Kong

arXiv: http://arxiv.org/abs/2411.17125v1