notesum.ai

Published on November 29

T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs

Categories: cs.CV, cs.CL, cs.LG

Release Date: November 29, 2024

Authors: Shukang Yin¹, Chaoyou Fu², Sirui Zhao¹, Yunhang Shen³, Chunjiang Ge⁴, Yan Yang², Zuwei Long³, Yuhan Dai¹, Tong Xu¹, Xing Sun³, Ran He⁵, Caifeng Shan², Enhong Chen¹

Affiliations: ¹USTC; ²NJU; ³Tencent YouTu Lab; ⁴THU; ⁵CAS

arXiv: http://arxiv.org/pdf/2411.19951v1