notesum.ai

Published on November 25

SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context

cs.CV

Released Date: November 25, 2024

Authors: Jungang Li¹, Sicheng Tao¹, Yibo Yan¹, Xiaojie Gu¹, Haodong Xu¹, Xu Zheng², Yuanhuiyi Lyu², Linfeng Zhang³, Xuming Hu²

Affiliations: ¹ The Hong Kong University of Science and Technology (Guangzhou); ² The Hong Kong University of Science and Technology; ³ Shanghai Jiao Tong University

arXiv: http://arxiv.org/abs/2411.16213v1