notesum.ai
Published on May 10
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
NeurIPS
Released Date: May 10, 2024
Authors: Khoa Vo¹, Thinh Phan¹, Kashu Yamazaki¹, Minh Tran¹, Ngan Hoang Le
Affiliation: ¹AICV Lab, University of Arkansas, Fayetteville, USA
OpenReview: https://openreview.net/pdf/260addeb3cae77d95d50cef43216e139cdea0e6a.pdf