
Published on May 10

HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model

NeurIPS

Released Date: May 10, 2024

Authors: Khoa Vo¹, Thinh Phan¹, Kashu Yamazaki¹, Minh Tran¹, Ngan Hoang Le

Aff.: ¹AICV Lab, University of Arkansas, Fayetteville, USA

OpenReview: https://openreview.net/pdf/260addeb3cae77d95d50cef43216e139cdea0e6a.pdf