notesum.ai
Published on November 22
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos
cs.CV
cs.CL
Released Date: November 22, 2024
Authors: Tanveer Hannan¹, Md Mohaiminul Islam², Jindong Gu³, Thomas Seidl¹, Gedas Bertasius²
Affiliations: ¹LMU Munich; ²UNC Chapel Hill; ³University of Oxford
