Personalized Video Summarization by Multimodal Video Understanding
cs.CV
cs.AI
Released Date: November 5, 2024
Authors: Brian Chen, Xiangyuan Zhao, Yingnan Zhu
Affiliation: Samsung Research America VDIL, Irvine, California, USA

| Model | Overall | Action | Animation | Biography | Comedy | Crime | Drama | Family | Fantasy | Horror | Mystery | Romance | Sci-Fi | Thriller |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MomentDETR (Lei et al., 2021) | 22.5 | 20.5 | 22.0 | 18.1 | 21.7 | 20.7 | 22.1 | 21.9 | 22.3 | 24.0 | 23.8 | 24.5 | 26.1 | 22.7 |
| QD-DETR (Moon et al., 2023) | 23.1 | 21.4 | 24.1 | 13.1 | 22.4 | 20.7 | 21.2 | 24.3 | 28.4 | 23.1 | 25.1 | 25.4 | 20.3 | 21.5 |
| UniVTG (Lin et al., 2023) | 23.6 | 20.8 | 26.4 | 25.5 | 23.7 | 21.8 | 24.1 | 23.1 | 31.9 | 18.3 | 27.1 | 24.5 | 24.8 | 23.7 |
| VSL | 26.8 | 21.1 | 24.5 | 34.2 | 24.9 | 26.3 | 27.3 | 35.5 | 31.8 | 30.5 | 27.4 | 37.7 | 26.7 | 27.3 |