notesum.ai
Published on October 28
Enhancing Action Recognition by Leveraging the Hierarchical Structure of Actions and Textual Context
cs.CL
cs.AI
Release Date: October 28, 2024
Authors: Manuel Benavent-Lledo¹, David Mulero-Pérez, David Ortiz-Perez¹, Jose Garcia-Rodriguez¹, Antonis Argyros²
Affiliations: ¹Department of Computer Technology, University of Alicante, Alicante, Spain; ²Institute of Computer Science, FORTH, Heraklion, Crete, Greece

| Modalities | Joint Loss | Fine-Grained Top-1 | Fine-Grained Top-5 | Coarse-Grained Top-1 | Coarse-Grained Top-5 |
|---|---|---|---|---|---|
| RGB | - | 37.83 | 72.69 | - | - |
| RGB | ✓ | 38.27 | 73.55 | 75.62 | 99.41 |
| Flow | - | 39.09 | 72.69 | - | - |
| Flow | ✓ | 39.39 | 69.95 | 75.18 | 99.07 |
| RGB + Flow | - | 39.90 | 72.66 | - | - |
| RGB + Flow | ✓ | 40.24 | 74.21 | 75.44 | 99.37 |
| RGB + Text | - | 53.43 | 84.03 | - | - |
| RGB + Text | ✓ | 54.80 | 84.37 | 75.77 | 99.37 |
| Flow + Text | - | 50.83 | 79.44 | - | - |
| Flow + Text | ✓ | 50.91 | 81.07 | 75.29 | 99.56 |
| RGB + Flow + Text | - | 53.98 | 84.36 | - | - |
| RGB + Flow + Text | ✓ | 54.95 | 84.11 | 73.77 | 99.15 |
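
To make the effect of the joint loss easier to read off the table above, the following minimal sketch computes the change in fine-grained Top-1 for each modality combination. The numbers are copied from the table; the names `fine_top1` and `joint_loss_gain` are illustrative and do not come from the paper.

```python
# Fine-grained Top-1 values copied from the table above.
# Each entry maps a modality combination to
# (score without joint loss, score with joint loss).
fine_top1 = {
    "RGB": (37.83, 38.27),
    "Flow": (39.09, 39.39),
    "RGB + Flow": (39.90, 40.24),
    "RGB + Text": (53.43, 54.80),
    "Flow + Text": (50.83, 50.91),
    "RGB + Flow + Text": (53.98, 54.95),
}


def joint_loss_gain(scores):
    """Return the Top-1 change (in points) from adding the joint loss."""
    return {
        name: round(with_joint - without, 2)
        for name, (without, with_joint) in scores.items()
    }


gains = joint_loss_gain(fine_top1)
for name, gain in gains.items():
    print(f"{name:<18} {gain:+.2f}")
```

On these numbers, fine-grained Top-1 improves with the joint loss for every modality combination, with the largest gain (+1.37) for RGB + Text and the smallest (+0.08) for Flow + Text.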