notesum.ai
Published on November 11
Multimodal Fusion Balancing Through Game-Theoretic Regularization
Categories: cs.LG, cs.AI, cs.CV, cs.GT, cs.MM
Released Date: November 11, 2024
Authors: Konstantinos Kontras¹, Thomas Strypsteen¹, Christos Chatzichristos¹, Paul P. Liang, Matthew Blaschko¹, Maarten De Vos¹
Aff.: ¹ESAT, KU Leuven

| Method | MOSEI V-T | MOSEI V-A-T | MOSI V-T | MOSI V-A-T | Something-Something V-OF |
|---|---|---|---|---|---|
| Backbone | Transformer | Transformer | Transformer | Transformer | Swin-TF |
| Unimodals | V: 63.6, A: 64.3, T: 79.9 | V: 63.6, A: 64.3, T: 79.9 | V: 53.5, A: 54.9, T: 70.7 | V: 53.5, A: 54.9, T: 70.7 | V: 61.6, OF: 50.6 |
| Ensemble | 77.4 | 76.6 | 69.8 | 43.3 | 64.8 |
| Joint Training | 80.5 | 81.2 | 74.7 | 73.9 | 56.3 |
| Multi-Loss | 80.8 | 79.9 | 75.3 | 72.0 | 59.6 |
| OGM [40] | 80.6 | - | 75.2 | - | 58.3 |
| AGM [28] | 79.5 | 80.6 | 75.3 | 75.2 | 56.9 |
| MLB [26] | 80.8 | 81.1 | 75.0 | 74.5 | 61.3 |
| Uni-Pre Frozen | 80.5 | 79.8 | 70.6 | 70.6 | 64.2 |
| Uni-Pre Finetuned | 80.7 | 80.1 | 73.8 | 71.2 | 61.9 |
| MCR | 81.2 | 81.6 | 76.1 | 77.9 | 65.0 |