Published on November 22
Context-Aware Multimodal Pretraining
cs.CV
cs.CL
Released Date: November 22, 2024
Authors: Karsten Roth¹, Zeynep Akata², Dima Damen³, Ivana Balažević, Olivier J. Hénaff
Affiliations: ¹Tübingen AI Center; ²Munich Center for ML, Helmholtz Munich, TU Munich; ³Google DeepMind

| Method | Train-free | IN-1K | DTD | Food101 | Pets | Cars |
|---|---|---|---|---|---|---|
| Linear Probe [92] | ✗ | 67.3 | 70.0 | 82.9 | 85.3 | 80.4 |
| TIP-X [84] | ✓ | 71.1 | - | - | - | - |
| APE [108] | ✓ | 72.1 | - | - | - | - |
| DMN-TF [102] | ✓ | 72.6 | 71.9 | 86.0 | 92.9 | 78.4 |
| CLIP-Adapter [21] | ✗ | 71.1 | - | - | - | - |
| MaPLe [36] | ✗ | 72.3 | 71.3 | 85.3 | 92.8 | 83.6 |
| PromptSRC [37] | ✗ | 73.2 | 72.7 | 87.5 | 93.7 | 83.8 |
| Tip-Adapter-F [101] | ✗ | 73.7 | - | - | - | - |
| APE-T [108] | ✗ | 74.3 | - | - | - | - |
| CasPL [92] | ✗ | 74.2 | 75.1 | 88.4 | 94.1 | 86.7 |
| DMN [102] | ✗ | 74.7 | 75.0 | 87.1 | 94.1 | 85.3 |
| SigLIxP | ✓ | 77.9 | 76.7 | 92.6 | 94.4 | 92.8 |