Renaissance: Investigating the Pretraining of Vision-Language Encoders
cs.CV
cs.AI
cs.CL
cs.LG
Released Date: November 11, 2024
Authors: Clayton Fields¹, Casey Kennington
Aff.: ¹Boise State University

| Text Encoder | Vision Encoder | SNLI-VE | NLVR2 | Ref. Res. |
| --- | --- | --- | --- | --- |
| Unfrozen | Unfrozen | 0.741 | 0.672 | 0.724 |
| Frozen | Unfrozen | 0.735 | 0.675 | 0.702 |
| Unfrozen | Frozen | 0.741 | 0.672 | 0.740 |
| Frozen | Frozen | 0.738 | 0.665 | 0.721 |
| ELECTRA-Base-Frz | ViT-Base-Frz | 0.756 | 0.630 | 0.756 |
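
The four frozen/unfrozen rows above are easier to compare as differences from the fully unfrozen configuration. The following is a minimal sketch of that arithmetic in Python. The scores are copied from the table, while the names `RESULTS` and `delta_vs_baseline` are illustrative and do not come from the paper.

```python
# Scores copied from the four frozen/unfrozen rows of the table above.
# Keys are (text encoder state, vision encoder state); names are illustrative.
RESULTS = {
    ("Unfrozen", "Unfrozen"): {"SNLI-VE": 0.741, "NLVR2": 0.672, "Ref. Res.": 0.724},
    ("Frozen", "Unfrozen"): {"SNLI-VE": 0.735, "NLVR2": 0.675, "Ref. Res.": 0.702},
    ("Unfrozen", "Frozen"): {"SNLI-VE": 0.741, "NLVR2": 0.672, "Ref. Res.": 0.740},
    ("Frozen", "Frozen"): {"SNLI-VE": 0.738, "NLVR2": 0.665, "Ref. Res.": 0.721},
}


def delta_vs_baseline(results, baseline=("Unfrozen", "Unfrozen")):
    """Return each configuration's per-task score minus the baseline's score."""
    base = results[baseline]
    return {
        config: {task: round(score - base[task], 3) for task, score in scores.items()}
        for config, scores in results.items()
    }


if __name__ == "__main__":
    for config, deltas in delta_vs_baseline(RESULTS).items():
        print(config, deltas)
```

Read this way, freezing both encoders changes each task score by less than 0.01 relative to training both, and the largest single shift among these rows is on Ref. Res.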