ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy
Subjects: cs.LG, cs.AI, cs.CV, 68T07, I.2; I.4
Released Date: November 4, 2024
Authors: Kian Kenyon-Dean¹, Zitong Jerry Wang¹, John Urbanik¹, Konstantin Donhauser², Jason Hartford³, Saber Saberian¹, Nil Sahin¹, Ihab Bendidi², Safiye Celik¹, Marta Fay¹, Juan Sebastian Rodriguez Vera, Imran S Haque¹, Oren Kraus¹
Affiliations: ¹Recursion; ²Valence Labs; ³University of Manchester

| Model backbone | CORUM | hu.MAP | React | StringDB | KS | CVM |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline ViTs | | | | | | |
| ViT-S/16, Untrained | .45 | .34 | .205 | .36 | .30 | 4.3 |
| ViT-S/14, Dino-V2 | .48 | .345 | .20 | .38 | .34 | 5.6 |
| ViT-S/14, Dino-V2 (trimmed) | .51 | .36 | .21 | .40 | .35 | 6.0 |
| ViT-L/16, ImageNet WSL | .52 | .35 | .21 | .39 | .34 | 5.5 |
| ViT-L/14, Dino-V2 | .49 | .34 | .21 | .38 | .34 | 5.3 |
| ViT-L/14, Dino-V2 (trimmed) | .55 | .37 | .22 | .41 | .36 | 5.9 |
| ViT-L/16, ImageNet MAE | .53 | .355 | .215 | .40 | .34 | 5.1 |
| ViT-L/16, ImageNet MAE (trimmed) | .53 | .36 | .22 | .40 | .35 | 5.8 |
| ViT-G/14, Dino-V2 | .44 | .31 | .20 | .35 | .29 | 3.8 |
| ViT-G/14, Dino-V2 (trimmed) | .53 | .35 | .22 | .40 | .33 | 5.2 |
| MAEs for microscopy | | | | | | |
| CA-MAE-S/16β, RxRx3 | .55 | .37 | .23 | .43 | .47 | 10.4 |
| MAE-L/8β, RPI-93M | .61 | .43 | .25 | .47 | .52 | 12.3 |
| MAE-L/8β, RPI-93M (trimmed) | .60 | .43 | .255 | .475 | .57 | 15.2 |
| MAE-L/8β, PP-16M | .60 | .43 | .255 | .48 | .59 | 16.2 |
| MAE-L/8β, PP-16M (trimmed) | .60 | .435 | .26 | .48 | .59 | 16.2 |
| MAE-G/8β, PP-16M | .62 | .44 | .26 | .49 | .60 | 16.4 |
| MAE-G/8β, PP-16M (trimmed) | .615 | .44 | .26 | .49 | .63 | 18.2 |
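Several backbones in the table are followed by a "trimmed" row. One way to read the effect of trimming is to subtract each backbone's row from its trimmed row, metric by metric. The sketch below does this for three backbones, with values copied from the table. The names `METRICS`, `ROWS`, and `trimmed_delta` are hypothetical helpers introduced for illustration, not part of the paper.

```python
# Column order as it appears in the table.
METRICS = ("CORUM", "hu.MAP", "React", "StringDB", "KS", "CVM")

# Hypothetical lookup: backbone -> (row as listed, its "trimmed" row).
# Values are transcribed from the table; only three backbones are included.
ROWS = {
    "ViT-G/14, Dino-V2": ((.44, .31, .20, .35, .29, 3.8),
                          (.53, .35, .22, .40, .33, 5.2)),
    "MAE-L/8β, RPI-93M": ((.61, .43, .25, .47, .52, 12.3),
                          (.60, .43, .255, .475, .57, 15.2)),
    "MAE-G/8β, PP-16M": ((.62, .44, .26, .49, .60, 16.4),
                         (.615, .44, .26, .49, .63, 18.2)),
}


def trimmed_delta(name):
    """Return trimmed-minus-listed differences per metric, rounded to 3 places."""
    full, trimmed = ROWS[name]
    return {m: round(t - f, 3) for m, f, t in zip(METRICS, full, trimmed)}


if __name__ == "__main__":
    for name in ROWS:
        print(name, trimmed_delta(name))
```

Because the table reports values rounded to two or three decimals, small differences such as a change of .005 sit at the resolution of the reported numbers and should be read with caution.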