On the Surprising Effectiveness of Attention Transfer for Vision Transformers
cs.LG
cs.AI
cs.CV
cs.NE
Released Date: November 14, 2024
Authors: Alexander C. Li¹, Yuandong Tian², Beidi Chen¹, Deepak Pathak¹, Xinlei Chen²
Affiliations: ¹Carnegie Mellon University; ²FAIR

| method | acc. (%) |
|---|---|
| scratch | 83.0 |
| fine-tune | 85.7 |
| attn. copy | 85.1 |
| attn. copy from fine-tuned | 85.6 |
| attn. distill | 85.7 |
| ensemble (attn. distill + fine-tune) | 86.3 |