Published on October 18
Optimizing Attention with Mirror Descent: Generalized Max-Margin Token Selection
Categories: cs.CL, cs.AI, cs.LG
Released Date: October 18, 2024
Authors: Aaron Alvarado Kristanto Julistiono, Davoud Ataee Tarzanagh (1), Navid Azizan (2)
Affiliations: (1) University of Pennsylvania; (2) Massachusetts Institute of Technology

| Algorithm | Model Size 3 | Model Size 4 | Model Size 6 |
|---|---|---|---|
| -MD | 83.47 ± 0.09% | 83.36 ± 0.13% | 83.65 ± 0.13% |
| -MD | | | |
| -MD | | | |