Published on November 25
Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training
Release Date: November 25, 2024
Authors: Weimin Wu (1), Maojiang Su (2), Jerry Yao-Chieh Hu (1), Zhao Song (3), Han Liu (4)
Affiliations: (1) Center for Foundation Models and Generative AI, Northwestern University; (2) Department of Information and Computing Science, USTC; (3) Simons Institute for the Theory of Computing, UC Berkeley; (4) Department of Statistics and Data Science, Northwestern University