notesum.ai

Published on November 25

Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training

Release Date: November 25, 2024

Authors: Weimin Wu (1), Maojiang Su (2), Jerry Yao-Chieh Hu (1), Zhao Song (3), Han Liu (4)

Affiliations: (1) Center for Foundation Models and Generative AI, Northwestern University; (2) Department of Information and Computing Science, USTC; (3) Simons Institute for the Theory of Computing, UC Berkeley; (4) Department of Statistics and Data Science, Northwestern University

arXiv: http://arxiv.org/abs/2411.16549v1