
Published on April 29

Mini-Sequence Transformers: Optimizing Intermediate Memory for Long Sequences Training

NeurIPS

Release Date: April 29, 2024

Authors: Cheng Luo¹, Jiawei Zhao², Zhuoming Chen³, Beidi Chen³, Anima Anandkumar¹

Affiliations: ¹California Institute of Technology; ²Meta FAIR; ³Carnegie Mellon University

OpenReview: https://openreview.net/pdf/d291323c2636eacc38c4c3399f3ac1d69c920a5e.pdf