notesum.ai

Published on November 4

Scalable Efficient Training of Large Language Models with Low-dimensional Projected Attention

cs.CL
cs.AI
cs.LG

Released Date: November 4, 2024

Authors: Xingtai Lv¹, Ning Ding¹, Kaiyan Zhang¹, Ermo Hua¹, Ganqu Cui², Bowen Zhou¹

Affiliations: ¹Department of Electronic Engineering, Tsinghua University; ²Shanghai AI Laboratory

arXiv: http://arxiv.org/abs/2411.02063v1