notesum.ai

Published on May 8

MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers

NeurIPS

Release Date: May 8, 2024

Authors: Ning Ding (1), Yehui Tang (2), Haochen Qin (2), Zhenli Zhou (1), Chao Xu (1), Lin Li (3), Kai Han (2), Liao Heng, Yunhe Wang (2)

Affiliations: (1) State Key Lab of General AI, School of Intelligence Science and Technology, Peking University; (2) Huawei Noah's Ark Lab; (3) Huawei HiSilicon

OpenReview: https://openreview.net/pdf/3182fbee67403ec98f7f114ce0d5398a511c94cf.pdf