notesum.ai
Published on November 20
Topkima-Former: Low-energy, Low-Latency Inference for Transformers using top-k In-memory ADC
cs.AR
Released Date: November 20, 2024
Authors: Shuai Dong¹, Junyi Yang¹, Xiaoqi Peng¹, Hongyang Shang¹, Ye Ke¹, Xiaofeng Yang², Hongjie Liu², Arindam Basu¹
Affiliations: ¹Department of Electrical Engineering, City University of Hong Kong, Hong Kong; ²Reexen Technology, China
