Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion
NeurIPS
Released Date: May 10, 2024
Authors: Filip Szatkowski (1), Bartosz Wójcik (2), Mikołaj Piórczyński, Simone Scardapane (3)
Aff.: (1) IDEAS NCBR, Warsaw University of Technology; (2) IDEAS NCBR, Jagiellonian University; (3) Sapienza University of Rome
OpenReview: https://openreview.net/pdf/214cac9fad144967266f67042945a7b58e468af6.pdf