notesum.ai

Published on May 10

Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion

NeurIPS

Release Date: May 10, 2024

Authors: Filip Szatkowski¹, Bartosz Wójcik², Mikołaj Piórczyński, Simone Scardapane³

Affiliations: ¹IDEAS NCBR, Warsaw University of Technology; ²IDEAS NCBR, Jagiellonian University; ³Sapienza University of Rome

OpenReview: https://openreview.net/pdf/214cac9fad144967266f67042945a7b58e468af6.pdf