

MoE-I$^2$: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition

cs.LG
cs.AI

Released Date: November 1, 2024

Authors: Cheng Yang$^1$, Yang Sui$^2$, Jinqi Xiao$^1$, Lingyi Huang$^1$, Yu Gong$^1$, Yuanlin Duan$^1$, Wenqi Jia$^3$, Miao Yin$^3$, Yu Cheng$^4$, Bo Yuan$^1$

Affiliations: $^1$Rutgers University; $^2$Rice University; $^3$The University of Texas at Arlington; $^4$The Chinese University of Hong Kong

arXiv: http://arxiv.org/abs/2411.01016v1