Published on April 22
Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models
NeurIPS
Released Date: April 22, 2024
Authors: Lujun Li (1), Peijie Dong (2), Zhenheng Tang (3), Xiang Liu (2), Qiang Wang (4), Wenhan Luo (1), Wei Xue (1), Qifeng Liu (1), Xiaowen Chu (2), Yike Guo (1)
Affiliations: (1) Hong Kong University of Science and Technology; (2) Hong Kong University of Science and Technology (Guangzhou); (3) Hong Kong Baptist University; (4) Harbin Institute of Technology (Shenzhen)
OpenReview: https://openreview.net/pdf/c742f770723557fe9f03c7f7eb1944b07bd68423.pdf