Published on April 22

Discovering Sparsity Allocation for Layer-wise Pruning of Large Language Models

NeurIPS

Release Date: April 22, 2024

Authors: Lujun Li¹, Peijie Dong², Zhenheng Tang³, Xiang Liu², Qiang Wang⁴, Wenhan Luo¹, Wei Xue¹, Qifeng Liu¹, Xiaowen Chu², Yike Guo¹

Affiliations: ¹Hong Kong University of Science and Technology; ²Hong Kong University of Science and Technology (Guangzhou); ³Hong Kong Baptist University; ⁴Harbin Institute of Technology (Shenzhen)

OpenReview PDF: https://openreview.net/pdf/c742f770723557fe9f03c7f7eb1944b07bd68423.pdf