Beyond 2:4: exploring V:N:M sparsity for efficient transformer inference on GPUs
Categories: cs.CV, cs.AI, cs.LG
Released Date: October 21, 2024
Authors: Kang Zhao¹, Tao Yuan², Han Bao², Zhenfeng Su², Chang Gao³, Zhaofeng Sun¹, Zichen Liang¹, Liping Jing³, Jianfei Chen¹
Affiliations: ¹Tsinghua University; ²Huawei Noah's Ark Lab; ³Beijing Jiaotong University

| Iterations | Average accuracy (%) |
|---|---|
| 1 | 94.56 |
| 2 | 94.71 |
| 3 | 94.53 |
| 4 | 94.44 |