SAM Decoding: Speculative Decoding via Suffix Automaton
cs.CL
cs.AI
Released Date: November 16, 2024
Authors: Yuxuan Hu, Ke Wang, Jing Zhang, Cuiping Li, Hong Chen
| Model | Method | HumanEval #MAT | HumanEval Tokens/s | HumanEval Speedup | MBPP #MAT | MBPP Tokens/s | MBPP Speedup | HAGRID #MAT | HAGRID Tokens/s | HAGRID Speedup |
|---|---|---|---|---|---|---|---|---|---|---|
| Vicuna-7B | PLD | 1.65 | 59.04 | | 1.42 | 50.62 | | 2.03 | 44.11 | |
| Vicuna-7B | Token Recycling | 2.78 | 75.44 | | 2.83 | 79.22 | | 2.88 | 66.17 | |
| Vicuna-7B | SAM-Decoding[T] | 2.94 | 95.08 | | 2.87 | 94.50 | | 3.23 | 87.93 | |
| Vicuna-7B | EAGLE | 4.10 | 103.39 | | 4.17 | 119.31 | | 3.44 | 72.14 | |
| Vicuna-7B | SAM-Decoding[E] | 4.03 | 110.16 | | 3.85 | 114.92 | | 4.05 | 93.17 | |
| Vicuna-7B | EAGLE2 | 5.12 | 128.19 | | 5.29 | 135.24 | | 4.15 | 82.61 | |
| Vicuna-7B | SAM-Decoding[E2] | 4.79 | 127.03 | | 4.50 | 117.06 | | 4.75 | 96.60 | |