Microsecond-scale Dynamic Validation of Idempotency for GPU Kernels
cs.OS
cs.DC
D.4.0
Released Date: October 31, 2024
Authors: Mingcong Han, Weihang Shen, Guanwen Peng, Rong Chen, Haibo Chen
Affiliation: Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University

| GPU Apps | #Kernels | #Instances | Manual | | | Kernel-level | | | Truth ★ | Truth ✩ | Instance-level ★ | Instance-level ✩ | Instance-level F– | F– (%) | Instance-level F+ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rodinia [9] | 40 | 4,527 | 7 | 12 | 21 | 4 | 23 | 13 | 4,419 | 108 | 4,033 | 494 | 386 | 8.74 | 0 |
| Parboil [67] | 25 | 1,033 | 4 | 12 | 9 | 1 | 15 | 9 | 295 | 738 | 222 | 811 | 73 | 24.75 | 0 |
| TVM [71] | 308 | 609 | 0 | 0 | 308 | 0 | 0 | 308 | 609 | 0 | 596 | 13 | 13 | 2.13 | 0 |
| PyTorch [55] | 66 | 1,570 | 3 | 1 | 62 | 3 | 22 | 41 | 1,282 | 288 | 824 | 746 | 458 | 35.73 | 0 |
| TensorRT [48] | 58 | 478 | 0 | 2 | 56 | 0 | 17 | 41 | 465 | 13 | 267 | 211 | 198 | 42.58 | 0 |
| FT [51] | 50 | 10,000 | 9 | 7 | 34 | 8 | 19 | 23 | 7,348 | 2,652 | 5,803 | 4,197 | 1,545 | 21.03 | 0 |
| All | 547 | 18,217 | 23 | 34 | 490 | 16 | 96 | 435 | 14,418 | 3,799 | 11,745 | 6,472 | 2,673 | 18.54 | 0 |
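
The F– figures in the table can be recomputed from its other columns. In every row, the F– count equals the ★ count under Truth minus the ★ count under Instance-level, and the percentage equals F– divided by the ★ count under Truth. The sketch below checks that arithmetic. The function name `false_negatives` and the tuple layout are illustrative choices, not part of the paper, and the numbers are copied from the table.

```python
# Each tuple: (app, truth_star, instance_star, reported_f_minus, reported_pct),
# copied from the table. The tuple layout is an assumption made for illustration.
rows = [
    ("Rodinia", 4419, 4033, 386, 8.74),
    ("Parboil", 295, 222, 73, 24.75),
    ("TVM", 609, 596, 13, 2.13),
    ("PyTorch", 1282, 824, 458, 35.73),
    ("TensorRT", 465, 267, 198, 42.58),
    ("FT", 7348, 5803, 1545, 21.03),
    ("All", 14418, 11745, 2673, 18.54),
]


def false_negatives(truth_star, instance_star):
    """Return (count, percent) of Truth-★ instances not marked ★ at instance level."""
    missed = truth_star - instance_star
    return missed, 100.0 * missed / truth_star


for app, truth, found, reported, reported_pct in rows:
    missed, rate = false_negatives(truth, found)
    # Print recomputed values next to the reported ones for comparison.
    print(f"{app:9s} F-={missed:5d} ({rate:5.2f}%)  reported: {reported} ({reported_pct}%)")
```

Since F+ is 0 in every row, the two ★ columns differ only by these false negatives.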