notesum.ai

Published on October 31

Kernel Looping: Eliminating Synchronization Boundaries for Peak Inference Performance

cs.CL
cs.AI
cs.AR
D.3.4; C.1.3

Release Date: October 31, 2024

Authors: David Koeplinger¹, Darshan Gandhi¹, Pushkar Nandkar, Nathan Sheeley¹, Matheen Musaddiq¹, Leon Zhang, Reid Goodbar¹, Matthew Shaffer¹, Han Wang, Angela Wang¹, Mingran Wang¹, Raghu Prabhakar

Aff.: ¹SambaNova Systems, Inc.

arXiv: http://arxiv.org/abs/2410.23668v1