
Published on April 26

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level

NeurIPS

Release Date: April 26, 2024

Authors: Ali Hassani (1), Wen-mei Hwu (2), Humphrey Shi (3)

Affiliations: (1) SHI Labs @ Georgia Tech; (2) NVIDIA, UIUC; (3) SHI Labs @ Georgia Tech, UIUC

OpenReview: https://openreview.net/pdf/c776c48ab28b80d61ecdc8e1892789844332ad6c.pdf