DiTFastAttn: Attention Compression for Diffusion Transformer Models
NeurIPS
Release Date: May 13, 2024
Authors: Zhihang Yuan1, Hanling Zhang1, Lu Pu, Xuefei Ning1, Linfeng Zhang2, Tianchen Zhao1, Shengen Yan3, Guohao Dai2, Yu Wang1
Affiliations: 1 Tsinghua University; 2 Shanghai Jiao Tong University; 3 Infinigence AI
Paper: https://openreview.net/pdf/f04f7cb65b98f0bf20c13bbd3cb6d0ecc0432d01.pdf

Results for DiT-XL-2 at 512x512 (IS, FID, %), PixArt-Sigma-XL at 1024x1024 (IS, FID, CLIP, %), and PixArt-Sigma-XL at 2048x2048 (IS, FID, CLIP, %). D1–D6 are progressively stronger compression settings; the % columns are relative to the uncompressed (Raw) model.

| Model | IS | FID | % | IS | FID | CLIP | % | IS | FID | CLIP | % |
|-------|--------|-------|------|-------|-------|-------|------|-------|-------|-------|------|
| Raw   | 408.16 | 25.43 | 100% | 24.33 | 55.65 | 31.27 | 100% | 23.67 | 51.89 | 31.47 | 100% |
| D1    | 412.24 | 25.32 | 85%  | 24.27 | 55.73 | 31.27 | 90%  | 23.28 | 52.34 | 31.46 | 81%  |
| D2    | 412.18 | 24.67 | 69%  | 24.25 | 55.69 | 31.26 | 74%  | 22.90 | 53.01 | 31.32 | 60%  |
| D3    | 411.74 | 23.76 | 59%  | 24.16 | 55.61 | 31.25 | 63%  | 22.96 | 52.54 | 31.36 | 46%  |
| D4    | 391.80 | 21.52 | 49%  | 24.07 | 55.32 | 31.24 | 52%  | 22.95 | 51.74 | 31.39 | 36%  |
| D5    | 370.07 | 19.32 | 41%  | 24.17 | 54.54 | 31.22 | 44%  | 22.82 | 51.21 | 31.34 | 29%  |
| D6    | 352.20 | 16.80 | 34%  | 23.94 | 52.73 | 31.18 | 37%  | 22.38 | 49.34 | 31.28 | 24%  |