Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
cs.CV
Released Date: December 5, 2024
Authors: Jian Han¹, Jinlai Liu¹, Yi Jiang¹, Bin Yan¹, Yuqi Zhang, Zehuan Yuan¹, Bingyue Peng¹, Xiaobing Liu¹
Aff.: ¹ByteDance

| Methods | # Params | GenEval: Two Obj. | GenEval: Position | GenEval: Color Attri. | GenEval: Overall | DPG: Global | DPG: Relation | DPG: Overall |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Diffusion Models | ||||||||
| LDM [49] | 1.4B | 0.29 | 0.02 | 0.05 | 0.37 | - | - | - |
| SDv1.5 [49] | 0.9B | 0.38 | 0.04 | 0.06 | 0.43 | 74.63 | 73.49 | 63.18 |
| PixArt-alpha [13] | 0.6B | 0.50 | 0.08 | 0.07 | 0.48 | 74.97 | 82.57 | 71.11 |
| SDv2.1 [49] | 0.9B | 0.51 | 0.07 | 0.17 | 0.50 | 77.67 | 80.72 | 68.09 |
| DALL-E 2 [45] | 6.5B | 0.66 | 0.10 | 0.19 | 0.52 | - | - | - |
| DALL-E 3 [7] | - | - | - | - | 0.67† | 90.97 | 90.58 | 83.50 |
| SDXL [43] | 2.6B | 0.74 | 0.15 | 0.23 | 0.55 | 83.27 | 86.76 | 74.65 |
| PixArt-Sigma [12] | 0.6B | 0.62 | 0.14 | 0.27 | 0.55 | 86.89 | 86.59 | 80.54 |
| SD3 (d=24) [21] | 2B | 0.74 | 0.34 | 0.36 | 0.62 | - | - | 84.08 |
| SD3 (d=38) [21] | 8B | 0.89 | 0.34 | 0.47 | 0.71 | - | - | - |
| AutoRegressive Models | ||||||||
| LlamaGen [55] | 0.8B | 0.34 | 0.07 | 0.04 | 0.32 | 65.16 | ||
| Chameleon [59] | 7B | - | - | - | 0.39 | - | - | - |
| HART [58] | 732M | - | - | - | 0.56 | - | - | 80.89 |
| Show-o [70] | 1.3B | 0.80 | 0.31 | 0.50 | 0.68 | - | - | 67.48 |
| Emu3 [66] | 8.5B | 0.81† | 0.49† | 0.45† | 0.66† | - | - | 81.60 |
| Infinity | 2B | 0.85† | 0.49† | 0.57† | 0.73† | 93.11 | 90.76 | 83.46 |