Detecting Human Artifacts from Text-to-Image Models
cs.CV
Released Date: November 21, 2024
Authors: Kaihong Wang¹, Lingzhi Zhang², Jianming Zhang²
Affiliations: ¹Boston University; ²Adobe Research

| Domain | Face | Torso | Arm | Hand | Leg | Feet | Average |
|---|---|---|---|---|---|---|---|
| SDXL | 26.0 / 238 | 26.8 / 39 | 25.4 / 672 | 80.1 / 3493 | 28.8 / 477 | 50.0 / 672 | 39.5 |
| DALLE-2 | 86.6 / 145 | 100.0 / 1 | 52.7 / 131 | 88.9 / 228 | 39.4 / 36 | 56.8 / 42 | 70.7 |
| DALLE-3 | 2.7 / 7 | - / 0 | 8.8 / 46 | 48.2 / 563 | 7.4 / 10 | 20.0 / 27 | 17.4 |
| Midjourney | 5.5 / 4 | - / 0 | 9.8 / 22 | 54.4 / 586 | 14.0 / 13 | 27.8 / 51 | 22.3 |
| ALL | 53.4 / 398 | 27.6 / 43 | 28.9 / 875 | 74.5 / 4875 | 28.1 / 539 | 47.3 / 798 | 43.3 |
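The Average column appears to be the unweighted mean of the first number in each cell, taken over the body parts that have a value (cells shown as `-` are skipped). The sketch below recomputes it from the rows above as a sanity check. The table has no caption here, so reading each cell as "score / instance count" is an assumption, and the names `rows`, `reported`, and `row_average` are illustrative only.

```python
# Cells copied from the table above, in column order:
# Face, Torso, Arm, Hand, Leg, Feet.
# Each cell is (first value, second value); the first value is None
# where the table shows "-". Treating the pair as (score, count) is
# an assumption, since the table is not captioned.
rows = {
    "SDXL": [(26.0, 238), (26.8, 39), (25.4, 672), (80.1, 3493), (28.8, 477), (50.0, 672)],
    "DALLE-2": [(86.6, 145), (100.0, 1), (52.7, 131), (88.9, 228), (39.4, 36), (56.8, 42)],
    "DALLE-3": [(2.7, 7), (None, 0), (8.8, 46), (48.2, 563), (7.4, 10), (20.0, 27)],
    "Midjourney": [(5.5, 4), (None, 0), (9.8, 22), (54.4, 586), (14.0, 13), (27.8, 51)],
    "ALL": [(53.4, 398), (27.6, 43), (28.9, 875), (74.5, 4875), (28.1, 539), (47.3, 798)],
}

# Average column as printed in the table.
reported = {"SDXL": 39.5, "DALLE-2": 70.7, "DALLE-3": 17.4, "Midjourney": 22.3, "ALL": 43.3}


def row_average(cells):
    """Unweighted mean of the first values, ignoring empty ("-") cells."""
    scores = [score for score, _count in cells if score is not None]
    return sum(scores) / len(scores)


for name, cells in rows.items():
    print(f"{name}: recomputed {row_average(cells):.1f}, reported {reported[name]}")
```

Under this reading, every recomputed mean agrees with the reported Average to one decimal place. The mean is not weighted by the second number in each cell, so a part with a single instance (DALLE-2 Torso) counts as much as Hand with 228.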