| Model | SKCM (Concept) | SKCU (Concept) | SKCU (Task) | MKCC (Concept) | MKCC (Task) | MKCC (Composition) |
|---|---:|---:|---:|---:|---:|---:|
| Text-to-image Generation | | | | | | |
| SD v1.5 | 40.5 | 52.9 | 53.9 | 37.6 | 13.4 | 15.1 |
| SD XL | 45.8 | 59.9 | 65.8 | 51.7 | 28.0 | 35.4 |
| Pixart | 26.4 | 46.2 | 55.3 | 35.8 | 19.8 | 24.3 |
| Playground | 45.6 | 66.1 | 62.5 | 53.8 | 35.4 | 44.8 |
| Flux.1 dev | 35.6 | 54.9 | 58.0 | 56.9 | 54.1 | 63.8 |
| SD 3.5 | 46.2 | 64.6 | 71.2 | 68.9 | 59.2 | 75.5 |
| DALLE-3 | 55.5 | 72.4 | 88.5 | 71.3 | 70.2 | 85.6 |
| Visual-Knowledge Injection | | | | | | |
| SSR-Encoder | 71.8 | 69.0 | 22.8 | 43.1 | 12.5 | 9.5 |
| MS-Diffusion | 84.8 | 80.4 | 50.8 | 65.5 | 32.9 | 31.0 |
| Text-Knowledge Injection | | | | | | |
| Flux.1 dev* | 41.2 | 60.3 | 59.8 | 64.2 | 56.9 | 72.6 |
| SD 3.5* | 49.7 | 66.7 | 65.8 | 67.9 | 53.6 | 64.7 |