Fusion Embedding for Pose-Guided Person Image Synthesis with Diffusion Model
cs.CV
cs.AI
Released Date: December 10, 2024
Authors: Donghwna Lee¹, Kyungha Min, Kirok Kim, Seyoung Jeong, Jiwoo Jeong, Wooju Kim
Aff.: ¹Department of Industrial Engineering, Yonsei University, Seoul, Republic of Korea

| Method | FID | | LPIPS | SSIM | PSNR |
| --- | --- | --- | --- | --- | --- |
| Evaluation on 256 × 176 resolution | | | | | |
| SPGNet† | 16.184 | 14.107 | 0.2256 | 0.6965 | 17.222 |
| DPTN† | 17.419 | 15.491 | 0.2093 | 0.6975 | 17.811 |
| NTED† | 8.517 | 6.935 | 0.1770 | 0.7156 | 17.740 |
| CASD† | 13.137 | 11.619 | 0.1781 | 0.7224 | 17.880 |
| PIDM† | 6.812 | 5.168 | 0.2006 | 0.6621 | 15.630 |
| PoCoLD | 8.067 | - | 0.1642 | 0.7310 | - |
| CFLD† | 6.804 | 5.688 | 0.1519 | 0.7378 | 18.235 |
| PCDM‡ | 7.699 | 5.901 | 0.1572 | 0.7280 | 18.385 |
| FPDM (Ours; B6) | 7.318 | 5.459 | 0.1445 | 0.7417 | 18.832 |
| VAE Reconstructed | 8.338 | 1.250 | 0.0103 | 0.9634 | 34.878 |
| Ground Truth | 7.847 | 0.000 | 0.0000 | 1.0000 | |
| Evaluation on 512 × 352 resolution | | | | | |
| CoCosNet2 | 13.325 | - | 0.2265 | 0.7236 | - |
| NTED† | 7.645 | 6.602 | 0.1999 | 0.7359 | 17.385 |
| PoCoLD | 8.416 | - | 0.1920 | 0.7430 | - |
| CFLD† | 7.149 | 6.177 | 0.1819 | 0.7478 | 17.645 |
| PCDM‡ | 7.747 | 6.086 | 0.1729 | 0.7471 | 18.028 |
| FPDM (Ours; B4) | 7.464 | 5.797 | 0.1734 | 0.7466 | 18.125 |
| FPDM (Ours; B6) | 7.534 | 5.884 | 0.1717 | 0.7487 | 18.197 |
| VAE Reconstructed | 8.492 | 1.488 | 0.0212 | 0.9149 | 30.633 |
| Ground Truth | 8.010 | 0.000 | 0.0000 | 1.0000 | |
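
The PSNR cell in the Ground Truth rows is empty, which matches how the metric behaves: PSNR is defined from the mean squared error, and it diverges when an image is compared with itself. The sketch below illustrates this with a plain-Python PSNR over flat pixel sequences. It is only an illustration under assumptions: the function name `psnr`, the 8-bit peak value of 255, and the flat-list input format are hypothetical choices, and the evaluation protocol actually used for the table (pixel range, color space, averaging over images) is not stated in this section.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float],
         generated: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two flat pixel sequences.

    Assumes both sequences hold the same number of pixel values and that
    max_value is the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)

    # Identical images give zero error, so PSNR is unbounded.
    if mse == 0.0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    target = [0.0, 0.0, 0.0, 0.0]
    print(psnr(target, target))                    # identical inputs: inf
    print(psnr(target, [25.5, 25.5, 25.5, 25.5]))  # uniform error of 10% of the peak: about 20 dB
```

Higher PSNR means a smaller pixel-wise error, so the "VAE Reconstructed" rows, which are far above every generative method, give a rough ceiling for models that decode through the same VAE.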