Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Categories: cs.CV, cs.AI, cs.GR, cs.LG
Release Date: November 27, 2024
Authors: Shengqu Cai¹, Eric Chan, Yunzhi Zhang¹, Leonidas Guibas¹, Jiajun Wu¹, Gordon Wetzstein¹
Affiliation: ¹Stanford University

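Column groups: CP = Concept Preservation, PF = Prompt Following; columns prefixed "Debiased" report the debiased versions of the same scores. Z-S? = zero-shot.

| Method | Z-S? | CP Animal | CP Human | CP Object | CP Overall | PF Real. | PF Imag. | PF Overall | CPPF | Debiased CP Animal | Debiased CP Human | Debiased CP Object | Debiased CP Overall | Debiased PF Real. | Debiased PF Imag. | Debiased PF Overall | Debiased CPPF |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |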
| Textual Inversion | ✗ | 0.502 | 0.358 | 0.305 | 0.388 | 0.671 | 0.437 | 0.598 | 0.232 | 0.741 | 0.694 | 0.717 | 0.722 | 0.619 | 0.385 | 0.541 | 0.391 |
| DreamBooth | ✗ | 0.640 | 0.199 | 0.488 | 0.442 | 0.798 | 0.504 | 0.692 | 0.306 | 0.670 | 0.362 | 0.676 | 0.626 | 0.750 | 0.467 | 0.656 | 0.411 |
| DreamBooth LoRA | ✗ | 0.751 | 0.311 | 0.543 | 0.535 | 0.898 | 0.754 | 0.849 | 0.450 | 0.681 | 0.675 | 0.761 | 0.720 | 0.865 | 0.718 | 0.816 | 0.588 |
| BLIP-Diffusion | ✓ | 0.637 | 0.557 | 0.469 | 0.554 | 0.581 | 0.303 | 0.464 | 0.257 | 0.771 | 0.733 | 0.745 | 0.750 | 0.529 | 0.266 | 0.442 | 0.332 |
| Emu2 | ✓ | 0.670 | 0.546 | 0.447 | 0.554 | 0.732 | 0.560 | 0.670 | 0.371 | 0.652 | 0.683 | 0.701 | 0.681 | 0.686 | 0.494 | 0.622 | 0.424 |
| IP-Adapter | ✓ | 0.667 | 0.558 | 0.504 | 0.576 | 0.743 | 0.446 | 0.607 | 0.350 | 0.790 | 0.764 | 0.743 | 0.766 | 0.695 | 0.377 | 0.589 | 0.451 |
| IP-Adapter+ | ✓ | 0.900 | 0.845 | 0.759 | 0.834 | 0.502 | 0.279 | 0.388 | 0.324 | 0.481 | 0.473 | 0.530 | 0.504 | 0.442 | 0.229 | 0.371 | 0.187 |
| Ours | ✓ | 0.647 | 0.567 | 0.640 | 0.631 | 0.777 | 0.625 | 0.726 | 0.458 | 0.852 | 0.774 | 0.750 | 0.789 | 0.808 | 0.681 | 0.757 | 0.597 |
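
The page does not define the CPPF columns. The numbers suggest that each CPPF value is the product of the overall concept-preservation score and the overall prompt-following score in the same group, but that reading is an assumption inferred from the table, not something stated here. The sketch below copies the relevant columns and checks that assumption row by row; the names `ROWS`, `product_gaps`, and `outliers` are illustrative only.

```python
# Each tuple: (method, CP overall, PF overall, CPPF,
#              debiased CP overall, debiased PF overall, debiased CPPF),
# copied from the table above.
ROWS = [
    ("Textual Inversion", 0.388, 0.598, 0.232, 0.722, 0.541, 0.391),
    ("DreamBooth",        0.442, 0.692, 0.306, 0.626, 0.656, 0.411),
    ("DreamBooth LoRA",   0.535, 0.849, 0.450, 0.720, 0.816, 0.588),
    ("BLIP-Diffusion",    0.554, 0.464, 0.257, 0.750, 0.442, 0.332),
    ("Emu2",              0.554, 0.670, 0.371, 0.681, 0.622, 0.424),
    ("IP-Adapter",        0.576, 0.607, 0.350, 0.766, 0.589, 0.451),
    ("IP-Adapter+",       0.834, 0.388, 0.324, 0.504, 0.371, 0.187),
    ("Ours",              0.631, 0.726, 0.458, 0.789, 0.757, 0.597),
]


def product_gaps(rows):
    """Return |CP overall * PF overall - CPPF| for each method and column group.

    Assumption under test: CPPF is the product of the two overall scores.
    """
    gaps = {}
    for name, cp, pf, cppf, dcp, dpf, dcppf in rows:
        gaps[(name, "raw")] = abs(cp * pf - cppf)
        gaps[(name, "debiased")] = abs(dcp * dpf - dcppf)
    return gaps


def outliers(rows, tol=1e-3):
    """List the (method, group) pairs whose gap exceeds the rounding tolerance."""
    return sorted(key for key, gap in product_gaps(rows).items() if gap > tol)


if __name__ == "__main__":
    for (name, group), gap in sorted(product_gaps(ROWS).items()):
        print(f"{name:18s} {group:9s} gap = {gap:.4f}")
    print("rows outside tolerance:", outliers(ROWS))
```

Under this assumption, every entry matches to within the three-decimal rounding of the table except the non-debiased DreamBooth LoRA row, where the product of the overall scores is about 0.454 and the table lists 0.450. The page gives no way to tell whether that gap comes from how the score was aggregated or from a transcription error.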