
Published on December 3

Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models

cs.CV

cs.AI

Released Date: December 3, 2024

Authors: Jungwon Park¹, Jungmin Ko¹, Dongnam Byun¹, Jangwon Suh¹, Wonjong Rhee¹

Aff.: ¹Seoul National University

arXiv: http://arxiv.org/pdf/2412.02237v1


Method	Image Style (CLIP)	Image Style (HP-score)	Weather Conditions (CLIP)	Weather Conditions (HP-score)
SDEdit (0.5)	0.2938	-	0.2817	-
SDEdit (0.7)	0.3217	15.8	0.2908	39.5
P2P	0.3120	30.6	0.2788	33.9
PnP	0.3286	41.9	0.3046	35.0
MasaCtrl	0.2722	-	0.2524	-
FPE	0.3236	25.7	0.2962	35.5
P2P-HRV (Ours)	0.3424	100	0.3348	100