StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization
cs.CV
Released Date: December 10, 2024
Authors: Jinlu Zhang¹, Jiji Tang², Rongsheng Zhang², Tangjie Lv², Xiaoshuai Sun¹
Affiliations: ¹Xiamen University; ²Netease Inc.
![[Uncaptioned image]](https://arxiv.org/html/2412.07375v1/x1.png)

| Task | Type | Method | # Model Params/DB | Pororo DINO-I | Pororo CLIP-I | Pororo CLIP-T | Frozen DINO-I | Frozen CLIP-I | Frozen CLIP-T |
|---|---|---|---|---|---|---|---|---|---|
| Sin-Char | Adapter-based | StoryGEN | 1064 M | 52.92 | 76.03 | 26.98 | 46.67 | 72.61 | 28.05 |
| Sin-Char | Adapter-based | IP-Adapter(base) | 1038 M | 48.85 | 76.66 | 29.98 | 44.15 | 78.42 | 31.69 |
| Sin-Char | Adapter-based | IP-Adapter(plus) | 1063 M | 64.36 | 81.64 | 24.88 | 60.87 | 84.52 | 27.15 |
| Sin-Char | Customization-based | LORA | 1024 M | 54.13 | 75.19 | 28.53 | 49.02 | 82.77 | 29.18 |
| Sin-Char | Customization-based | Dreambooth | 7118 M | 61.85 | 78.86 | 26.74 | 55.01 | 81.07 | 27.12 |
| Sin-Char | Customization-based | StoryWeaver(ours) | 1017 M | 64.96 | 82.65 | 33.26 | 62.17 | 85.24 | 36.74 |


| Task | Type | Method | # Model Params/DB | Pororo CLIP-T | Pororo F-Acc | Pororo C-F1 | Frozen CLIP-T | Frozen F-Acc | Frozen C-F1 |
|---|---|---|---|---|---|---|---|---|---|
| Multi-Char | Adapter-based | StoryGEN | 1064 M | 27.27 | 19.55 | 27.17 | 28.91 | 12.31 | 21.79 |
| Multi-Char | Customization-based | Mix-of-Show | 1164 M | 27.20 | 30.23 | 44.03 | 30.71 | 18.90 | 30.62 |
| Multi-Char | Customization-based | LoRA-Composer | 1425 M | 27.86 | 27.04 | 47.36 | 28.88 | 27.69 | 39.72 |
| Multi-Char | Customization-based | StoryWeaver(ours) | 1017 M | 34.30 | 40.45 | 59.72 | 34.94 | 34.51 | 44.53 |
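
To make the multi-character comparison easier to read, the following minimal sketch computes, for each dataset and metric, how many points StoryWeaver gains over the strongest baseline. The numbers are copied from the multi-character rows above. The function name `margins_over_best_baseline` and the dictionary layout are hypothetical, introduced only for this illustration, and the code assumes that higher values are better for CLIP-T, F-Acc, and C-F1, since the direction markers did not survive in the table headers.

```python
# Multi-character results copied from the table above.
# Assumption: higher is better for every metric listed here.
METRICS = ["CLIP-T", "F-Acc", "C-F1"]
OURS = "StoryWeaver(ours)"

RESULTS = {
    "Pororo": {
        "StoryGEN": [27.27, 19.55, 27.17],
        "Mix-of-Show": [27.20, 30.23, 44.03],
        "LoRA-Composer": [27.86, 27.04, 47.36],
        OURS: [34.30, 40.45, 59.72],
    },
    "Frozen": {
        "StoryGEN": [28.91, 12.31, 21.79],
        "Mix-of-Show": [30.71, 18.90, 30.62],
        "LoRA-Composer": [28.88, 27.69, 39.72],
        OURS: [34.94, 34.51, 44.53],
    },
}


def margins_over_best_baseline(results, ours=OURS):
    """Return {dataset: {metric: (best_baseline, margin_in_points)}}."""
    out = {}
    for dataset, rows in results.items():
        out[dataset] = {}
        for i, metric in enumerate(METRICS):
            # Scores of every method except ours for this metric.
            baselines = {m: v[i] for m, v in rows.items() if m != ours}
            best = max(baselines, key=baselines.get)
            margin = round(rows[ours][i] - baselines[best], 2)
            out[dataset][metric] = (best, margin)
    return out


if __name__ == "__main__":
    for dataset, per_metric in margins_over_best_baseline(RESULTS).items():
        for metric, (best, margin) in per_metric.items():
            print(f"{dataset} {metric}: {margin:+.2f} over {best}")
```

The margins are absolute differences in the units reported by the table, not relative improvements. The same function can be pointed at the single-character rows by swapping in those values and metric names.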