notesum.ai
Published on November 26
AnchorCrafter: Animate CyberAnchors Selling Your Products via Human-Object Interacting Video Generation
cs.CV
Released Date: November 26, 2024
Authors: Ziyi Xu¹, Ziyao Huang¹, Juan Cao¹, Yong Zhang², Xiaodong Cun³, Qing Shuai⁴, Yuchen Wang¹, Linchao Bao⁴, Jintao Li¹, Fan Tang¹
Affiliations: ¹Institute of Computing Technology, Chinese Academy of Sciences; ²Meituan; ³Great Bay University; ⁴Tencent
![[Uncaptioned image]](https://arxiv.org/html/2411.17383v1/x1.png)
| Method | FID↓ | FVD↓ | FID-VID↓ | Obj-IoU↑ | Obj-CLIP↑ | LMD (Hand)↓ | LMD (Body)↓ | Subj-Cons↑ | Back-Cons↑ |
|---|---|---|---|---|---|---|---|---|---|
| AnyV2V | 234.1 | 2873.3 | 53.1 | 0.241 | 0.744 | 94.6 | 30.5 | 68.2 | 77.9 |
| MimicMotion+AnyDoor | 167.8 | 1668.9 | 37.7 | 0.647 | 0.863 | 13.2 | 26.0 | 93.4 | 90.7 |
| AnimateAnyone | 172.8 | 2267.0 | 24.2 | 0.361 | 0.832 | 23.2 | 41.5 | 94.9 | 94.1 |
| MimicMotion | 138.1 | 1444.9 | 22.3 | 0.411 | 0.876 | 12.1 | 24.3 | 96.3 | 93.9 |
| Ours | 141.7 | 736.5 | 15.0 | 0.848 | 0.919 | 11.7 | 25.7 | 97.4 | 95.3 |
| w/o Human-Object Dual Adapter | 170.5 | 913.0 | 24.2 | 0.802 | 0.886 | 12.1 | 20.9 | 96.4 | 94.9 |
| w/o Multi-View Object Feature Fusion | 177.0 | 1371.9 | 52.0 | 0.845 | 0.908 | 12.0 | 24.6 | 97.2 | 95.3 |
| w/o 3D Hand Mesh | 164.0 | 920.1 | 21.3 | 0.847 | 0.907 | 12.0 | 22.3 | 97.3 | 95.0 |
| w/o HOI-Region Reweighting Loss | 164.7 | 807.3 | 69.9 | 0.846 | 0.895 | 11.8 | 22.0 | 97.0 | 94.8 |
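
The table is easier to read once the arrow direction of each column is taken into account. The sketch below is only a reading aid: it copies the five method rows (ablations excluded) and reports which method scores best per metric, treating ↓ as lower-is-better and ↑ as higher-is-better. The names `METRICS`, `ROWS`, and `best_per_metric` are hypothetical helpers introduced here, not part of any released code.

```python
# Column names and comparison rows copied from the table above (ablation rows omitted).
METRICS = ["FID↓", "FVD↓", "FID-VID↓", "Obj-IoU↑", "Obj-CLIP↑",
           "LMD (Hand)↓", "LMD (Body)↓", "Subj-Cons↑", "Back-Cons↑"]

ROWS = {
    "AnyV2V":              [234.1, 2873.3, 53.1, 0.241, 0.744, 94.6, 30.5, 68.2, 77.9],
    "MimicMotion+AnyDoor": [167.8, 1668.9, 37.7, 0.647, 0.863, 13.2, 26.0, 93.4, 90.7],
    "AnimateAnyone":       [172.8, 2267.0, 24.2, 0.361, 0.832, 23.2, 41.5, 94.9, 94.1],
    "MimicMotion":         [138.1, 1444.9, 22.3, 0.411, 0.876, 12.1, 24.3, 96.3, 93.9],
    "Ours":                [141.7, 736.5, 15.0, 0.848, 0.919, 11.7, 25.7, 97.4, 95.3],
}


def best_per_metric(rows, metrics):
    """Return {metric: method} using the arrow suffix to decide min vs. max."""
    best = {}
    for i, name in enumerate(metrics):
        # A trailing "↓" means lower is better; "↑" means higher is better.
        pick = min if name.endswith("↓") else max
        best[name] = pick(rows, key=lambda method: rows[method][i])
    return best


winners = best_per_metric(ROWS, METRICS)
for metric, method in winners.items():
    print(f"{metric}: {method}")
```

On these rows, "Ours" comes out best on every metric except FID and LMD (Body), where MimicMotion has the lower value.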