notesum.ai
Published on December 6
SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization
cs.CV
Release Date: December 6, 2024
Authors: Xiaofeng Tan¹, Hongsong Wang¹, Xin Geng¹, Pan Zhou²
Affiliations: ¹Southeast University, Nanjing, China; ²Singapore Management University
![[Uncaptioned image]](https://arxiv.org/html/2412.05095v1/x1.png)
| Methods | Time∗ | R-Precision Top 1 | R-Precision Top 2 | R-Precision Top 3 | MM Dist | Diversity | FID |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Real | - | .511±.003 | .703±.003 | .797±.002 | 2.974±.008 | 9.503±.065 | .002±.000 |
| MLD [3] | +0 X | - | - | .755±.003 | 3.292±.010 | 9.793±.072 | .459±.011 |
| MoDiPO-T [23] | +121 X | - | - | .758 | 3.267±.010 (+0.76%) | 9.747±.073 (+0.046) | .303±.031 (+33.9%) |
| MoDiPO-G [23] | +121 X | - | - | .753±.003 (-0.26%) | 3.294±.010 (-0.01%) | 9.702±.075 (+0.091) | .281±.031 (+38.8%) |
| MoDiPO-O [23] | - | - | - | .677±.003 (-10.3%) | 3.701±.013 (-12.4%) | 9.241±.079 (-0.018) | .276±.007 (+39.9%)† |
| SoPo (Ours) | +20 X | - | - | .763±.003 (+1.06%) | 3.185±.012 (+3.25%)† | 9.525±.065 (+0.222)† | .374±.007 (+18.5%) |
| MDM [33] | +0 X | .418±.005 | .604±.005 | .703±.005 | 3.658±.025 | 9.546±.066 | .501±.037 |
| MoDiPO-T [23] | +121 X | - | - | .706±.004 (+0.42%) | 3.634±.026 (+0.66%) | 9.531±.073 (+0.015) | .451±.031 (+9.98%) |
| MoDiPO-G [23] | +121 X | - | - | .704±.001 (+0.14%) | 3.641±.025 (+0.46%) | 9.495±.071 (+0.035) | .486±.031 (+2.99%) |
| MDM (fast) [33] | +0 X | .455±.006 | .645±.007 | .749±.004 | 3.304±.023 | 9.948±.084 | .534±.052 |
| SoPo (Ours) | +60 X | .479±.006 (+5.27%)† | .674±.005 (+4.50%)† | .770±.006 (+2.80%)† | 3.208±.025 (+2.91%) | 9.906±.083 (+0.042) | .480±.046 (+10.1%) |
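
The signed figures attached to each result appear to be changes relative to the baseline row above them (MLD, MDM, or MDM (fast)). The page does not state the formula, so the sketch below is an assumption: a relative percentage for R-Precision, MM Dist, and FID, signed so that positive means better, and an absolute reduction of the gap to the Real row for Diversity. The function names `relative_gain` and `diversity_gain` are hypothetical. This reading reproduces most of the reported entries, though not every Diversity value.

```python
def relative_gain(baseline: float, value: float, higher_is_better: bool) -> float:
    """Percent change versus the baseline, signed so that positive means better."""
    change = (value - baseline) / baseline * 100.0
    return change if higher_is_better else -change


def diversity_gain(real: float, baseline: float, value: float) -> float:
    """Absolute reduction of the gap to the real-data Diversity (assumed rule)."""
    return abs(baseline - real) - abs(value - real)


# MLD -> SoPo, using the means from the table (assumed reading of the deltas).
top3 = relative_gain(0.755, 0.763, higher_is_better=True)   # R-Precision Top 3
mm = relative_gain(3.292, 3.185, higher_is_better=False)    # MM Dist, lower is better
fid = relative_gain(0.459, 0.374, higher_is_better=False)   # FID, lower is better
print(f"{top3:+.2f}% {mm:+.2f}% {fid:+.1f}%")  # +1.06% +3.25% +18.5%

# MDM (fast) -> SoPo, Diversity measured against the Real row (9.503).
div = diversity_gain(9.503, 9.948, 9.906)
print(f"{div:+.3f}")  # +0.042
```

Under this reading, a positive entry always means the fine-tuned model moved closer to the desired behaviour, whichever direction the underlying metric is optimised in.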