Published at November 27
Prediction with Action: Visual Policy Learning via Joint Denoising Process
cs.RO
cs.AI
Released Date: November 27, 2024
Authors: Yanjiang Guo¹, Yucheng Hu¹, Jianke Zhang¹, Yen-Jen Wang², Xiaoyu Chen³, Chaochao Lu⁴, Jianyu Chen³
Affiliations: ¹IIIS, Tsinghua University; ²University of California, Berkeley; ³Shanghai Qizhi Institute; ⁴Shanghai AI Lab

| Easier Tasks | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| Diffusion Policy | 0.92 | 0.16 | 0.36 | 0.32 | 0.76 | 0.60 | 0.72 | 0.60 | 0.36 | 0.12 |
| SuSIE | 0.96 | 0.32 | 0.60 | 0.68 | 0.56 | 0.68 | 0.92 | 0.68 | 0.96 | 0.32 |
| RT-1 | 0.88 | 1.00 | 0.56 | 0.56 | 1.00 | 0.08 | 0.12 | 1.00 | 1.00 | 0.00 |
| RT-2* | 1.00 | 0.84 | 0.92 | 0.96 | 0.96 | 0.88 | 0.76 | 1.00 | 0.96 | 0.40 |
| GR-1 | 1.00 | 0.84 | 1.00 | 1.00 | 0.96 | 0.88 | 1.00 | 1.00 | 1.00 | 0.60 |
| PAD (ours) | 1.00 | 0.92 | 1.00 | 1.00 | 0.92 | 0.72 | 1.00 | 0.92 | 1.00 | 0.88 |
| PAD w/o img | 1.00 | 0.92 | 1.00 | 0.88 | 0.92 | 0.16 | 0.92 | 1.00 | 1.00 | 0.12 |
| PAD w/o co-train | 1.00 | 0.92 | 1.00 | 0.92 | 0.92 | 0.48 | 0.92 | 0.96 | 0.96 | 0.72 |