notesum.ai
Published on December 6
Maximizing Alignment with Minimal Feedback: Efficiently Learning Rewards for Visuomotor Robot Policy Alignment
cs.RO
cs.AI
cs.CV
cs.LG
Released Date: December 6, 2024
Authors: Ran Tian¹, Yilin Wu², Chenfeng Xu, Masayoshi Tomizuka, Jitendra Malik, Andrea Bajcsy
Aff.: ¹UC Berkeley; ²Carnegie Mellon University

Spearman’s correlation

| Method | Franka Group | Franka Clutter | Kuka Group |
|---|---|---|---|
| RAPL | 0.59 | 0.61 | 0.47 |
| RLHF | 0.38 | 0.26 | 0.31 |
| MVP-OT | -0.1 | 0.08 | 0.02 |
| FT-MVP-OT | 0.19 | 0.11 | 0.02 |
| ImNet-OT | -0.09 | -0.02 | 0.12 |
| R3M-OT | 0.03 | -0.17 | -0.14 |
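
For readers unfamiliar with the metric, the following is a minimal sketch of how a Spearman rank correlation can be computed. It assumes the scores in the table compare two rankings of the same items, for example a learned reward against a reference reward; the page itself does not state which quantities are compared. The function names and the sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for five trajectories (illustrative values only).
reference_reward = [0.1, 0.4, 0.35, 0.8, 0.9]
learned_reward = [1.0, 3.0, 2.5, 2.8, 9.0]
rho = spearman(reference_reward, learned_reward)
```

A value near 1 means the two scores order the items almost identically, a value near 0 means the orderings are unrelated, and a negative value means they tend to disagree. Because only ranks matter, the two scores do not need to share a scale.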