RoboMM: All-in-One Multimodal Large Model for Robotic Manipulation
cs.RO
cs.MM
Released Date: December 10, 2024
Authors: Feng Yan¹, Fanfan Liu, Liming Zheng, Yufeng Zhong, Yiyang Huang, Zechao Guan, Chengjian Feng, Lin Ma
Affiliation: ¹Meituan Inc.

| Dataset | Model | Source | SR |
|---|---|---|---|
| CALVIN | MDT [55] | RSS’24 | 93.7% |
| | HULC++ [43] | ICRA’24 | 93.0% |
| | SPIL [70] | RA-L’24 | 84.6% |
| | LCD [66] | arXiv’23 | 88.7% |
| | RoboFlamingo [33] | arXiv’23 | 86.0% |
| | PlayFusion [12] | CoRL’23 | 45.2% |
| | Distill-D [24] | CoRL’23 | 86.7% |
| | HULC [40] | RA-L’22 | 82.7% |
| | CALVIN [41] | RA-L’22 | 76.4% |
| | RoboMM (ours) | - | 91.0% |
| | RoboMM- (ours) | - | 74.7% |
| Meta-World | PRISE [69] | ICML’24 | 80.4% |
| | PAD [23] | NeurIPS’24 | 72.5% |
| | GR-1 [64] | ICLR’24 | 57.4% |
| | SuSIE [5] | ICLR’24 | 41.0% |
| | RT-2* [7] | arXiv’23 | 52.2% |
| | RT-1 [6] | RSS’23 | 34.6% |
| | RoboMM (ours) | - | 78.6% |
| | RoboMM- (ours) | - | 79.3% |
| LIBERO | QueST [45] | arXiv’24 | 89.8% |
| | VQ-BeT [31] | ICML’24 | 81.4% |
| | MDT [55] | RSS’24 | 67.2% |
| | MaIL [27] | CoRL’24 | 60.3% |
| | PRISE [69] | ICML’24 | 54.4% |
| | ATM [63] | RSS’24 | 48.4% |
| | MUTEX [56] | CoRL’23 | 53.0% |
| | DiffusionPolicy [15] | IJRR’23 | 75.4% |
| | ACT [67] | RSS’23 | 46.6% |
| | ResNet-T [34] | NeurIPS’23 | 84.4% |
| | Distill-D [24] | CoRL’23 | 49.9% |
| | RoboMM (ours) | - | 90.7% |
| | RoboMM- (ours) | - | 64.2% |
| RoboCasa | RoboCasa [48] | RSS’24 | 28.8% |
| | RoboMM (ours) | - | 30.6% |
| | RoboMM- (ours) | - | 27.0% |
| Robomimic | IBC [19] | CoRL’21 | 13.6% |
| | RoboMM (ours) | - | 15.0% |
| | RoboMM- (ours) | - | 8.0% |