Open-Vocabulary Octree-Graph for 3D Scene Understanding
cs.CV
Released Date: November 25, 2024
Authors: Zhigang Wang¹, Yifei Su², Chenhui Li¹, Dong Wang¹, Yan Huang³, Bin Zhao⁴, Xuelong Li⁵
Affiliations: ¹Shanghai AI Laboratory; ²School of Artificial Intelligence, UCAS; ³MAIS, Institute of Automation; ⁴Shanghai AI Laboratory, Northwestern Polytechnical University; ⁵Shanghai AI Laboratory, Institute of Artificial Intelligence (TeleAI)

| Method | AP | AP50 | AP25 |
| --- | --- | --- | --- |
| *Supervised mask + supervised semantic* | | | |
| Mask3D [35] | 26.9 | 36.2 | 41.4 |
| *Supervised mask + zero-shot semantic* | | | |
| Open3DIS [26] | 23.7 | 29.4 | 32.8 |
| Open3DIS [26] (3D only) | 18.6 | 23.1 | 27.3 |
| OpenMask3D [37] (Mask3D) | 15.4 | 19.9 | 23.1 |
| Ours (Mask3D) | 23.2 | 30.3 | 33.3 |
| *Zero-shot mask + zero-shot semantic* | | | |
| OVIR-3D [24] | 9.3 | 18.7 | 25.0 |
| SAM3D [47] | 9.8 | 15.2 | 20.7 |
| SAI3D [48] | 12.7 | 18.8 | 24.1 |
| Mask-Clustering [45] | 12.0 | 23.3 | 30.1 |
| Ours | 14.3 | 25.8 | 33.6 |