notesum.ai
Published on October 30
S3PT: Scene Semantics and Structure Guided Clustering to Boost Self-Supervised Pre-Training for Autonomous Driving
cs.CV
cs.AI
cs.RO
Released Date: October 30, 2024
Authors: Maciej K. Wozniak (1), Hariprasath Govindarajan (2), Marvin Klingner (1), Camille Maurice (1), Ravi Kiran (3), Senthil Yogamani (4)
Aff.: (1) Qualcomm Technologies International GmbH; (2) Arriver Software AB and Linköping University, Sweden; (3) Qualcomm SARL, France; (4) Qualcomm Technologies, Inc.

| Note | Num. clusters | vMF | SK iters. | Depth weight | Linear (mIoU) | Mask Trans. (mIoU) |
| --- | --- | --- | --- | --- | --- | --- |
| CrIBo baseline | 32 | No | 5 | – | 20.50 | 41.67 |
| + Semantic distribution consistent clustering | 32 | Yes | 5 | – | 26.07 (+5.57) | 45.04 (+3.37) |
| + Object diversity consistent spatial clustering | 8 | Yes | 5 | – | 26.13 | 43.40 |
| | 16 | Yes | 5 | – | 26.18 | 43.87 |
| | 32 | Yes | 5 | – | 26.07 | 45.04 |
| | 64 | Yes | 5 | – | 24.86 | 45.85 |
| | 128 | Yes | 5 | – | 24.15 | 46.01 |
| | 32 | Yes | 1 | – | 25.24 | 45.47 |
| | 128 | Yes | 1 | – | 28.10 (+2.03) | 47.46 (+2.42) |
| + Depth-guided spatial clustering | 128 | Yes | 1 | 0.5 | 36.80 | 51.10 |
| | 128 | Yes | 1 | 1.0 | 37.65 | 51.51 (+4.05) |
| | 128 | Yes | 1 | 4.0 | 38.06 (+9.96) | 51.04 |
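
The table does not say what the gains in parentheses are measured against. The numbers are consistent with each gain being the difference, in mIoU points, from the configuration carried forward from the previous stage: the CrIBo baseline, then the 32-cluster semantic row, then the 128-cluster row with 1 SK iteration. The Python sketch below recomputes the gains under that assumption. The names `rows` and `gain` are illustrative only and do not come from the source.

```python
# Scores copied from the table as (linear mIoU, Mask Trans. mIoU).
# Which row each gain is measured against is an assumption inferred
# from the numbers, not something the table states.
rows = {
    "baseline": (20.50, 41.67),   # CrIBo baseline
    "semantic": (26.07, 45.04),   # 32 clusters, vMF, 5 SK iters
    "diversity": (28.10, 47.46),  # 128 clusters, vMF, 1 SK iter
    "depth_w1": (37.65, 51.51),   # depth weight 1.0
    "depth_w4": (38.06, 51.04),   # depth weight 4.0
}

LINEAR, MASK_TRANS = 0, 1


def gain(new, ref, metric):
    """Difference in mIoU points between two rows for one metric."""
    return round(rows[new][metric] - rows[ref][metric], 2)


# Gains as printed in the table, keyed by (row, assumed reference row, metric).
reported = {
    ("semantic", "baseline", LINEAR): 5.57,
    ("semantic", "baseline", MASK_TRANS): 3.37,
    ("diversity", "semantic", LINEAR): 2.03,
    ("diversity", "semantic", MASK_TRANS): 2.42,
    ("depth_w4", "diversity", LINEAR): 9.96,
    ("depth_w1", "diversity", MASK_TRANS): 4.05,
}

for (new, ref, metric), value in reported.items():
    name = "linear" if metric == LINEAR else "mask_trans"
    print(f"{new} vs {ref} ({name}): computed {gain(new, ref, metric)}, reported {value}")
```

The two depth-guided gains come from different rows: the best linear score is at depth weight 4.0, while the best Mask Trans. score is at depth weight 1.0.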