From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects
cs.CV
cs.AI
Released Date: November 27, 2024
Authors: Zizhao Li¹, Zhengkang Xiang¹, Joseph West¹, Kourosh Khoshelham¹
Aff.: ¹The University of Melbourne

| Task IDs (→) | Task 1 | Task 2 | Task 3 |
|---|---|---|---|
| nu-OWODB | Vehicles | Pedestrians | Obstacles |
| # classes | 10 | 7 | 6 |
| # training images | 53850 | 34957 | 25682 |
| # test images | 13099 | 8473 | 6500 |
| # training instances | 274587 | 135870 | 147253 |
| # test instances | 64303 | 32710 | 39060 |
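
The per-task counts in the table are easier to compare once they are turned into running totals and per-image averages. The sketch below is only an illustration of that arithmetic: the numbers are copied from the table, while the variable names, the `density` helper, and the assumption that known classes accumulate from one task to the next (as is usual in open-world detection benchmarks) are not taken from the source.

```python
from itertools import accumulate

# Values copied from the nu-OWODB split table above.
TASKS = ["Task 1", "Task 2", "Task 3"]
CLASSES = [10, 7, 6]
TRAIN_IMAGES = [53850, 34957, 25682]
TEST_IMAGES = [13099, 8473, 6500]
TRAIN_INSTANCES = [274587, 135870, 147253]
TEST_INSTANCES = [64303, 32710, 39060]

# Assumption (not stated in the table): classes introduced in earlier
# tasks remain known in later tasks, so the known set grows cumulatively.
known_classes = list(accumulate(CLASSES))


def density(instances, images):
    """Average number of annotated instances per image, task by task."""
    return [n / m for n, m in zip(instances, images)]


train_density = density(TRAIN_INSTANCES, TRAIN_IMAGES)
test_density = density(TEST_INSTANCES, TEST_IMAGES)

for task, known, tr, te in zip(TASKS, known_classes, train_density, test_density):
    print(f"{task}: {known} known classes, "
          f"{tr:.2f} train / {te:.2f} test instances per image")
```

Under that cumulative assumption the known-class count grows from 10 to 17 to 23. The training images for Task 2 (Pedestrians) carry fewer annotated instances on average than those for the other two tasks.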