LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models
cs.CV
cs.AI
Release Date: December 3, 2024
Authors: Fan-Yun Sun¹, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat², Federico Tombari, Manling Li, Nick Haber, Jiajun Wu
Affiliations: ¹Stanford University; ²Google Research

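| Methods | Bedroom CF | Bedroom IB | Bedroom Pos. | Bedroom Rot. | Bedroom PSA | Living Room CF | Living Room IB | Living Room Pos. | Living Room Rot. | Living Room PSA | Dining Room CF | Dining Room IB | Dining Room Pos. | Dining Room Rot. | Dining Room PSA | Bookstore CF | Bookstore IB | Bookstore Pos. | Bookstore Rot. | Bookstore PSA | Buffet Restaurant CF | Buffet Restaurant IB | Buffet Restaurant Pos. | Buffet Restaurant Rot. | Buffet Restaurant PSA | Children Room CF | Children Room IB | Children Room Pos. | Children Room Rot. | Children Room PSA |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |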
| LayoutGPT | 100.0 | 66.7 | 85.7 | 85.9 | 52.2 | 44.4 | 11.1 | 74.7 | 64.4 | 9.6 | 88.9 | 22.2 | 76.0 | 68.9 | 14.8 | 88.9 | 55.6 | 80.9 | 79.4 | 35.9 | 100.0 | 33.3 | 81.2 | 83.3 | 26.9 | 100.0 | 0.0 | 80.9 | 82.6 | 0.0 |
| Holodeck | 88.9 | 22.2 | 69.3 | 67.9 | 14.1 | 77.8 | 0.0 | 66.3 | 55.6 | 0.0 | 88.9 | 0.0 | 38.0 | 36.6 | 0.0 | 55.6 | 0.0 | 65.7 | 59.0 | 0.0 | 77.8 | 11.1 | 47.7 | 42.4 | 7.4 | 77.8 | 22.2 | 72.7 | 70.0 | 18.7 |
| I-Design | 100.0 | 77.8 | 72.1 | 65.4 | 51.5 | 33.3 | 11.1 | 62.6 | 46.7 | 0.0 | 88.9 | 66.7 | 76.4 | 66.4 | 34.8 | 66.7 | 11.1 | 68.1 | 69.4 | 5.2 | 100.0 | 55.6 | 63.5 | 57.1 | 35.2 | 77.8 | 55.6 | 78.1 | 75.1 | 34.8 |
| LayoutVLM | 88.9 | 100.0 | 82.3 | 74.9 | 68.8 | 22.2 | 77.8 | 68.6 | 54.4 | 9.6 | 88.9 | 100.0 | 63.4 | 56.9 | 51.1 | 55.6 | 100.0 | 82.0 | 82.8 | 49.8 | 88.9 | 88.9 | 74.3 | 64.8 | 51.5 | 100.0 | 100.0 | 81.9 | 88.2 | 88.5 |
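
The table is wide, so a per-method summary can make it easier to read. The sketch below copies the four rows verbatim and takes an unweighted mean of each metric over the six scenes. This averaging is an assumption made here for illustration and is not a statistic reported in the source; the names `ROWS`, `scene_mean`, and `summary` are hypothetical.

```python
from statistics import mean

SCENES = ["Bedroom", "Living Room", "Dining Room",
          "Bookstore", "Buffet Restaurant", "Children Room"]
METRICS = ["CF", "IB", "Pos.", "Rot.", "PSA"]

# Values copied from the table: 5 metrics for each of the 6 scenes, in order.
ROWS = {
    "LayoutGPT": [100.0, 66.7, 85.7, 85.9, 52.2, 44.4, 11.1, 74.7, 64.4, 9.6,
                  88.9, 22.2, 76.0, 68.9, 14.8, 88.9, 55.6, 80.9, 79.4, 35.9,
                  100.0, 33.3, 81.2, 83.3, 26.9, 100.0, 0.0, 80.9, 82.6, 0.0],
    "Holodeck": [88.9, 22.2, 69.3, 67.9, 14.1, 77.8, 0.0, 66.3, 55.6, 0.0,
                 88.9, 0.0, 38.0, 36.6, 0.0, 55.6, 0.0, 65.7, 59.0, 0.0,
                 77.8, 11.1, 47.7, 42.4, 7.4, 77.8, 22.2, 72.7, 70.0, 18.7],
    "I-Design": [100.0, 77.8, 72.1, 65.4, 51.5, 33.3, 11.1, 62.6, 46.7, 0.0,
                 88.9, 66.7, 76.4, 66.4, 34.8, 66.7, 11.1, 68.1, 69.4, 5.2,
                 100.0, 55.6, 63.5, 57.1, 35.2, 77.8, 55.6, 78.1, 75.1, 34.8],
    "LayoutVLM": [88.9, 100.0, 82.3, 74.9, 68.8, 22.2, 77.8, 68.6, 54.4, 9.6,
                  88.9, 100.0, 63.4, 56.9, 51.1, 55.6, 100.0, 82.0, 82.8, 49.8,
                  88.9, 88.9, 74.3, 64.8, 51.5, 100.0, 100.0, 81.9, 88.2, 88.5],
}


def scene_mean(method: str, metric: str) -> float:
    """Unweighted mean of one metric over the six scenes (illustrative only)."""
    offset = METRICS.index(metric)
    # Every len(METRICS)-th value, starting at the metric's offset, is one scene.
    values = ROWS[method][offset::len(METRICS)]
    return round(mean(values), 1)


# Per-method, per-metric means across scenes.
summary = {m: {k: scene_mean(m, k) for k in METRICS} for m in ROWS}

for method, scores in summary.items():
    print(method, scores)
```

Because the mean weights every scene equally, it hides per-scene variation such as the Living Room column, so it is best read alongside the full table rather than in place of it.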