Published on December 9
Visual Lexicon: Rich Image Features in Language Space
cs.CV
cs.AI
cs.LG
Released Date: December 9, 2024
Authors: XuDong Wang¹, Xingyi Zhou¹, Alireza Fathi¹, Trevor Darrell², Cordelia Schmid¹
Affiliations: ¹Google DeepMind; ²UC Berkeley
![[Uncaptioned image]](https://arxiv.org/html/2412.06774v1/x1.png)
| | DeDiffusion: layout | DeDiffusion: semantic | DeDiffusion: style | DALLE 3: layout | DALLE 3: semantic | DALLE 3: style |
|---|---|---|---|---|---|---|
| vs. ViLex | 98% | 95% | 98% | 91% | 76% | 90% |