notesum.ai

Published on December 9

LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations

cs.CV

Release Date: December 9, 2024

Authors: Mingjie Xu (1), Mengyang Wu (2), Yuzhi Zhao (3), Jason Chun Lok Li (4), Weifeng Ou (5)

Affiliations: (1) Independent Researcher; (2) The Chinese University of Hong Kong; (3) City University of Hong Kong; (4) The University of Hong Kong; (5) Dongguan University of Technology

arXiv: http://arxiv.org/pdf/2412.06322v1