Published on November 7

Vision Language Models are In-Context Value Learners

cs.RO
cs.AI
cs.LG

Release Date: November 7, 2024

Authors: Yecheng Jason Ma¹, Joey Hejna², Ayzaan Wahid², Chuyuan Fu², Dhruv Shah², Jacky Liang², Zhuo Xu², Sean Kirmani², Peng Xu², Danny Driess², Ted Xiao², Jonathan Tompson², Osbert Bastani¹, Dinesh Jayaraman¹, Wenhao Yu², Tingnan Zhang², Dorsa Sadigh³, Fei Xia²

Affiliations: ¹University of Pennsylvania; ²Google DeepMind; ³Stanford University

arXiv: http://arxiv.org/abs/2411.04549v1