notesum.ai

Published on November 16, 2024

MTA: Multimodal Task Alignment for BEV Perception and Captioning

cs.CV
cs.AI
cs.CL

Release Date: November 16, 2024

Authors: Yunsheng Ma¹, Burhaneddin Yaman², Xin Ye², Feng Tao², Abhirup Mallik², Ziran Wang¹, Liu Ren²

Affiliations: ¹Purdue University; ²Bosch Research North America & Bosch Center for Artificial Intelligence (BCAI)

arXiv: http://arxiv.org/abs/2411.10639v1