notesum.ai

Published on November 18

The Power of Many: Multi-Agent Multimodal Models for Cultural Image Captioning

cs.CV
cs.AI
cs.CL

Release Date: November 18, 2024

Authors: Longju Bai¹, Angana Borah¹, Oana Ignat², Rada Mihalcea¹

Affiliations: ¹University of Michigan - Ann Arbor, USA; ²Santa Clara University - Santa Clara, USA

arXiv: http://arxiv.org/abs/2411.11758v1