
Published on November 26

How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations

cs.CL

Release Date: November 26, 2024

Authors: Hyunji Lee¹, Danni Liu¹, Supriti Sinhamahapatra¹, Jan Niehues¹

Affiliation: ¹Karlsruhe Institute of Technology, Germany

arXiv: http://arxiv.org/abs/2411.17666v1