notesum.ai

Published on November 26

How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations

cs.CL

Release Date: November 26, 2024

Authors: Hyunji Lee¹, Danni Liu¹, Supriti Sinhamahapatra¹, Jan Niehues¹

Aff.: ¹Karlsruhe Institute of Technology, Germany

arXiv: http://arxiv.org/abs/2411.17666v1
