notesum.ai

Published on October 18

Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech

cs.CV
cs.AI
cs.CL
cs.IR

Release Date: October 18, 2024

Authors: Shuwei He (1), Rui Liu (1), Haizhou Li (2)

Affiliations: (1) Inner Mongolia University, Hohhot, China; (2) The Chinese University of Hong Kong, Shenzhen, China

arXiv: https://arxiv.org/abs/2410.14101v1