notesum.ai

Published on December 4

DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles

Categories: cs.SD, cs.AI, cs.CL, eess.AS

Release Date: December 4, 2024

Authors: Jiaxuan Liu¹, Zhaoci Liu¹, Yajun Hu², Yingying Gao³, Shilei Zhang³, Zhenhua Ling¹

Affiliations: ¹University of Science and Technology of China; ²iFLYTEK Co., Ltd.; ³China Mobile Research Institute

arXiv: http://arxiv.org/pdf/2412.03388v1