Published on October 22
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap
Release Date: October 22, 2024
Authors: Guanrou Yang¹, Fan Yu², Ziyang Ma¹, Zhihao Du², Zhifu Gao², Shiliang Zhang², Xie Chen¹
Affiliations: ¹MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, China; ²Institute for Intelligent Computing, Alibaba Group, China

| Data Type | Language | Real (h) | TTS (h) | WER/CER (%) | Rel. Gain (%) |
|---|---|---|---|---|---|
| Accented English | English | 300 | 0 | 12.61 | |
| Accented English | English | 300 | 300 | 10.69 | 15.23 |
| Minority Language | Korean | 3700 | 0 | 8.48 | |
| Minority Language | Korean | 3700 | 4000 | 6.31 | 25.59 |
| Chinese Dialects | Min Nan | 200 | 0 | 55.42 | |
| Chinese Dialects | Min Nan | 200 | 200 | 30.00 | 45.85 |
| Chinese Dialects | Wu | 200 | 0 | 27.47 | |
| Chinese Dialects | Wu | 200 | 200 | 19.93 | 27.45 |
| Hotwords | Chinese | 300 | 0 | 21.32 | |
| Hotwords | Chinese | 300 | 300 | 13.15 | 38.33 |
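
The relative gain column appears consistent with a relative error-rate reduction, that is, the drop in WER/CER from the real-data-only baseline to the real-plus-TTS system, divided by the baseline. The source does not state this formula, so the sketch below treats it as an assumption; the function name `relative_gain` and the `rows` list are illustrative, and the error rates are copied from the table.

```python
def relative_gain(baseline_err: float, augmented_err: float) -> float:
    """Relative error-rate reduction in percent.

    Assumed definition (not stated in the source):
        (baseline - augmented) / baseline * 100
    """
    if baseline_err <= 0:
        raise ValueError("baseline error rate must be positive")
    return (baseline_err - augmented_err) / baseline_err * 100.0


# (setting, WER/CER with real data only, WER/CER with real + TTS data),
# values taken from the table above.
rows = [
    ("Accented English", 12.61, 10.69),
    ("Korean", 8.48, 6.31),
    ("Min Nan", 55.42, 30.00),
    ("Wu", 27.47, 19.93),
    ("Hotwords (Chinese)", 21.32, 13.15),
]

for setting, baseline, augmented in rows:
    print(f"{setting}: {relative_gain(baseline, augmented):.2f}% relative gain")
```

Under this assumed definition, the recomputed values land within a few hundredths of a percentage point of the reported column; the small differences are presumably due to rounding of the published error rates.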