CTC-Assisted LLM-Based Contextual ASR
eess.AS
cs.AI
cs.CL
Release Date: November 10, 2024
Authors: Guanrou Yang¹, Ziyang Ma¹, Zhifu Gao², Shiliang Zhang¹, Xie Chen¹
Affiliations: ¹MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, China; ²Alibaba Group, China

| Encoder | Prompt | Biasing list size | test-clean WER (%) | test-clean U-WER (%) | test-clean B-WER (%) | test-other WER (%) | test-other U-WER (%) | test-other B-WER (%) |
|---|---|---|---|---|---|---|---|---|
| Pre-trained WavLM | ✗ | ✗ | 2.13 | 1.20 | 10.15 | 4.73 | 2.84 | 22.43 |
| CTC Fine-tuned WavLM | ✗ | ✗ | 2.11 | 1.20 | 10.02 | 4.20 | 2.43 | 20.76 |
| CTC Fine-tuned WavLM | No bias | ✗ | 1.96 | 1.11 | 9.33 | 4.18 | 2.49 | 20.02 |
| CTC Fine-tuned WavLM | Bias List | 100 | 1.27 | 1.00 | 3.67 | 2.72 | 2.16 | 8.02 |
| CTC Fine-tuned WavLM | Bias List | 500 | 1.33 | 1.03 | 3.92 | 3.04 | 2.40 | 9.04 |
| CTC Fine-tuned WavLM | Bias List | 1000 | 1.33 | 1.00 | 4.16 | 2.99 | 2.31 | 9.33 |
| CTC Fine-tuned WavLM | Bias List | 2000 | 1.38 | 1.03 | 4.41 | 3.20 | 2.47 | 10.02 |
| CTC Fine-tuned WavLM | GT Hotwords | ✗ | 1.13 | 0.94 | 2.78 | 2.68 | 2.32 | 6.00 |
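
To make the gaps in the table easier to compare, the relative B-WER reduction can be computed directly from the reported numbers. The snippet below is a minimal sketch, not part of the paper: the helper name `relative_reduction` and the choice of the prompt-free CTC fine-tuned WavLM row as the baseline are assumptions made here for illustration. Only the B-WER values themselves come from the table.

```python
def relative_reduction(baseline: float, system: float) -> float:
    """Relative error-rate reduction of `system` over `baseline`, in percent.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - system) / baseline


# B-WER (%) values copied from the table above.
# Assumed baseline: CTC fine-tuned WavLM with no prompt.
# Compared system: CTC fine-tuned WavLM with a bias-list prompt of size 100.
B_WER = {
    "test-clean": {"no_prompt": 10.02, "bias_list_100": 3.67},
    "test-other": {"no_prompt": 20.76, "bias_list_100": 8.02},
}

reductions = {
    split: round(relative_reduction(v["no_prompt"], v["bias_list_100"]), 1)
    for split, v in B_WER.items()
}

for split, value in reductions.items():
    print(f"{split}: {value}% relative B-WER reduction")
```

Under these assumptions the relative B-WER reduction is roughly 63% on test-clean and 61% on test-other. Swapping in another row as the baseline (for example, the "No bias" prompt row) only requires changing the dictionary entries.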