notesum.ai
Published at November 7
Dialectal Coverage And Generalization in Arabic Speech Recognition
cs.CL
cs.SD
eess.AS
Released Date: November 7, 2024
Authors: Amirbek Djanibekov¹, Hawau Olamide Toyin¹, Raghad Alshalan², Abdullah Alitr², Hanan Aldarmaki¹
Aff.: ¹Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE; ²STC, Riyadh, Saudi Arabia
| Dataset | Zero-Shot: Whisper | Zero-Shot: MMS | Zero-Shot: v1 | Zero-Shot: v2 | Fine-Tuning: v1 | Fine-Tuning: v2 |
| --- | --- | --- | --- | --- | --- | --- |
| Mixat | 56.01 | 61.51 | 39.1 | 35.05 | 34.29 | 33.40 |
| TARIC-SLU | 138.14 | 93.54 | 107.56 | 106.46 | 14.70 | 14.80 |
| ParallelCorp | 99.17 | 83.16 | 128.72 | 141.92 | 9.57 | 9.31 |
| SADA | 82.16 | 78.28 | 39.41 | 29.77 | 39.24 | 29.91 |
| MASC (Al-Fetyani et al., 2021) | | | | | | |
| SAU | 48.39 | 65.30 | 61.23 | 58.72 | 27.40 | 27.33 |
| SYR | 26.65 | 33.21 | 21.99 | 18.37 | 18.64 | 17.42 |
| EGY | 41.73 | 66.04 | 50.87 | 47.17 | 38.47 | 36.43 |
| JOR | 28.65 | 54.63 | 61.23 | 34.97 | 19.72 | 21.08 |
| LEB | 40.95 | 64.58 | 35.65 | 42.66 | 30.01 | 28.05 |
| IRA | 41.69 | 59.33 | 50.46 | 48.03 | 31.10 | 34.64 |
| TUN | 47.45 | 60.58 | 50.37 | 46.67 | 19.26 | 18.52 |
| MOR | 65.87 | 80.84 | 78.92 | 66.87 | 47.59 | 49.40 |
| PAL | 53.20 | 83.72 | 77.94 | 73.53 | 55.88 | 53.53 |
| KUW | 36.00 | 81.71 | 64.74 | 52.02 | 50.29 | 46.24 |