Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation?
quant-ph
cs.CC
Released Date: October 22, 2024
Authors: Jirat Chiaranaipanich¹, Naiyarat Hanmatheekuna², Jitkapat Sawatphol³, Krittamate Tiankanon⁴, Jiramet Kinchagawat⁴, Amrest Chinkamol³, Parinthapat Pengpun⁵, Piyalitt Ittichaiwong⁴, Peerat Limkonchotiwat³
Affiliations: ¹Ruamrudee International School; ²Chulalongkorn University; ³Vidyasirimedhi Institute of Science and Technology; ⁴PreceptorAI team, CARIVA Thailand; ⁵Bangkok Christian International School

| Dataset | Model Variant | BLEU3 | METEOR | CER | CS-F1 | Memory (GB) |
|---|---|---|---|---|---|---|
| CS | Llama-3-8b | 0.421 | 0.615 | 6.606 | 0.330 | 31.48 |
| CS | Llama-3-8b-8bit | 0.421 | 0.616 | 6.622 | 0.332 | 9.87 |
| CS | Llama-3-8b-4bit | 0.392 | 0.591 | 6.833 | 0.320 | 7.13 |
| CS | Llama-3-8b-3bit | 0.214 | 0.410 | 8.437 | 0.280 | 5.42 |
| CS | Llama-3-8b-2bit | 0.001 | 0.013 | 4.565 | 0.002 | 4.48 |
| CS | NLLB-3.3b | 0.443 | 0.600 | 0.419 | 0.398 | 13.42 |
| CS | NLLB-0.6b | 0.410 | 0.576 | 0.438 | 0.394 | 2.91 |
| SCB | Llama-3-8b | 0.173 | 0.371 | 30.416 | - | 31.48 |
| SCB | Llama-3-8b-8bit | 0.173 | 0.371 | 30.952 | - | 9.46 |
| SCB | Llama-3-8b-4bit | 0.156 | 0.349 | 30.576 | - | 7.14 |
| SCB | Llama-3-8b-3bit | 0.079 | 0.231 | 31.088 | - | 5.35 |
| SCB | Llama-3-8b-2bit | 0.000 | 0.003 | 19.232 | - | 4.44 |
| SCB | NLLB-3.3b | 0.244 | 0.449 | 0.585 | - | 13.31 |
| SCB | NLLB-0.6b | 0.238 | 0.437 | 0.574 | - | 2.91 |
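
To make the quantization trade-off in the table easier to read, the following minimal sketch computes, for each Llama-3-8b variant on the CS dataset, the share of the full-precision score it retains and the share of memory it uses. The numbers are copied from the table, and the column label "BLEU3" is kept as it appears there. The names `cs_results` and `relative_to_baseline` are hypothetical helpers introduced only for this illustration and are not from the paper.

```python
# Values copied from the CS rows of the table above: (BLEU3, memory in GB).
cs_results = {
    "Llama-3-8b": (0.421, 31.48),
    "Llama-3-8b-8bit": (0.421, 9.87),
    "Llama-3-8b-4bit": (0.392, 7.13),
    "Llama-3-8b-3bit": (0.214, 5.42),
    "Llama-3-8b-2bit": (0.001, 4.48),
}


def relative_to_baseline(results, baseline="Llama-3-8b"):
    """Return {variant: (score_retained, memory_fraction)} relative to the baseline row.

    Hypothetical helper for illustration; both values are plain ratios
    (1.0 means identical to the baseline).
    """
    base_score, base_mem = results[baseline]
    return {
        name: (score / base_score, mem / base_mem)
        for name, (score, mem) in results.items()
    }


summary = relative_to_baseline(cs_results)
for name, (score_kept, mem_frac) in summary.items():
    print(f"{name:<16} BLEU3 retained: {score_kept:6.1%}  memory: {mem_frac:6.1%} of baseline")
```

Read this way, the table shows the 8-bit variant matching the full-precision BLEU3 score at roughly a third of the memory, while the score collapses at 2 bits. The same helper could be applied to the SCB rows by swapping in their values.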