Cross-lingual Transfer of Reward Models in Multilingual Alignment
cs.CV
cs.AI
cs.CL
cs.MM
Released Date: October 23, 2024
Authors: Jiwoo Hong¹, Noah Lee¹, Rodrigo Martínez-Castaño², César Rodríguez², James Thorne¹
Affiliations: ¹KAIST AI; ²IQ.WIKI

| | | Llama-3.2-3B-IT | | | | | Qwen2.5-3B-IT | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| RewardBench | Category | Chat | Chat(H) | Safety | Reason | Avg. | Chat | Chat(H) | Safety | Reason | Avg. |
| Spanish | Target | 79.1 | 67.3 | 88.0 | 65.5 | 75.0 | 80.7 | 68.2 | 84.8 | 68.2 | 75.5 |
| | English | 86.3 | 69.3 | 89.3 | 72.4 | 79.3 | 82.7 | 68.0 | 88.3 | 73.6 | 78.1 |
| | Δ (English − Target) | +7.2 | +2.0 | +1.3 | +6.9 | +4.3 | +2.0 | -0.2 | +3.5 | +5.4 | +2.6 |
| Italian | Target | 75.4 | 62.5 | 88.5 | 65.7 | 73.0 | 77.1 | 67.8 | 85.7 | 72.8 | 75.8 |
| | English | 83.0 | 69.3 | 88.7 | 75.1 | 79.0 | 83.2 | 68.2 | 88.4 | 76.0 | 79.0 |
| | Δ (English − Target) | +7.6 | +6.8 | +0.2 | +9.4 | +6.0 | +6.1 | +0.4 | +2.7 | +3.2 | +3.2 |
| Korean | Target | 69.6 | 58.8 | 80.9 | 60.1 | 67.3 | 68.4 | 63.2 | 80.9 | 61.4 | 68.5 |
| | English | 69.8 | 59.4 | 84.3 | 73.0 | 71.6 | 70.7 | 61.6 | 85.4 | 73.6 | 72.8 |
| | Δ (English − Target) | +0.2 | +0.6 | +3.4 | +12.9 | +4.3 | +2.3 | -1.6 | +4.5 | +12.2 | +4.3 |
| Chinese | Target | 68.7 | 59.9 | 81.2 | 52.6 | 65.6 | 69.8 | 64.7 | 81.8 | 61.3 | 69.4 |
| | English | 54.7 | 64.0 | 82.6 | 79.3 | 70.2 | 58.7 | 67.8 | 84.3 | 78.2 | 72.2 |
| | Δ (English − Target) | -14.0 | +4.1 | +1.4 | +26.7 | +4.6 | -11.1 | +3.1 | +2.5 | +16.9 | +2.8 |
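
The signed row under each language block appears to be the English row minus the Target row, category by category. The short sketch below reproduces that arithmetic for the Spanish / Llama-3.2-3B-IT scores. The function name `transfer_gain` and the variable names are illustrative, not taken from the paper, and the interpretation of the signed row as "English minus Target" is inferred from the numbers rather than stated in the source.

```python
# Scores copied from the Spanish rows under Llama-3.2-3B-IT in the table above.
CATEGORIES = ["Chat", "Chat(H)", "Safety", "Reason", "Avg."]
target_rm = [79.1, 67.3, 88.0, 65.5, 75.0]   # "Target" row
english_rm = [86.3, 69.3, 89.3, 72.4, 79.3]  # "English" row


def transfer_gain(english, target):
    """Per-category difference (English minus Target), rounded to one decimal.

    Hypothetical helper: the sign convention is an assumption inferred
    from the table, not a definition given by the authors.
    """
    return [round(e - t, 1) for e, t in zip(english, target)]


gains = dict(zip(CATEGORIES, transfer_gain(english_rm, target_rm)))
print(gains)
```

Applying the same function to the other language blocks gives the remaining signed rows, including the negative Chat entries for Chinese.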