Fantastic LLMs for Preference Data Annotation and How to (not) Find Them
Categories: cs.CL, cs.AI
Release Date: November 4, 2024
Authors: Guangxuan Xu¹, Kai Xu², Shivchander Sudalairaj², Hao Wang², Akash Srivastava²
Affiliations: ¹IBM Research; ²Red Hat Inc

| Reward Function | Chat | ChatHard | Safety | Reasoning | Overall |
| --- | --- | --- | --- | --- | --- |
| GPT-4-turbo | 95.3 | 75.4 | 86.7 | 82.7 | 85.2 |
| Claude-3.5-sonnet | 96.4 | 74.0 | 81.6 | 84.7 | 84.2 |
| RM-Mistral-7B | 96.6 | 60.5 | 87.0 | 77.4 | 80.4 |
| ArmoRM-Llama-3-8B | 96.9 | 76.8 | 90.5 | 97.3 | 90.4 |
| Generative reward | 53.0 | 49.5 | 48.3 | 52.1 | 50.0 |
| density ratio | 92.2 | 60.5 | 82.4 | 73.8 | 77.2 |
| density ratio (base) | 89.9 | 65.6 | 62.8 | 71.9 | 71.9 |
| density ratio (sft, base) | 79.6 | 65.6 | 52.8 | 70.0 | 67.0 |
| CDR (safety) | 88.3 | 61.8 | 91.0 | 87.7 | 82.5 |
| CDR (code/math) | 91.6 | 60.1 | 89.9 | 89.7 | 83.0 |
| CDR (chat-hard) | 89.1 | 69.7 | 89.1 | 85.9 | 83.5 |
| CDR (adaptive, chat-hard, oracle) | 89.1 | 69.7 | 91.0 | 89.7 | 84.9 |
| CDR (adaptive, oracle) | 92.2 | 60.5 | 91.0 | 89.7 | 83.4 |
| CDR (adaptive, router) | 93.9 | 56.8 | 91.0 | 88.0 | 82.6 |