Published on November 11
Combining Domain and Alignment Vectors to Achieve Better Knowledge-Safety Trade-offs in LLMs
cs.AI
Release Date: November 11, 2024
Authors: Megh Thakkar¹, Yash More¹, Quentin Fournier², Matthew Riemer¹, Pin-Yu Chen³, Amal Zouaq¹, Payel Das³, Sarath Chandar²
Affiliations: ¹Mila - Quebec AI Institute; ²Chandar Research Lab; ³IBM Research

| Dataset | Pre-trained | Domain Expert | Aligned | ORPO | DPO | Slerp | MergeAlign |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PubMedQA | 59.8 | 68.9 | 63.5 | 68.2 | 62.2 | 71.4 | 66.4 |
| RCT | 73.6 | 73.5 | 70.05 | 72.95 | 74.2 | 73.75 | 70.7 |
| USMLE | 30.53 | 37.94 | 39.35 | 37.07 | 37.78 | 39.91 | 37.62 |
| ChemProt | 28 | 47.2 | 43.2 | 44.8 | 49.8 | 40.2 | 48 |
| MQP | 66.06 | 79.34 | 74.268 | 75.73 | 73.27 | 85.74 | 83.93 |
| Avg | 51.59 | 61.37 | 58.07 | 59.75 | 59.45 | 62.2 | 61.33 |
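
The Avg row appears to be the unweighted mean of the five dataset rows for each column. The following is a minimal sketch that checks this, assuming the scores are copied exactly from the table; the names `scores`, `reported_avg`, and `column_means` are illustrative and do not come from the paper.

```python
# Scores copied from the table above; each list follows the header's column order.
methods = ["Pre-trained", "Domain Expert", "Aligned", "ORPO", "DPO", "Slerp", "MergeAlign"]
scores = {
    "PubMedQA": [59.8, 68.9, 63.5, 68.2, 62.2, 71.4, 66.4],
    "RCT": [73.6, 73.5, 70.05, 72.95, 74.2, 73.75, 70.7],
    "USMLE": [30.53, 37.94, 39.35, 37.07, 37.78, 39.91, 37.62],
    "ChemProt": [28, 47.2, 43.2, 44.8, 49.8, 40.2, 48],
    "MQP": [66.06, 79.34, 74.268, 75.73, 73.27, 85.74, 83.93],
}
# Values from the Avg row of the table.
reported_avg = [51.59, 61.37, 58.07, 59.75, 59.45, 62.2, 61.33]


def column_means(table):
    """Return the unweighted mean of each column across all dataset rows."""
    rows = list(table.values())
    return [sum(col) / len(col) for col in zip(*rows)]


means = column_means(scores)
for name, mean, reported in zip(methods, means, reported_avg):
    print(f"{name:>13}: computed {mean:.3f}, reported {reported}")
```

Under this assumption, every computed mean agrees with the reported Avg value to within 0.01. The small gaps that remain (for example, in the Pre-trained column) suggest the reported figures were truncated rather than rounded.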