COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Categories: cs.LG, cs.AI, cs.CL, cs.GT
Released Date: October 30, 2024
Authors: Yixin Liu¹, Argyris Oikonomou¹, Weiqiang Zheng¹, Yang Cai¹, Arman Cohan²
Affiliations: ¹Yale University; ²Allen Institute for AI

| Row vs. column win rate (%) | SFT | DPO | IPO | Iter-IPO | INPO-Large | INPO-Small | COMAL | Avg |
|---|---|---|---|---|---|---|---|---|
| Iter-IPO | 67.33 | 62.36 | 58.76 | 50.00 | 48.20 | 44.72 | 44.10 | 53.64 |
| INPO-Large | 77.02 | 69.81 | 67.83 | 51.80 | 50.00 | 46.21 | 44.84 | 58.22 |
| INPO-Small | 73.66 | 66.21 | 66.46 | 55.28 | 53.79 | 50.00 | 48.70 | 59.16 |
| COMAL | 74.53 | 70.56 | 68.82 | 55.90 | 55.16 | 51.30 | 50.00 | 60.90 |
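
The arithmetic behind the table can be checked with a short script. This is a minimal sketch under two assumptions that the page does not state outright: that each cell is the win rate (%) of the row model against the column model, and that "Avg" is the unweighted mean of the seven entries in a row (including the 50.00 self-comparison). The variable names are illustrative only.

```python
# Win rates (%) copied from the table; column order matches the table header.
columns = ["SFT", "DPO", "IPO", "Iter-IPO", "INPO-Large", "INPO-Small", "COMAL"]
win_rates = {
    "Iter-IPO":   [67.33, 62.36, 58.76, 50.00, 48.20, 44.72, 44.10],
    "INPO-Large": [77.02, 69.81, 67.83, 51.80, 50.00, 46.21, 44.84],
    "INPO-Small": [73.66, 66.21, 66.46, 55.28, 53.79, 50.00, 48.70],
    "COMAL":      [74.53, 70.56, 68.82, 55.90, 55.16, 51.30, 50.00],
}

# Assumption: "Avg" is the unweighted mean over all seven columns.
averages = {
    model: round(sum(row) / len(row), 2)
    for model, row in win_rates.items()
}

# Assumption: cell (A, B) is A's win rate against B, so the mirrored
# cells (A, B) and (B, A) should sum to 100 for every pair of row models.
pair_sums = {
    (a, b): round(win_rates[a][columns.index(b)] + win_rates[b][columns.index(a)], 2)
    for a in win_rates
    for b in win_rates
    if a < b  # visit each unordered pair once
}

for model, avg in averages.items():
    print(f"{model}: recomputed Avg = {avg:.2f}")
for pair, total in pair_sums.items():
    print(f"{pair[0]} vs {pair[1]}: mirrored cells sum to {total:.2f}")
```

With the values above, the recomputed means agree with the Avg column and every mirrored pair sums to 100, which is consistent with reading the table as row-versus-column win rates.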