Direct Preference Optimization Using Sparse Feature-Level Constraints
Published on November 12
Categories: cs.AI, cs.CL
Release Date: November 12, 2024
Authors: Qingyu Yin¹, Chak Tou Leong², Hongbo Zhang¹, Minjun Zhu, Hanqi Yan³, Qiang Zhang⁴, Yulan He³, Wenjie Li², Jun Wang⁵, Yue Zhang¹, Linyi Yang¹
Affiliations: ¹Westlake University; ²The Hong Kong Polytechnic University; ³King's College London; ⁴Zhejiang University; ⁵University College London

| Method | LPD | Margin | Constraint | Constraint Type |
|---|---|---|---|---|
| DPO | | | 0 | - |
| SimPO | | (a constant) | | - |
| TDPO_i | | | | KL Divergence |
| FPO | | | | MSE |
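
To make the LPD and Margin columns concrete, here is a minimal Python sketch of the two rows that carry no extra constraint, DPO and SimPO, written from their commonly published loss definitions rather than from this table. It assumes that LPD means the policy's (scaled) log-probability difference between the chosen and rejected responses, and that Margin is the term subtracted from it inside the sigmoid; both readings are assumptions, as are all function and parameter names. The TDPO and FPO constraint terms are not sketched.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(policy_logp_w: float, policy_logp_l: float,
             ref_logp_w: float, ref_logp_l: float, beta: float) -> float:
    """DPO loss for one preference pair (w = chosen, l = rejected).

    Assumed reading of the table: the margin is the reference model's
    own log-probability difference, scaled by beta.
    """
    lpd = beta * (policy_logp_w - policy_logp_l)
    margin = beta * (ref_logp_w - ref_logp_l)
    return -log_sigmoid(lpd - margin)


def simpo_loss(policy_logp_w: float, policy_logp_l: float,
               len_w: int, len_l: int, beta: float, gamma: float) -> float:
    """SimPO loss for one preference pair.

    Log-probabilities are length-normalized and the margin is the
    constant gamma; no reference model is needed.
    """
    lpd = beta * (policy_logp_w / len_w - policy_logp_l / len_l)
    margin = gamma
    return -log_sigmoid(lpd - margin)
```

In this sketch, a policy identical to the reference gives a DPO loss of log 2, and a positive SimPO margin raises the loss for a pair whose length-normalized log-probabilities are equal.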