notesum.ai

Published on November 25

Preference Optimization for Reasoning with Pseudo Feedback

cs.CL

Released Date: November 25, 2024

Authors: Fangkai Jiao¹, Geyang Guo², Xingxing Zhang³, Nancy F. Chen¹, Shafiq Joty⁴, Furu Wei³

Affiliations: ¹Nanyang Technological University and I2R, A*STAR; ²Georgia Institute of Technology; ³Microsoft Research; ⁴Salesforce Research

arXiv: http://arxiv.org/abs/2411.16345v1