notesum.ai

Published on December 4

Weighted-Reward Preference Optimization for Implicit Model Fusion

cs.CL

Release Date: December 4, 2024

Authors: Ziyi Yang¹, Fanqi Wan, Longguang Zhong, Tianyuan Shi, Xiaojun Quan

Aff.: ¹School of Computer Science and Engineering, Sun Yat-sen University, China

arXiv: http://arxiv.org/pdf/2412.03187v1