notesum.ai

Published on December 3

T-REG: Preference Optimization with Token-Level Reward Regularization

cs.CL
cs.AI
cs.LG

Release Date: December 3, 2024

Authors: Wenxuan Zhou¹, Shujian Zhang, Lingxiao Zhao, Tao Meng

Aff.: ¹Zoom Communications

arXiv: http://arxiv.org/pdf/2412.02685v1