notesum.ai
Published on November 7
Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives
cs.AI
Release Date: November 7, 2024
Authors: Hao Sun (1), Yunyi Shen (2), Jean-Francois Ton (3)
Affiliations: (1) University of Cambridge, Cambridge, UK; (2) Massachusetts Institute of Technology, Cambridge, MA, USA; (3) ByteDance Research, London, UK
