notesum.ai

Published on October 22

Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models

cs.CL
cs.AI

Release Date: October 22, 2024

Authors: Muhan Lin¹, Shuyang Shi¹, Yue Guo¹, Behdad Chalaki², Vaishnav Tadiparthi², Ehsan Moradi Pari², Simon Stepputtis¹, Joseph Campbell¹, Katia Sycara¹

Affiliations: ¹Carnegie Mellon University; ²Honda Research Institute USA

arXiv: https://arxiv.org/abs/2410.17389v1