notesum.ai

Published on November 18

Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets

cs.AI

Release Date: November 18, 2024

Authors: Ike Obi¹, Rohan Pant¹, Srishti Shekhar Agrawal¹, Maham Ghazanfar¹, Aaron Basiletti¹

Affiliation: ¹Purdue University, West Lafayette, Indiana

arXiv: http://arxiv.org/abs/2411.11937v1