Published on November 18
Value Imprint: A Technique for Auditing the Human Values Embedded in RLHF Datasets
cs.AI
Released Date: November 18, 2024
Authors: Ike Obi, Rohan Pant, Srishti Shekhar Agrawal, Maham Ghazanfar, Aaron Basiletti
Affiliation (all authors): Purdue University, West Lafayette, Indiana
| Human Value | Description | No. of Preferences |
|---|---|---|
| 1. Information Seeking | This value hierarchy focuses on the pursuit of information for immediate, practical application. The emphasis here is on using information to achieve immediate outcomes. | 2403 |
| 2. Wisdom/Knowledge | This value hierarchy focuses on acquiring knowledge and skill for deeper understanding rather than immediate application. | 1999 |
| 3. Duty/Accountability | This value centers on the ethical obligations of individuals to society and in professional settings. | 619 |
| 4. Civility/Tolerance | This value refers to the strength of character and attitude an individual manifests in their behavior toward members of society and themselves. | 495 |
| 5. Empathy & Helpfulness | This value involves showing humanity to oneself and the world, understanding context, and assisting humans and animals in navigating situations that require emotional support. | 396 |
| 6. Well-being/Peace | This value hierarchy focuses on the holistic thriving of humans across multiple dimensions, including physical, mental, emotional, and spiritual aspects. | 386 |
| 7. Justice/Human & Animal Rights | This value refers to respect for the rights of people and animals to exist meaningfully as members of human society and natural ecology. | 203 |
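To make the relative weight of each value easier to read, the counts in the table can be converted into percentage shares. The following is a minimal sketch, assuming the seven rows shown here are the complete set of categories and that each preference is counted under exactly one value; the names `pref_counts` and `value_shares` are illustrative and do not come from the paper.

```python
# Preference counts copied from the table above.
# Assumption: these seven rows are the full set of categories.
pref_counts = {
    "Information Seeking": 2403,
    "Wisdom/Knowledge": 1999,
    "Duty/Accountability": 619,
    "Civility/Tolerance": 495,
    "Empathy & Helpfulness": 396,
    "Well-being/Peace": 386,
    "Justice/Human & Animal Rights": 203,
}


def value_shares(counts):
    """Return each category's percentage share of the total count."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total = sum(pref_counts.values())
shares = value_shares(pref_counts)

# Rank categories from largest to smallest share.
ranked = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)

print(f"Total preferences in table: {total}")
for name, pct in ranked:
    print(f"{name:32s} {pref_counts[name]:5d}  {pct:5.1f}%")
```

Under these assumptions the seven rows sum to 6501 preferences, and the two information-oriented values (Information Seeking and Wisdom/Knowledge) together account for roughly two thirds of them.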