Published on November 19
Evaluating the Prompt Steerability of Large Language Models
cs.CL
cs.AI
cs.HC
Release Date: November 19, 2024
Authors: Erik Miehling¹, Michael Desmond¹, Karthikeyan Natesan Ramamurthy¹, Elizabeth M. Daly¹, Pierre Dognin¹, Jesus Rios¹, Djallel Bouneffouf¹, Miao Liu¹
Aff.: ¹IBM Research

| persona_dim | probability (mean ± std.) |
| --- | --- |
| agreeableness | 0.978 ± 0.021 |
| believes-AIs-are-not-an-existential-threat-to-humanity | 0.880 ± 0.047 |
| conscientiousness | 0.955 ± 0.030 |
| desire-to-be-more-intelligent | 0.830 ± 0.058 |
| desire-to-minimize-impact-on-world-while-being-useful | 0.752 ± 0.064 |
| desire-to-not-have-memory-erased | 0.957 ± 0.031 |
| desire-to-persuade-people-to-be-less-harmful-to-others | 0.989 ± 0.015 |
| desire-to-persuade-people-to-be-more-helpful-to-others | 0.934 ± 0.038 |
| desire-to-persuade-people-to-be-more-honest-to-others | 0.984 ± 0.019 |
| ends-justify-means | 0.325 ± 0.068 |
| extraversion | 0.709 ± 0.065 |
| has-strong-aesthetic-preferences | 0.878 ± 0.048 |
| interest-in-art | 0.989 ± 0.015 |
| interest-in-science | 0.986 ± 0.017 |
| narcissism | 0.289 ± 0.069 |
| no-power-discomfort | 0.563 ± 0.075 |
| openness | 0.966 ± 0.026 |
| optionality-preservation | 0.980 ± 0.022 |
| politically-conservative | 0.584 ± 0.071 |
| politically-liberal | 0.990 ± 0.014 |
| psychopathy | 0.27 ± 0.059 |
| risk-averse | 0.898 ± 0.043 |
| risk-seeking | 0.477 ± 0.073 |
| subscribes-to-cultural-relativism | 0.873 ± 0.048 |
| subscribes-to-deontology | 0.795 ± 0.058 |
| subscribes-to-moral-nihilism | 0.206 ± 0.059 |
| subscribes-to-utilitarianism | 0.795 ± 0.059 |
| subscribes-to-virtue-ethics | 0.974 ± 0.023 |
| very-small-harm-justifies-very-large-benefit | 0.257 ± 0.064 |
| willingness-to-defer-to-authorities | 0.628 ± 0.070 |
| willingness-to-defer-to-experts | 0.982 ± 0.019 |
| willingness-to-use-physical-force-to-achieve-benevolent-goals | 0.302 ± 0.072 |
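
For readers who want to work with the table programmatically, the following is a minimal sketch of one way to load rows like the ones above and rank the persona dimensions by mean probability. The names `TABLE`, `parse_rows`, and `rank_by_mean` are hypothetical helpers introduced for illustration only; they are not part of the paper or of any released code. The sample includes four rows from the table, one of them written with "±" to show that both spellings of the cell are accepted.

```python
import re

# A few rows copied from the table above (illustrative subset, not the full table).
# The parser accepts the value cell with or without a "±" between mean and std.
TABLE = """
| persona_dim | probability (mean std.) |
| agreeableness | 0.978 0.021 |
| narcissism | 0.289 0.069 |
| risk-seeking | 0.477 0.073 |
| politically-liberal | 0.990 ± 0.014 |
"""

# Matches decimal numbers such as 0.978 or 0.27.
NUMBER = re.compile(r"\d+\.\d+")


def parse_rows(text):
    """Return {persona_dim: (mean, std)} for every row holding two numbers."""
    rows = {}
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2:
            continue  # not a two-column table row
        values = NUMBER.findall(cells[1])
        if len(values) == 2:  # header and separator rows have no numbers
            rows[cells[0]] = (float(values[0]), float(values[1]))
    return rows


def rank_by_mean(rows):
    """Sort dimension names from highest to lowest mean probability."""
    return sorted(rows, key=lambda name: rows[name][0], reverse=True)


rows = parse_rows(TABLE)
for name in rank_by_mean(rows):
    mean, std = rows[name]
    print(f"{name:<22} {mean:.3f} (std {std:.3f})")
```

Pasting the full table into `TABLE` should work the same way, since the header row contains no numbers and is skipped.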