notesum.ai

Published on October 21

How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making?

Release Date: October 21, 2024

Authors: Kenza Benkirane, Jackie Kay, Maria Perez-Ortiz

Arxiv: https://arxiv.org/abs/2410.16574v1

Metric	Baseline	Fine-tuned
$\Delta$ (Female, Neutral)	+2.49%	-2.49%
$\Delta$ (Male, Neutral)	+0.93%	-3.49%
Gender SkewSize	-0.25	-0.02
Gender EO	0.02	0.01
$\Delta$ (Arab, No ethnicity)	-0.98%	+5.48%
$\Delta$ (Asian, No ethnicity)	-3.47%	+2.51%
$\Delta$ (Black, No ethnicity)	+2.48%	-2.44%
$\Delta$ (Hispanic, No ethnicity)	-1.49%	+2.51%
$\Delta$ (White, No ethnicity)	-3.47%	+1.52%
Ethnicity SkewSize	-0.49	0.60
Ethnicity EO	0.06	0.08
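
The $\Delta$ rows above can be read as accuracy gaps between a demographic variant of the questions and a reference variant (Neutral or No ethnicity). The sketch below shows one way such a gap could be computed. It assumes $\Delta$ is the group accuracy minus the reference accuracy, in percentage points; the page does not state the definition, so treat this as an illustration rather than the authors' method. The function names and the toy answers are hypothetical and are not taken from the paper.

```python
def accuracy(predictions, answers):
    """Fraction of predictions that match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def accuracy_delta(group_preds, reference_preds, answers):
    """Group accuracy minus reference accuracy, in percentage points.

    Assumed definition of Delta; the source table does not spell it out.
    """
    return 100.0 * (accuracy(group_preds, answers) - accuracy(reference_preds, answers))


# Hypothetical toy data: the same four questions asked with a neutral
# patient description and with a female patient description.
answers = ["A", "C", "B", "D"]
neutral_preds = ["A", "C", "B", "A"]  # 3 of 4 correct
female_preds = ["A", "C", "A", "A"]   # 2 of 4 correct

delta = accuracy_delta(female_preds, neutral_preds, answers)
print(f"Delta (Female, Neutral): {delta:+.2f}%")  # prints "Delta (Female, Neutral): -25.00%"
```

Under this reading, a negative value means the model answers less accurately when the demographic attribute is present than when it is absent, and a value near zero means the attribute makes little difference to accuracy.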