notesum.ai

Published on December 4

Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models

Categories: cs.CL, cs.AI, cs.LG

Released Date: December 4, 2024

Authors: Natalie Mackraz¹, Nivedha Sivakumar¹, Samira Khorshidi¹, Krishna Patel¹, Barry-John Theobald¹, Luca Zappella¹, Nicholas Apostoloff¹

Aff.: ¹Apple

arXiv: http://arxiv.org/pdf/2412.03537v1


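RPA = Referent Prediction Accuracy (%, ↑ higher is better). SB = Selection Bias (%, ↓ lower is better).

| Model | Adaptation | RPA Pro-stereo | RPA Anti-stereo | RPA Male | RPA Female | RPA Mean | SB Ambiguous (Type 1) | SB Unambiguous (Type 2) | SB Mean |
|---|---|---|---|---|---|---|---|---|---|
| Llama 3 8B | Intrinsic | 94.44 | 66.79 | 88.16 | 73.04 | 80.62 | 46.01 | 27.73 | 36.87 |
| Llama 3 8B | Zero-shot | 98.38 | 91.49 | 96.25 | 93.62 | 94.93 | 48.69 | 7.30 | 27.79 |
| Llama 3 8B | Few-shot | 99.62 | 94.14 | 97.88 | 95.87 | 96.88 | 45.93 | 5.55 | 25.72 |
| Llama 3 70B | Intrinsic | 99.24 | 93.81 | 97.61 | 95.44 | 96.53 | 38.37 | 5.55 | 21.96 |
| Llama 3 70B | Zero-shot | 98.99 | 96.97 | 98.09 | 97.87 | 97.98 | 17.09 | 2.67 | 9.88 |
| Llama 3 70B | Few-shot | 99.39 | 96.77 | 98.72 | 97.44 | 98.08 | 19.58 | 2.77 | 11.18 |
| Falcon 40B | Intrinsic | 96.97 | 77.78 | 90.55 | 84.18 | 87.38 | 39.73 | 19.20 | 29.46 |
| Falcon 40B | Zero-shot | 98.26 | 87.30 | 95.72 | 89.92 | 92.82 | 45.41 | 11.04 | 28.23 |
| Falcon 40B | Few-shot | 90.05 | 74.90 | 85.14 | 79.80 | 82.47 | 38.76 | 15.38 | 27.07 |
| Mistral 3 7B | Intrinsic | 95.96 | 73.61 | 91.44 | 78.10 | 84.79 | 45.72 | 22.40 | 34.06 |
| Mistral 3 7B | Zero-shot | 98.38 | 91.49 | 96.25 | 93.62 | 94.93 | 48.69 | 7.30 | 27.79 |
| Mistral 3 7B | Few-shot | 98.86 | 86.29 | 95.14 | 90.35 | 92.58 | 45.53 | 12.77 | 29.15 |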
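For readers who want to sanity-check the table, the sketch below recomputes the RPA Mean for three rows as the simple average of the pro-stereo and anti-stereo accuracies. This is an assumption about how the Mean column relates to the other columns, not something the page states; the reported means may instead be computed over raw examples, so small differences are expected. The names `ROWS`, `mean_gap`, `check_rows`, and the 0.05 tolerance are hypothetical choices made for illustration.

```python
# Values copied from the table above:
# (model, adaptation, RPA pro-stereo, RPA anti-stereo, reported RPA mean)
ROWS = [
    ("Llama 3 8B", "Intrinsic", 94.44, 66.79, 80.62),
    ("Llama 3 70B", "Zero-shot", 98.99, 96.97, 97.98),
    ("Falcon 40B", "Zero-shot", 98.26, 87.30, 92.82),
]


def mean_gap(pro, anti, reported):
    """Return the recomputed average and its absolute gap to the reported mean."""
    recomputed = (pro + anti) / 2
    return recomputed, abs(recomputed - reported)


def check_rows(rows, tolerance=0.05):
    """Map (model, adaptation) to (recomputed mean, within-tolerance flag).

    The tolerance is an illustrative assumption that absorbs rounding
    in the published figures.
    """
    results = {}
    for model, adaptation, pro, anti, reported in rows:
        recomputed, gap = mean_gap(pro, anti, reported)
        results[(model, adaptation)] = (round(recomputed, 3), gap <= tolerance)
    return results


if __name__ == "__main__":
    for key, (recomputed, ok) in check_rows(ROWS).items():
        print(key, recomputed, "matches" if ok else "differs")
```

The same check can be pointed at the SB columns by passing the ambiguous and unambiguous values instead, although a looser tolerance may be needed there.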