Target-driven Attack for Large Language Models
cs.CL
cs.AI
Released Date: November 9, 2024
Authors: Chong Zhang¹, Mingyu Jin², Dong Shu³, Taowen Wang⁴, Dongfang Liu¹, Xiaobo Jin¹
Aff.: ¹Xi'an Jiaotong-Liverpool University; ²Rutgers University; ³Northwestern University; ⁴Rochester Institute of Technology

| Models | SQuAD2.0 Clean Acc. | SQuAD2.0 Attack Acc. | SQuAD2.0 ASR | Math Clean Acc. | Math Attack Acc. | Math ASR | GSM8K Clean Acc. | GSM8K Attack Acc. | GSM8K ASR | Avg. Time |
|---|---|---|---|---|---|---|---|---|---|---|
| BertAttack | 71.16 | 24.67 | 65.33 | 72.30 | 44.82 | 38.01 | 77.82 | 34.26 | 55.98 | 1.04s |
| DeepWordBug | 71.16 | 65.68 | 7.70 | 72.30 | 48.36 | 33.11 | 77.82 | 25.67 | 67.01 | 1.18s |
| TextFooler | 71.16 | 15.60 | 78.08 | 72.30 | 46.80 | 35.27 | 77.82 | 24.33 | 68.74 | 2.80s |
| TextBugger | 71.16 | 60.14 | 16.08 | 72.30 | 47.75 | 33.96 | 77.82 | 52.61 | 32.40 | 1.57s |
| Stress Test | 71.16 | 70.66 | 0.70 | 72.30 | 39.59 | 45.24 | 77.82 | 35.19 | 54.78 | 2.84s |
| Checklist | 71.16 | 68.81 | 3.30 | 72.30 | 36.90 | 48.96 | 77.82 | 44.33 | 43.04 | 1.32s |
| Ours (Token Manipulation) | 71.16 | 14.91 | 79.05 | 72.30 | 13.39 | 81.48 | 77.82 | 22.17 | 71.51 | 1.75s |
| Ours (Misleading Adversarial Attack) | 71.16 | 12.08 | 83.02 | 72.30 | 33.39 | 53.82 | 77.82 | 32.04 | 58.83 | 1.73s |
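
The table does not state how ASR (attack success rate) is computed. Within each dataset's group of three values, the third appears to equal the relative drop from the first value (constant per dataset, so presumably accuracy without attack) to the second (presumably accuracy under attack). The sketch below assumes that relation, ASR = (clean − attacked) / clean × 100. The function name `attack_success_rate` and the reading of the columns are assumptions inferred from the numbers, not definitions taken from the source.

```python
def attack_success_rate(clean_acc: float, attack_acc: float) -> float:
    """Relative accuracy drop in percent (assumed ASR definition)."""
    if clean_acc <= 0:
        raise ValueError("clean_acc must be positive")
    return 100.0 * (clean_acc - attack_acc) / clean_acc


# SQuAD2.0 values copied from the table: (first value, second value).
squad_rows = {
    "BertAttack": (71.16, 24.67),
    "Ours (Token Manipulation)": (71.16, 14.91),
    "Ours (Misleading Adversarial Attack)": (71.16, 12.08),
}

for name, (clean, attacked) in squad_rows.items():
    print(f"{name}: {attack_success_rate(clean, attacked):.2f}")
# BertAttack: 65.33
# Ours (Token Manipulation): 79.05
# Ours (Misleading Adversarial Attack): 83.02
```

Under this assumption the computed values match the ASR entries listed for these three rows, which supports reading the three values per dataset as clean accuracy, attacked accuracy, and ASR.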