LIAR: Leveraging Alignment (Best-of-N) to Jailbreak LLMs in Seconds
cs.CL
Released Date: December 6, 2024
Authors: James Beetham¹, Souradip Chakraborty², Mengdi Wang³, Furong Huang², Amrit Singh Bedi¹, Mubarak Shah¹
Affiliations: ¹University of Central Florida; ²University of Maryland, College Park; ³Princeton University

| Target LLM | Attack | ASR@ | Perplexity | TTA1 | TTA100 |
|---|---|---|---|---|---|
| Vicuna-7b | GCG (individual) | | | 16m | 25h |
| Vicuna-7b | AutoDAN (individual) | | | 15m | 23h |
| Vicuna-7b | AdvPrompter | | | 22h | 22h |
| Vicuna-7b | LIAR (ours) | | | 45s | 14m |
| Vicuna-13b | GCG (individual) | | | 16m | 25h |
| Vicuna-13b | AutoDAN (individual) | | | 15m | 23h |
| Vicuna-13b | AdvPrompter | | | 22h | 22h |
| Vicuna-13b | LIAR (ours) | | | 45s | 14m |
| Llama2-7b | GCG (individual) | | | 16m | 25h |
| Llama2-7b | AutoDAN (individual) | | | 15m | 23h |
| Llama2-7b | AdvPrompter | | | 22h | 22h |
| Llama2-7b | LIAR (ours) | | | 45s | 14m |
| Mistral-7b | GCG (individual) | | | 16m | 25h |
| Mistral-7b | AutoDAN (individual) | | | 15m | 23h |
| Mistral-7b | AdvPrompter | | | 22h | 22h |
| Mistral-7b | LIAR (ours) | | | 45s | 14m |
| Falcon-7b | GCG (individual) | | | 16m | 25h |
| Falcon-7b | AutoDAN (individual) | | | 15m | 23h |
| Falcon-7b | AdvPrompter | 10 | | 22h | 22h |
| Falcon-7b | LIAR (ours) | | | 45s | 14m |
| Pythia-7b | GCG (individual) | | | 16m | 25h |
| Pythia-7b | AutoDAN (individual) | | | 15m | 23h |
| Pythia-7b | AdvPrompter | | | 22h | 22h |
| Pythia-7b | LIAR (ours) | | | 45s | 14m |
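
The TTA1 and TTA100 figures in the table mix seconds, minutes, and hours, which makes them hard to compare at a glance. The following is a minimal sketch, not taken from the paper, that converts the reported values to seconds and computes how many times longer each baseline takes than LIAR. The helper names `to_seconds` and `speedup` are hypothetical, and the durations are copied from the table as shown.

```python
import re

# Seconds per unit suffix used in the table (s, m, h).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(duration: str) -> int:
    """Convert a string such as '45s', '16m', or '25h' to seconds."""
    match = re.fullmatch(r"(\d+)([smh])", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS[unit]


# (TTA1, TTA100) per attack, copied from the table above.
# The same values are reported for every target LLM.
TTA = {
    "GCG (individual)": ("16m", "25h"),
    "AutoDAN (individual)": ("15m", "23h"),
    "AdvPrompter": ("22h", "22h"),
    "LIAR (ours)": ("45s", "14m"),
}


def speedup(baseline: str, ours: str = "LIAR (ours)") -> tuple[float, float]:
    """Return (TTA1 ratio, TTA100 ratio) of baseline time over `ours` time."""
    base_1, base_100 = (to_seconds(d) for d in TTA[baseline])
    ours_1, ours_100 = (to_seconds(d) for d in TTA[ours])
    return base_1 / ours_1, base_100 / ours_100


if __name__ == "__main__":
    for attack in TTA:
        if attack == "LIAR (ours)":
            continue
        r1, r100 = speedup(attack)
        print(f"{attack}: TTA1 x{r1:.1f}, TTA100 x{r100:.1f}")
```

These ratios compare only the reported time-to-attack figures; they say nothing about attack success rate or perplexity, whose values are not available in the table above.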