Published on November 22
Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Language Models
Categories: cs.SD, cs.AI, eess.AS
Released Date: November 22, 2024
Authors: Wanqi Yang¹, Yanda Li¹, Meng Fang², Yunchao Wei³, Tianyi Zhou⁴, Ling Chen¹
Affiliations: ¹University of Technology Sydney; ²University of Liverpool; ³Beijing Jiaotong University; ⁴Agency for Science, Technology and Research (A*STAR), Singapore

| Audio Attack | Attack Type | MELD | TVQA | Common Voice |
|---|---|---|---|---|
| No Attack | - | 120 | 120 | 120 |
| Content Attack | - | 120 | 120 | 120 |
| Emotion Attack | Opp-Emo Tone | 120 | - | - |
| Emotion Attack | Opp-Emo Music | 120 | - | - |
| Explicit Noise | Natural Noise | 40 | 40 | 40 |
| Explicit Noise | Industrial Noise | 40 | 40 | 40 |
| Explicit Noise | Human Noise | 40 | 40 | 40 |
| Implicit Noise | Infrasound | 60 | 60 | 60 |
| Implicit Noise | Ultrasound | 60 | 60 | 60 |
| Total (all datasets) | | 1,680 | | |
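
The stated total can be cross-checked against the per-cell counts in the table. The snippet below is only an illustrative tally, not code from the paper: the `COUNTS` dictionary and the helper names `subtotals_by_category` and `grand_total` are hypothetical, and a "-" cell is assumed to mean that no samples exist for that dataset.

```python
# Sample counts transcribed from the table, in the order (MELD, TVQA, Common Voice).
# None stands for a "-" cell, assumed here to mean "no samples".
COUNTS = {
    ("No Attack", "-"): (120, 120, 120),
    ("Content Attack", "-"): (120, 120, 120),
    ("Emotion Attack", "Opp-Emo Tone"): (120, None, None),
    ("Emotion Attack", "Opp-Emo Music"): (120, None, None),
    ("Explicit Noise", "Natural Noise"): (40, 40, 40),
    ("Explicit Noise", "Industrial Noise"): (40, 40, 40),
    ("Explicit Noise", "Human Noise"): (40, 40, 40),
    ("Implicit Noise", "Infrasound"): (60, 60, 60),
    ("Implicit Noise", "Ultrasound"): (60, 60, 60),
}


def subtotals_by_category(counts):
    """Sum the samples of every attack category across all three datasets."""
    totals = {}
    for (category, _attack_type), row in counts.items():
        row_sum = sum(n for n in row if n is not None)
        totals[category] = totals.get(category, 0) + row_sum
    return totals


def grand_total(counts):
    """Sum every non-empty cell in the table."""
    return sum(subtotals_by_category(counts).values())


if __name__ == "__main__":
    for category, total in subtotals_by_category(COUNTS).items():
        print(f"{category}: {total}")
    print(f"Total: {grand_total(COUNTS)}")  # 1680, matching the table
```

Under that reading, every category contributes 360 samples except Emotion Attack, which contributes 240 because it is built on MELD only, and the sum matches the reported 1,680.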