notesum.ai

Published on November 5

Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment

cs.LG
cs.AI

Release Date: November 5, 2024

Authors: Jason Vega¹, Junsheng Huang², Gaokai Zhang², Hangoo Kang¹, Minjia Zhang¹, Gagandeep Singh³

Aff.: ¹University of Illinois Urbana-Champaign; ²University of Illinois Urbana-Champaign, Zhejiang University; ³University of Illinois Urbana-Champaign, VMware Research

arXiv: http://arxiv.org/abs/2411.02785v1