Published on November 3

UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models

cs.CL
cs.AI
cs.LG

Release Date: November 3, 2024

Authors: Sejoon Oh¹, Yiqiao Jin², Megha Sharma², Donghyun Kim², Eric Ma², Gaurav Verma², Srijan Kumar²

Affiliations: ¹Netflix; ²Georgia Institute of Technology

arXiv: http://arxiv.org/abs/2411.01703v1