notesum.ai

Published on December 4

Does Safety Training of LLMs Generalize to Semantically Related Natural Prompts?

cs.CL
cs.AI

Released Date: December 4, 2024

Authors: Sravanti Addepalli¹, Yerram Varun¹, Arun Suggala¹, Karthikeyan Shanmugam¹, Prateek Jain¹

Aff.: ¹Google DeepMind

arXiv: http://arxiv.org/pdf/2412.03235v1