notesum.ai

Published on November 13

Can sparse autoencoders be used to decompose and interpret steering vectors?

cs.LG
cs.AI
cs.CL

Release Date: November 13, 2024

Authors: Harry Mayne¹, Yushi Yang¹, Adam Mahdi¹

Affiliation: ¹University of Oxford

arXiv: http://arxiv.org/abs/2411.08790v1