Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
ICLR
Released Date: September 28, 2024
Authors: Anonymous
OpenReview: https://openreview.net/pdf/ce6be1b3db9ca16e91f2e3d552515c17002caf6b.pdf