notesum.ai

Published on September 28

Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control

ICLR

Release Date: September 28, 2024