notesum.ai

Published on September 25

Monitoring Latent World States in Language Models with Propositional Probes

ICLR

Release Date: September 25, 2024