Causal World Representation in the GPT Model
cs.AI
cs.CL
cs.LG
stat.ML
Release Date: December 10, 2024
Authors: Raanan Y. Rohekar¹, Yaniv Gurwicz¹, Sungduk Yu¹, Vasudev Lal¹
Affiliation: ¹Intel Labs

| Symbol | Description |
|---|---|
|  | output embedding of an input symbol in an attention layer |
|  | value vector corresponding to an input in an attention layer |
|  | attention matrix |
|  | Transformer neural network |
|  | learnable weight matrices in GPT |
|  | random variable representing a node in an SCM |
|  | latent exogenous random variable in an SCM |
|  | weighted adjacency matrix of an SCM |
|  | causal graph (unweighted, directed-graph structure) |