Beyond position: how rotary embeddings shape representations and memory in autoregressive transformers
cs.LG
cs.AI
Released Date: October 23, 2024
Authors: Valeria Ruscio, Fabrizio Silvestri
Affiliation: Sapienza University of Rome

| Layer | KS Statistic | KS p-value | t-Statistic | t-test p-value |
|---|---|---|---|---|
| Layer 0 | 0.0378 | | -44.8944 | |
| Layer 1 | 0.2601 | | 32.0735 | |
| Layer 2 | 0.0603 | | -8.5045 | |
| Layer 3 | 0.0793 | | 49.8652 | |
| Layer 4 | 0.1608 | | -67.6719 | |
| Layer 5 | 0.0978 | | -19.9680 | |
| Layer 6 | 0.1015 | | 2.2374 | |
| Layer 7 | 0.0991 | | 4.1280 | |
| Layer 8 | 0.0990 | | -13.0585 | |
| Layer 9 | 0.0910 | | 4.7302 | |
| Layer 10 | 0.0952 | | 67.2831 | |
| Layer 11 | 0.0843 | | 21.1296 | |
| Layer 12 | 0.1050 | | 19.1474 | |
| Layer 13 | 0.1129 | | 81.6957 | |
| Layer 14 | 0.1188 | | 59.6911 | |
| Layer 15 | 0.1069 | | 49.1412 | |
| Layer 16 | 0.0633 | | 28.7902 | |
| Layer 17 | 0.1228 | | -62.8497 | |
| Layer 18 | 0.1410 | | -41.4314 | |
| Layer 19 | 0.0787 | | 31.0820 | |
| Layer 20 | 0.1169 | | -13.4339 | |
| Layer 21 | 0.1255 | | 113.2958 | |
| Layer 22 | 0.1472 | | -15.7370 | |
| Layer 23 | 0.1346 | | -67.1515 | |
| Layer 24 | 0.1671 | | -5.2423 | |
| Layer 25 | 0.1376 | | -2.9775 | |
| Layer 26 | 0.1837 | | 90.9769 | |
| Layer 27 | 0.1319 | | 67.0804 | |
| Layer 28 | 0.1768 | | 57.1779 | |
| Layer 29 | 0.0594 | | 117.4058 | |
| Layer 30 | 0.0464 | | -25.2003 | |
| Layer 31 | 0.1882 | | -53.1113 | |
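
For readers unfamiliar with the two statistics in the table, the sketch below shows one way a per-layer KS statistic and t-statistic could be computed from two samples of values. It is illustrative only: this page does not say which two samples are compared at each layer or which t-test variant was used, so the two-sample setup, the choice of Welch's unequal-variance form, and the function names `ks_statistic` and `welch_t_statistic` are all assumptions rather than the authors' procedure.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    # The gap between step functions can only change at observed values,
    # so evaluating both empirical CDFs at every sample point is sufficient.
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def welch_t_statistic(a, b):
    """Welch's t-statistic (assumed variant); sign follows mean(a) - mean(b)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    # Unbiased sample variances (divide by n - 1).
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


# Hypothetical per-layer samples, purely for illustration.
sample_a = [1, 2, 3, 4]
sample_b = [3, 4, 5, 6]
print(ks_statistic(sample_a, sample_b))
print(welch_t_statistic(sample_a, sample_b))
```

Under this reading, the KS statistic lies between 0 and 1 and measures how far apart the two distributions are in shape, while the sign of the t-statistic indicates which sample has the larger mean. That would be consistent with the table mixing positive and negative t values across layers.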