Empirical evidence for predictive coding tendencies in the GPT-2 family: residual stream convergence, activation patching, MLP transform analysis, zero-ablation, and logit lens across 7 languages.
deep-learning transformers predictive-coding gpt2 mechanistic-interpretability residual-stream logit-lens zero-ablation
-
Updated
May 28, 2026 - Python