logit-lens

Here are 9 public repositories matching this topic...

skyline-GTRr32 / OKI-TRACE

OKI TRACE: Local LLM observability. See step-by-step, layer-by-layer what your AI thinks. Logit Lens & Attention for HuggingFace models.

python open-source ai transformers developer-tools attention-mechanism blackbox huggingface ai-tools mechanistic-interpretability local-llm ai-interpretability llm-observability ai-transparency glass-box-ai llm-debugging logit-lens

Updated May 17, 2026
Python

jakomycat / logit-lens-vs-tuned-lens

Decoding the black box of LLMs: A comparative analysis of Logit Lens vs. Tuned Lens to interpret intermediate Transformer layers in GPT-2.

ia mechanistic-interpretability logit-lens tuned-lens

Updated Apr 2, 2026
Jupyter Notebook

designer-coderajay / glassbox-mech

Open-source EU AI Act Annex IV compliance toolkit. Mechanistic interpretability + circuit discovery for transformers. One function call generates a court-ready evidence package.

Updated Jun 1, 2026
Python

fabthebest / champollion-protocol

🏛️ Champollion cracked hieroglyphs in 1822. I applied the same logic to LLM internals. 95% accuracy, $0 cost, fully reproducible. Contributors welcome.

transformer ai-safety gpt2 mechanistic-interpretability activation-patching linear-probes logit-lens

Updated May 20, 2026
Jupyter Notebook

OpenInterpretability / decision-locator

Find the layer where a language model commits a decision — and steer it. Any open-weight HF model. (WANDERING arc paper #6)

transformers ai-safety interpretability steering mechanistic-interpretability llm-agents activation-patching logit-lens

Updated Jun 7, 2026
Python

tomaszwi66 / TinyInterp

Local Streamlit app for mechanistic interpretability of transformer models.

transformers pytorch neural-networks interpretability sparse-autoencoder streamlit llm mechanistic-interpretability activation-patching logit-lens

Updated May 6, 2026
Python

adeelahmad / mlx-lm-lens

Mechanistic interpretability CLI for transformer models on Apple Silicon. Analyze per-layer predictions, monitor activation drift, compare models, discover circuits. MLX-based, no GPU needed.

python nlp machine-learning transformers lora quantization mlx model-analysis interpretability fine-tuning apple-silicon activation-analysis mechanistic-interpretability logit-lens circuit-discovery

Updated Mar 30, 2026
Python

gallam-research-dev / pc-transformer-interpretability

Empirical evidence for predictive coding tendencies in the GPT-2 family: residual stream convergence, activation patching, MLP transform analysis, zero-ablation, and logit lens across 7 languages.

deep-learning transformers predictive-coding gpt2 mechanistic-interpretability residual-stream logit-lens zero-ablation

Updated May 28, 2026
Python

Seqev / latent-scratchpad-search

We optimize a compact latent state (frozen weights) to force failed multi-hop chains to output the missing answer D. 5 pre-registered controls show it simply injects D: it carries D without the code-fact, leaves intermediates invisible, is inert to hop corruption, and doesn’t transfer. No latent composition at 3B (Llama-3.2-3B, Qwen2.5-3B).

transformers llama multi-hop-reasoning prompt-tuning knowledge-injection llm falsification mechanistic-interpretability qwen latent-reasoning soft-prompts logit-lens matched-controls

Updated Jun 4, 2026
Python
