Sensitivity ablation — same residual capture, two labelling rules. Reviewer-defence.
Crosscoders — cross-model + cross-stage
The methodology behind paper-1's Pearson causal-equivalence (Pearson_CE) finding.
First per-feature causal-equivalence test in the crosscoder literature.
Cross-STAGE crosscoder. LoRA toggle pattern (single base + PEFT.disable_adapter).
Qwen3.5-4B base vs mechreward-G3
Guards — product reproducers
Each notebook reproduces an exact metric behind a shipped openinterp Guard
(SDK on PyPI, demo on HF, landing on openinterp.org/products/X).
Drop-in: run pip install openinterp and you have these probes.
Use dtype=torch.bfloat16 (not the deprecated torch_dtype=) and attn_implementation='sdpa' (not flash-attn, for reproducibility and to avoid install pain across Colab/Kaggle).
HF_TOKEN goes through Colab/Kaggle secrets, never hard-coded.
Stream checkpoints to HF every 5–10M tokens; Drive-only checkpoints die with the kernel.
Use the multimodal layer-access fallback (getattr(model.model, 'layers', None) or model.model.language_model.layers), not a hard-coded .layers[N].
Report honest var_expl, L0, and dead-feature percentage, not cherry-picked seeds.
CI checks all of these.
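The layer-access fallback named in these rules can be wrapped in a small helper. The following is a minimal sketch, not code from this repository: the function name get_decoder_layers is hypothetical, and it assumes only the two attribute paths mentioned above (model.model.layers for text-only checkpoints, model.model.language_model.layers for multimodal ones). Stand-in objects replace a real checkpoint so the sketch runs without a download.

```python
from types import SimpleNamespace


def get_decoder_layers(model):
    """Return the decoder layer list for text-only or multimodal wrappers.

    Hypothetical helper: tries model.model.layers first, then
    model.model.language_model.layers, the two paths named in the rule.
    """
    inner = model.model
    layers = getattr(inner, "layers", None)
    if layers is None:
        # Multimodal wrappers are assumed to nest the text stack one level deeper.
        language_model = getattr(inner, "language_model", None)
        layers = getattr(language_model, "layers", None)
    if layers is None:
        raise AttributeError(
            "no decoder layers found on model.model or model.model.language_model"
        )
    return layers


# Stand-in objects (not real models) that mimic the two layouts.
text_only = SimpleNamespace(model=SimpleNamespace(layers=["block0", "block1"]))
multimodal = SimpleNamespace(
    model=SimpleNamespace(
        language_model=SimpleNamespace(layers=["block0", "block1", "block2"])
    )
)

print(len(get_decoder_layers(text_only)))
print(len(get_decoder_layers(multimodal)))
```

With a real checkpoint, the same call would replace any hard-coded .layers[N] lookup, for example get_decoder_layers(model)[LAYER]. Raising AttributeError when neither path exists makes an unsupported architecture fail at the lookup rather than later inside a hook.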
Port a notebook to a new model — pick an existing notebook at your tier and swap MODEL_ID, LAYER, D_MODEL. Name it NN_<tier>_<model>_<platform>.ipynb.
Replicate a 2024–2026 paper — title cell with arxiv link, pinned install, paper hyperparameters, inline implementation, validation cell that matches the paper's headline metric within tolerance.
Add a platform (TPU/ROCm/MPS) — write a _platform_<name>.py helper with pick_device() / get_dtype(), patch one notebook as PoC, open a draft PR and tag @caiovicentino for design review.
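The naming convention NN_<tier>_<model>_<platform>.ipynb from the porting item can be checked before a PR is opened. This is a rough sketch under stated assumptions: NN is taken to be a two-digit index, each of tier and platform is a single token without underscores, and the model part may itself contain underscores. The repository's CI may enforce something stricter, and the example filenames below are invented for illustration.

```python
import re

# Assumption: two-digit index, then tier, model, platform separated by
# underscores. Tier and platform are single tokens; model may contain "_".
NOTEBOOK_NAME = re.compile(r"\d{2}_[^_\s]+_\S+_[^_\s]+\.ipynb")


def is_valid_notebook_name(filename: str) -> bool:
    """Return True if filename follows NN_<tier>_<model>_<platform>.ipynb."""
    return NOTEBOOK_NAME.fullmatch(filename) is not None


# Hypothetical filenames, not taken from the repository.
for name in [
    "07_free_qwen3_colab.ipynb",
    "07_free_qwen3_5_4b_kaggle.ipynb",
    "07_free_colab.ipynb",
    "qwen3_colab.ipynb",
]:
    print(name, is_valid_notebook_name(name))
```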
Before opening a PR, validate JSON: python3 -c "import json; json.load(open('notebooks/YOUR.ipynb'))". CI runs nbformat.validate. If you have a GPU, dry-run with jupyter nbconvert --to notebook --execute --ExecutePreprocessor.timeout=300 — expect heavy training cells to time out; you're just catching import + dtype bugs.
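The one-line JSON check above covers a single file. A sketch of the same check applied to every notebook under a directory is shown here; the function name find_invalid_notebooks is hypothetical and uses only the standard library. It tests that each file parses as JSON, which is weaker than the nbformat.validate step that CI runs.

```python
import json
from pathlib import Path


def find_invalid_notebooks(root):
    """Return (path, error message) pairs for .ipynb files that fail json.load."""
    failures = []
    for path in sorted(Path(root).rglob("*.ipynb")):
        try:
            with open(path, encoding="utf-8") as handle:
                json.load(handle)
        except (json.JSONDecodeError, UnicodeDecodeError) as exc:
            failures.append((path, str(exc)))
    return failures
```

Calling find_invalid_notebooks("notebooks") from the repository root returns an empty list when every notebook parses. A non-empty result lists each offending file with the parser's error message.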
Output schemas other tools consume
If your notebook emits JSON that the website consumes, match the schema: