Three matched-control gates localize long-context retrieval failure in sparse attention to a single locus: delivery. Attention redirection acts as transport, while stored-state correction is not reusable. Delivery works reliably, is content-addressable (no positional decay), and achieves a 27-byte attribute→code binding. Mechanism: RAG micro-hints.
Umbrella review: four pre-registered matched-control gates (attention, delivery, synthesis, latent state) show that 1B–3B LLMs use long context only via an external buffer, with no latent computation or composition. A map of the boundary.
We optimize a compact latent state (frozen weights) to force failed multi-hop chains to output the missing answer D. Five pre-registered controls show that the optimized state simply injects D: it carries D without the code-fact, leaves intermediates invisible, is inert to hop corruption, and doesn’t transfer. No latent composition at 3B (Llama-3.2-3B, Qwen2.5-3B).
Synthesis should break the “everything is transport” hypothesis, because D must be assembled from absent facts. Cross-model tests (Llama-3.2-3B, Qwen2.5-3B) show that it doesn’t. Implicit multi-hop reduces to retrieval + elimination. True composition appears only as externalized CoT. At 3B scale there is no internal computation of absent facts.