tsim: sim env + viz/scripts + physics tuning by ElmoPA · Pull Request #512 · GaTech-RL2/EgoVerse

ElmoPA · 2026-06-25T03:23:28Z

tsim: sim env + viz/scripts + physics tuning

Squash of:

a31d8ab4 tshape sim environment
ac272e8a Add Tsimulation viz/scripted/stats tools + physics tuning

dataset: tsim training configs + HPT keymap/viz/eval + packed dataloading

Squash of:

f261ff68 tsim training configs/embodiment
b38264ab Add pushshapes_sim HPT training: keymap, viz, eval fixes
1e174131 Add episode-level packed dataloading

hnet: flexible stage refactor + packed training + viz/infra

Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021

HNet packed training: test suite (86 tests)

test_hnet_nets.py (57): routing, chunk, dechunk, isotropic,
stages (padded + packed), HNet assembly, ratio_loss, chunk_stats,
STE, RMSNorm, AdaLN.
test_packed_pipeline.py (9): normalize broadcast on padded vs
packed; _iter_leaves descent; multi-frame JPEG decode; end-to-end
packed stats collection.
test_training_recipe.py (20): apply_optimization_params,
init_weights height-scaled init, apply_lr_multiplier per-stage
stamping, parameter_groups (default, with bias/norm WD=0,
per-stage groups, AdamW-consumable). Plus algo wiring tests for
the opt-in init_weights_range / lr_multipliers /
use_parameter_groups / weight_decay kwargs.
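
The bias/norm WD=0 grouping that the recipe tests exercise can be sketched as follows. This is an illustrative stand-in, not the repository's parameter_groups: the helper name split_weight_decay, the (name, ndim) input format, and the name-based heuristic are all assumptions.

```python
from typing import Dict, List, Tuple


def split_weight_decay(
    named_shapes: List[Tuple[str, int]], weight_decay: float
) -> List[Dict[str, object]]:
    """Hypothetical helper: split parameters into decay / no-decay groups.

    `named_shapes` holds (parameter name, number of dims) pairs, a stand-in
    for what `module.named_parameters()` would yield in PyTorch.
    """
    decay, no_decay = [], []
    for name, ndim in named_shapes:
        # Assumed heuristic: biases and 1-D tensors (norm gains) get no decay.
        if name.endswith(".bias") or ndim <= 1:
            no_decay.append(name)
        else:
            decay.append(name)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


groups = split_weight_decay(
    [("proj.weight", 2), ("proj.bias", 1), ("norm.weight", 1)],
    weight_decay=0.1,
)
```

With real tensors in place of the names, the same list-of-dicts layout is the per-parameter-group form that torch.optim.AdamW accepts.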

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

wip: uncommitted edits from EgoVerse7 (active work-in-progress)

Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20.
Includes:

algo: input_modules.py + obs_transforms.py (new modules)
callbacks: chunker_residual_scheduler, ckpt_chunker(+dropout), random_attn_dropout
data configs: tsimulation_400ep, tsimulation_allep + tweaks to existing tsim configs
model configs: hnet_pushshapes_mamba_encdec, hnet_pushshapes_obs_ar + tweaks
eval/* edits, models/hnet_nets/* edits, schedulers, uv.lock
removes scripts/install_cuda_kernels.sh and egomimic/eval/eval_hnet_sim.py

Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).

dfot: v1 algo + Isotropic backbone w/ per-token AdaLN + continuous/discrete diffusion + DDPM/DDIM samplers

dfot: fix inference_step obs shaping (don't unsqueeze; drop dead ac_keys fallback)

dfot: packed-mode training + eval (cu_seqlens-aware backbone forward, per-frame obs cond)

dfot: causal-AR staircase sampler + schedule-matrix sampler + online rollout helper

dfot: PACE sbatch for 80ep packed training on PushShapes circle/basic

dfot: 20x training budget (12800 steps over 80 ep), --time 8h

dfot: full 750-ep training (200 ep, 32000 steps, tsimulation_full.yaml); drop 400ep config

dfot eval: val-data evaluator with full-chunk + staircase-AR overlay viz

dfot: relaunch sbatch with eval_dfot_val (teacher-forced viz) + seq_lens defensive fix

dfot refactor: unify sampling, fix AR _ar_pred, AR inference_step, PackedSimEval rename, composite eval

dfot eval: fix HNetSimEval->PackedSimEval comment reference

dfot sampling: split sample_step (primitive) from sample (loop wrapper); supports growing-T AR

dfot: inline AR rollout state into DFoT; delete CausalARRollout class

dfot algo: ar_inference_step_size knob, CFG plumbing, AdaLN dropouts

algo.py: add ar_inference_step_size (sub-steps per env tick at closed-loop AR); thread cfg_scale through _inference_step_ar/_inference_step_chunk; remove unused shape var
backbone.py: force_uncond branch for CFG two-pass blending; wire attn/resid dropout into Isotropic trunk
sampling.py: _CFGBackbone wrapper for cfg_scale > 1 sampling; schedule-matrix CFG plumbing
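
A minimal sketch of two-pass CFG blending, assuming the standard classifier-free guidance formula uncond + scale * (cond - uncond). The class name CFGWrapper, the toy backbone, and its (x, force_uncond) signature are hypothetical; the repository's _CFGBackbone may differ.

```python
from typing import Callable, List

Vector = List[float]


class CFGWrapper:
    """Hypothetical wrapper blending conditional and unconditional passes."""

    def __init__(self, backbone: Callable[[Vector, bool], Vector], cfg_scale: float):
        self.backbone = backbone
        self.cfg_scale = cfg_scale

    def __call__(self, x: Vector) -> Vector:
        cond = self.backbone(x, False)
        if self.cfg_scale == 1.0:
            return cond  # guidance is a no-op at scale 1, so skip the second pass
        uncond = self.backbone(x, True)  # second pass with conditioning dropped
        s = self.cfg_scale
        return [u + s * (c - u) for c, u in zip(cond, uncond)]


def toy_backbone(x: Vector, force_uncond: bool) -> Vector:
    # Stand-in model: conditioning adds a fixed offset of 1.0.
    offset = 0.0 if force_uncond else 1.0
    return [v + offset for v in x]


guided = CFGWrapper(toy_backbone, cfg_scale=2.0)([0.0, 1.0])
unguided = CFGWrapper(toy_backbone, cfg_scale=1.0)([0.0, 1.0])
```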

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

dfot configs+viz: model scaling, CFG, attn_dropout callback, 512x512 val viz

model/dfot_pushshapes.yaml: scale to 67M (d_model=512, T=12, num_heads=8, d_intermediate=2048); attn dropout=0.1, resid dropout=0.1, cond_dropout_prob=0.1; cfg_scale field; causal=true
data/tsimulation_full.yaml: 750-episode circle_750 dataset, batch_size=16
evaluator/eval_dfot_val.yaml + eval_dfot_full.yaml: cfg_scale + ar_chunk_size + ar_step_size knobs
callbacks/ckpt_attn_dropout.yaml: composed callback (checkpoints + random_attn_dropout with values [0.1, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98])
eval/eval_dfot_val.py: 96x96 -> 512x512 nearest-neighbor upscale, palette (gt=green, chunk=red, ar=yellow), world-coord pixel scaling, threaded cfg_scale

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

dfot scripts: 200/400ep sbatch variants + eval CLI tools

sbatch_train_dfot_200ep_full750_pace.sh: minor cleanup
sbatch_train_dfot_400ep_full750_pace.sh: 400ep H200 launch with scheduler.max_steps=18800 (fixes the 200ep cosine-not-decaying bug)
sbatch_train_dfot_400ep_attndrop.sh: 400ep + random_attn_dropout (the 5.6x sim_coverage win)
scripts/eval_cfg_latest.py: post-hoc DFoT eval CLI with --cfg-scale, --ar-chunk-size, --ar-step-size, --ar-inference-chunk-size, --ar-inference-step-size, --skip-val/--skip-sim
scripts/eval_fsd_latest.py: convenience wrapper for inference_mode=chunk
scripts/sbatch_fsd_eval.sh + sbatch_sim_sweep.sh: sbatch templates for closed-loop sim sweeps

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

outer_stage + loss: abstract base classes for OuterStage/Loss refactor

Introduces the abstract bases for the upcoming refactor:

egomimic/algo/outer_stage.py: OuterStage base. Owns an
inner_stage field (the trunk). Subclasses implement encode
(raw batch -> trunk-input tensor; can sample noise / record state
on ctx) and decode (trunk-output -> per-modality prediction keys
on batch).
egomimic/algo/loss.py: Loss base + CompositeLoss (weighted
sum of terms) + MSELoss (per-modality MSE between pred_key and
target_key). Loss policy becomes data — a hydra config block —
rather than inheritance.
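
The "loss policy as data" idea can be sketched as below. It is a simplified stand-in under stated assumptions: losses here take only a batch of float lists (the real classes also receive ctx and operate on tensors), and the constructor signatures are illustrative.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Batch = Dict[str, List[float]]


class Loss(ABC):
    @abstractmethod
    def __call__(self, batch: Batch) -> float:
        ...


class MSELoss(Loss):
    """Mean squared error between batch[pred_key] and batch[target_key]."""

    def __init__(self, pred_key: str, target_key: str):
        self.pred_key, self.target_key = pred_key, target_key

    def __call__(self, batch: Batch) -> float:
        pred, target = batch[self.pred_key], batch[self.target_key]
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class CompositeLoss(Loss):
    """Weighted sum of loss terms; the term list is plain configuration data."""

    def __init__(self, terms: List[Tuple[float, Loss]]):
        self.terms = terms

    def __call__(self, batch: Batch) -> float:
        return sum(weight * term(batch) for weight, term in self.terms)


batch = {
    "pred_action": [1.0, 2.0], "actions": [0.0, 2.0],
    "pred_state": [3.0], "state": [1.0],
}
loss = CompositeLoss([
    (1.0, MSELoss("pred_action", "actions")),
    (0.25, MSELoss("pred_state", "state")),
])
total = loss(batch)
```

Because the terms are a list of (weight, loss) pairs, changing the loss mix is a config edit rather than a new subclass.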

No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and
the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits
on this branch.

continuous_diffusion: split forward into q_sample + compute_loss

Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss
classes:

q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise,
alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond
the backbone consumes. No backbone call inside.
compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the
dict from q_sample plus the backbone's v_pred and computes the
SNR-weighted epsilon-MSE.
forward(backbone, x, t, cond): kept as a back-compat wrapper that
calls q_sample, runs the backbone, then compute_loss. Existing
callers (DFoT.forward_training) are unaffected.
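
The split can be sketched on plain float lists. Assumptions, none confirmed by the source: a variance-preserving schedule with alpha^2 = sigmoid(logsnr), the usual v-parameterization (eps = sigma * x_t + alpha * v), logsnr passed directly as the time condition (no precond_scale), and the SNR weighting left out because its exact form is not stated here.

```python
import math
import random
from typing import Callable, Dict, List


def q_sample(x: List[float], logsnr: float, rng: random.Random) -> Dict[str, object]:
    """Forward-noising step only; no model call."""
    alpha = math.sqrt(1.0 / (1.0 + math.exp(-logsnr)))  # assumed VP schedule
    sigma = math.sqrt(1.0 / (1.0 + math.exp(logsnr)))
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    x_t = [alpha * xi + sigma * ni for xi, ni in zip(x, noise)]
    return {"x_t": x_t, "noise": noise, "alpha_t": alpha,
            "sigma_t": sigma, "logsnr": logsnr}


def compute_loss(v_pred: List[float], q: Dict[str, object]) -> float:
    """Unweighted epsilon-MSE recovered from a v-prediction."""
    a, s = q["alpha_t"], q["sigma_t"]
    eps_pred = [s * xt + a * v for xt, v in zip(q["x_t"], v_pred)]
    return sum((e - n) ** 2 for e, n in zip(eps_pred, q["noise"])) / len(v_pred)


def forward(backbone: Callable, x: List[float], logsnr: float,
            rng: random.Random) -> float:
    """Back-compat wrapper: q_sample -> backbone -> compute_loss."""
    q = q_sample(x, logsnr, rng)
    return compute_loss(backbone(q["x_t"], q["logsnr"]), q)


def zero_backbone(x_t: List[float], logsnr: float) -> List[float]:
    return [0.0 for _ in x_t]  # toy model that always predicts v = 0


x = [0.5, -1.0]
q_state = q_sample(x, 0.0, random.Random(0))
split_loss = compute_loss(zero_backbone(q_state["x_t"], q_state["logsnr"]), q_state)
wrapped_loss = forward(zero_backbone, x, 0.0, random.Random(0))
```

With the same seed, the two-step path and the wrapper run identical arithmetic, which is the property the smoke test checks on the real modules.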

Bitwise verified equivalent to the prior single-method path via
the included /tmp/test_diffusion_split.py smoke (loss + x_pred match
exactly, same random seed).

discrete_diffusion.py is unsplit for now; current configs use
continuous.

dfot outer_stage + DFoTLoss: training-path classes (loss equivalence verified)

egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass.
encode: encode obs to per-token cond, sample noise levels, run
diffusion.q_sample, store q_state + external_cond on ctx, return
noisy x_t. decode: write batch[pred_v] for the loss to read.
forward override threads cu_seqlens/max_seqlen (packed mode) and
time_cond into the backbone call.
egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and
ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE),
reduces to scalar.

Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss
through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to
0.0e+00 difference at fixed seed. Real CondEncoderModule +
DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math
path.

Algo class (DFoT.forward_training) is NOT yet wired to use these —
that comes in the next commit on this branch. Inference paths
(closed-loop AR sample_step, chunk plan-execute) also deferred.

dfot algo + yaml: refactor to outer_stage + loss

Algo class:

__init__ now takes outer_stage: DFoTOuterStage and optional
loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None).
Removes legacy cond_encoder, backbone, diffusion_type,
diffusion_kwargs, cond_output_key args — they now live on the
outer_stage subblock.
Adds @property accessors for cond_encoder, backbone,
diffusion, outer_stage, loss so existing inference paths
(_inference_step_ar, _inference_step_chunk, _sample_chunk,
forward_eval) keep working unchanged via property forwarding.
forward_training shrinks ~40 LOC -> ~20 LOC: build ctx, call
outer_stage(batch, ctx), call loss(batch, ctx). No more inline
diffusion math; no more cu_seqlens threading at this level.
Adds ar_inference_step_size knob.
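
Property forwarding of this kind can be sketched with toy classes. ToyOuterStage and ToyAlgo are hypothetical names; only the mapping of backbone onto outer_stage.inner_stage follows the commit text.

```python
class ToyOuterStage:
    """Stand-in container for the submodules that moved off the algo class."""

    def __init__(self, cond_encoder, inner_stage, diffusion):
        self.cond_encoder = cond_encoder
        self.inner_stage = inner_stage  # the trunk, formerly `backbone`
        self.diffusion = diffusion


class ToyAlgo:
    def __init__(self, outer_stage: ToyOuterStage):
        self._outer_stage = outer_stage

    @property
    def outer_stage(self) -> ToyOuterStage:
        return self._outer_stage

    # Read-only views so legacy call sites (self.backbone, ...) keep working.
    @property
    def cond_encoder(self):
        return self._outer_stage.cond_encoder

    @property
    def backbone(self):
        return self._outer_stage.inner_stage

    @property
    def diffusion(self):
        return self._outer_stage.diffusion


trunk, encoder, diffusion = object(), object(), object()
algo = ToyAlgo(ToyOuterStage(encoder, trunk, diffusion))
```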

dfot_pushshapes.yaml:

New outer_stage: block wraps cond_encoder + backbone + diffusion.
Removes top-level diffusion_type / diffusion_kwargs; the
diffusion module is now its own target inside outer_stage.
loss: omitted (uses default DFoTLoss(outer_stage.diffusion)).

End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the
config instantiates via hydra and forward_training emits a finite
scalar loss. Bitwise loss-equivalence was already shown in the prior
commit (test_dfot_outer_stage.py).

Old checkpoints WILL NOT load — state_dict keys moved from
nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.*
This is intentional per the agreed clean-break refactor.
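
To make the key move concrete, here is a sketch of the prefix rename applied to a dict of keys. It is purely illustrative: the commit ships no migration path, remap_keys is a hypothetical helper, and keys outside the two stated prefixes (for example anything under the diffusion module) are left untouched.

```python
from typing import Dict

# Prefix moves as stated in the commit message.
KEY_MOVES = {
    "nets.cond_encoder.": "nets.outer_stage.cond_encoder.",
    "nets.backbone.": "nets.outer_stage.inner_stage.",
}


def remap_keys(state_dict: Dict[str, object]) -> Dict[str, object]:
    """Hypothetical helper: rename old-layout keys to the new layout."""
    out = {}
    for key, value in state_dict.items():
        for old, new in KEY_MOVES.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        out[key] = value
    return out


old_ckpt = {"nets.backbone.layer0.weight": 1, "nets.cond_encoder.proj.bias": 2}
new_ckpt = remap_keys(old_ckpt)
```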

dfot inference smoke: verify AR + chunk paths after outer_stage refactor

Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT
from dfot_pushshapes.yaml, runs inference_step in both ar and chunk
modes, asserts action is (action_dim,) and finite. Verifies the
@property accessors (self.backbone, self.cond_encoder, self.diffusion)
forward correctly to outer_stage submodules so the closed-loop AR
and chunk-mode inference paths keep working after the refactor.

Passing on compute node 8997316:
[ar] action @ t=0: [0.30 0.47]
[ar] action @ t=1: [0.44 1.06]
[chunk] action @ t=0: [-0.78 -1.89]

hnet_outer_stage + HNetLoss: H-Net OuterStage subclass (bitwise verified)

egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits
from OuterStage with inner_stage = HNetCore (stage tree). Owns
cond_encoder, input_modules (summed per-token contributions),
action_out head. Three forward paths inherited from the old
HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed),
generate (offline AR), init_step_state + step (online single-tick).
egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] +
batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux.
scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates
the existing hnet_pushshapes.yaml subcomponents, wraps the SAME
instances in HNetPolicy and HNetOuterStage, ties their action_out
heads, runs identical padded forward. Verified bitwise (max diff
0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests
the step inference path (shape + finite).

The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet
removed. Algo-class refactor and yaml updates come in follow-up
commits on this branch.

Old checkpoints will NOT load — state_dict keys move from policy.*
to outer_stage.* (or wherever the algo class places it). Per clean-
break policy.

hnet algo + yaml: refactor to outer_stage + loss (base config)

Algo class:

HNet.__init__ now takes outer_stage: HNetOuterStage + loss: Optional[Loss]
instead of cond_encoder + hnet + action_dim + action_horizon +
d_model + action_head_type + input_modules. action_horizon read from
outer_stage.
Loss defaults to HNetLoss() if not provided.
self.nets is now ModuleDict({outer_stage, loss}); old keys
(self.nets[policy], self.nets[cond_encoder], ...) are exposed via
@property forwarding to outer_stage submodules so legacy callsites
in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step
inference keep working.
forward_training builds (batch, ctx), calls outer_stage(batch, ctx) +
loss(batch, ctx), unpacks per-term breakdown (ctx.action_loss,
ctx.ratio_loss) into the predictions dict for logging.

HNetLoss:

Computes action MSE + ratio_loss_from_aux(ctx.aux) and stashes the
per-term split on ctx for the algo to log separately.

HNetOuterStage:

Adds back-compat bridge methods forward_padded(actions, obs) and
forward_packed(actions, obs, cu, msl) returning (pred, aux). The
forward_eval / _teacher_forced_packed paths use these; .generate /
.step / .init_step_state already had matching signatures.

hnet_pushshapes.yaml:

New outer_stage: block wraps cond_encoder + hnet stage tree +
input_modules + action_head_type. Top-level keeps training-recipe
knobs (init_weights_range, lr_multipliers, ...) and embodiment
wiring.
loss: block omitted (defaults to HNetLoss()).

scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke.
Verified passing on H200 alloc 8989249 — produces action_loss 1.51 +
ratio_loss 0.032 + chunker stats for a 2-episode packed batch
(T=12+20). Padded mode hits a pre-existing torch SDPA error
(Explicit attn_mask should not be set when is_causal=True) in
train mode — this is in the trunk code, not introduced by the
refactor (the production training uses packed mode and never hits
the padded-train path).
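
For the 2-episode packed batch mentioned above (T=12+20), the packed-mode bookkeeping can be sketched as below. This assumes the common varlen-attention convention in which cu_seqlens holds cumulative episode boundaries starting at 0; pack_offsets is a hypothetical helper name.

```python
from itertools import accumulate
from typing import List, Tuple


def pack_offsets(seq_lens: List[int]) -> Tuple[List[int], int]:
    """Return (cu_seqlens, max_seqlen) for episodes concatenated along time."""
    cu_seqlens = [0] + list(accumulate(seq_lens))
    return cu_seqlens, max(seq_lens)


cu_seqlens, max_seqlen = pack_offsets([12, 20])

# Episode i occupies tokens cu_seqlens[i]:cu_seqlens[i + 1] of the packed axis.
packed = list(range(cu_seqlens[-1]))
episodes = [packed[a:b] for a, b in zip(cu_seqlens[:-1], cu_seqlens[1:])]
```

No padding tokens exist in this layout, which is why the packed path never constructs the explicit attention mask that trips the padded train-mode error.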

Old HNetPolicy class is still in algo/hnet.py for now (no longer
used by HNet algo); will be removed in a cleanup commit once all
stage-based + flat yamls are migrated.

hnet yamls: migrate remaining 6 stage-based configs to outer_stage schema

Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied
to the variant configs. Each yaml moves cond_encoder + hnet stage tree +
(optional) input_modules + action_head_type under outer_stage; keeps
training-recipe knobs + embodiment wiring at the top level.

Configs migrated:

hnet_pushshapes_big.yaml (d_model 256, 21M params)
hnet_pushshapes_crossattn.yaml (cond_mode: cross_attn)
hnet_pushshapes_mamba_encdec.yaml (M8 encoder/decoder)
hnet_pushshapes_obs_ar.yaml (ObsToken input module)
hnet_pushshapes_obs_ar_large.yaml (ObsToken + d_model 256 + T8)
hnet_pushshapes_recipe.yaml (H-Net paper recipe)

scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified
on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants)
instantiate from hydra config and produce sensible param counts
(5.5M baseline up to 42.8M obs_ar_large).

flat_fused_outer_stage + HNetFused thin alias + 3 flat yamls migrated

egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class.
Structurally a rename of FlatFusedPolicy with OuterStage inheritance
plus an OuterStage forward(batch, ctx) dispatcher delegating to the
existing forward_padded / forward_packed. Legacy generate / step /
init_step_state preserved verbatim. encode / decode raise
NotImplementedError since the interleaved 2T-token flow does not
cleanly split along encode -> trunk -> decode.
egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass
of HNet, kept as a separate target for the existing flat yamls.
All flat-fused behavior moved into FlatFusedOuterStage; HNet.__init__
already tolerates outer_stage.inner_stage=None.
3 flat yamls migrated to outer_stage schema:
hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml,
hnet_pushshapes_fused_pusher.yaml.
scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net
configs. Verified on H200 alloc 8989249: 10/10 instantiate
successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param
counts sensible.

Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of
unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.

forward_training smoke for all 10 H-Net yamls + mamba regression check

eval: migrate algo.nets[] accesses to property accessors (refactor follow-up)

PACT: VAE + obs+action+image DFoT + bundle-aware evals + bcrnn

Joint state+vae_latent+action diffusion forcing.

DiT3D video diffusion: all fixes + VAE v6 config

DiT3D backbone with AdaLN-Zero, RoPE 3D (fixed rotate_half)
Latent normalization for VAE bias correction
No-padding per-episode backbone processing
Additive conditioning fusion (matches reference)
patch_size=2, noise-MSE loss, matched DiTBlock residual
VAE v6 config with KL beta=0.001

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

WIP checkpoint: obs-action DFoT 2D policy + spatial_rh sim controller + new evals (pre-hnet-variants merge)

snapshot_pre-restack_2026-05-31_decouple_spatial_tf_evals

pre-restructure snapshot (DESIGN.md step 0)

restructure: sweep root debris -> scratch/ (DESIGN.md step 1)

Move WIP-session / sibling-repo dead weight out of repo root into a
gitignored scratch/ archive (MOVES not deletes; tracked files via git mv
so history is preserved at old paths). Adds scratch/MANIFEST.md and
ignores /scratch/.

Swept (67 files):

37 .sh experiment runners (eval_*/train_*/smoke_*/sim_*/launch_*/etc.)
11 patch_*.py monkeypatch scripts
12 root debug_*/test_*.py ad-hoc scripts
7 png/mp4 render dumps (4 mp4 were untracked->plain mv)

Deviation from DESIGN literal counts (40 .sh / 9 png/mp4): 5 of those are
original-repo files, not sibling debris, so KEPT in root:
pull_models.sh, run_eva_docker.sh, setup_nvm.sh (infra, git-added 2025),
convention.png, mano_keypoints.png (embedded in CONTRIBUTING_DATA.md).
Design's stated intent ("sibling-repo dead weight") preserved exactly.
See scratch/MANIFEST.md.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: collapse H-Net to models/hnet + flip hnet_core import (DESIGN steps 3-4)

Step 3 (collapse H-Net):

git mv egomimic/models/hnet_nets -> egomimic/models/hnet (the pact SUPERSET
tree: cross-attn + residual_scale + causal_conv1d + adaln_per_token).
DELETE egomimic/models/bc_rnn_nets/_hnet_vendored/ entirely (inferior subset
dup; its config/context/routing were byte-identical to the superset, so git
attributes those 3 as renames into models/hnet).
Rewrite intra-tree imports inside models/hnet/ from
egomimic.models.hnet_nets.X -> egomimic.models.hnet.X.
Leave egomimic/models/hnet_nets/ as a thin facade shim: a new __init__.py
that aliases each models.hnet.<sub> into sys.modules under the legacy
hnet_nets.<sub> key (so both top-level and submodule-path imports -- incl.
private symbols the tests import -- resolve to the SAME live module object)
and re-exports the top-level names. Keeps all legacy import paths alive
until the step-13 flip.
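
The sys.modules aliasing mechanism can be sketched with throwaway modules. The demo_models.* names are placeholders built in memory for illustration; the real shim lives in a package __init__.py and aliases real submodules.

```python
import importlib
import sys
import types

# Stand-ins for the real package and one of its submodules.
parent = types.ModuleType("demo_models")
parent.__path__ = []  # mark as a package
real_pkg = types.ModuleType("demo_models.hnet")
real_pkg.__path__ = []
real_sub = types.ModuleType("demo_models.hnet.routing")
real_sub._private_helper = lambda: "live"  # private symbol a test might import
real_pkg.routing = real_sub

sys.modules["demo_models"] = parent
sys.modules["demo_models.hnet"] = real_pkg
sys.modules["demo_models.hnet.routing"] = real_sub

# The shim: register the SAME module objects under the legacy dotted names.
sys.modules["demo_models.hnet_nets"] = real_pkg
sys.modules["demo_models.hnet_nets.routing"] = real_sub

# Legacy import paths now resolve straight from the sys.modules cache.
legacy_sub = importlib.import_module("demo_models.hnet_nets.routing")
legacy_pkg = importlib.import_module("demo_models.hnet_nets")
```

Because both names point at one object, module-level state and private symbols are shared rather than duplicated, which a plain re-export of top-level names alone would not guarantee for submodule-path imports.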

Step 4 (flip hnet_core import):

egomimic/models/bc_rnn_nets/hnet_core.py: imports flipped from
bc_rnn_nets._hnet_vendored.{context,hnet,stages} -> egomimic.models.hnet.*.
Superset extra flags (cross-attn / AdaLN / window) all default OFF, so the
obs-only HNetCore never touches the diverged paths.
Update bc_rnn_nets/__init__.py docstring to reflect the collapse.

Verification (A40, fixed-seed HNetCore forward, baseline captured in a
worktree at the pre-step-3 commit using _hnet_vendored vs post-flip using
models.hnet):

BIT-IDENTICAL: state_dict_sha256 560955..a197dc0 and output_sha256
772f6f..f6fe005 match exactly pre/post; 12,163,840 params / 121 tensors.
tests/test_hnet_nets.py: 57 passed (imports via shim).
All 7 BC-RNN paperexact configs compose (hydra --cfg job).
20/20 hnet_nets consumers import clean (DFoT family, algo/hnet family,
bc_rnn, act/hpt, both callbacks) via the shim.
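
A fingerprint check of this kind can be sketched as below. It is a stand-in under assumptions: state_fingerprint is a hypothetical helper and hashes lists of floats, whereas the real check would hash tensor bytes from a state_dict.

```python
import hashlib
import struct
from typing import Dict, List


def state_fingerprint(state: Dict[str, List[float]]) -> str:
    """SHA-256 over parameter names and raw float64 bytes, in sorted key order."""
    h = hashlib.sha256()
    for name in sorted(state):  # fixed order, independent of insertion order
        h.update(name.encode("utf-8"))
        h.update(struct.pack(f"<{len(state[name])}d", *state[name]))
    return h.hexdigest()


baseline = {"w": [1.0, 2.0], "b": [0.5]}
candidate = {"b": [0.5], "w": [1.0, 2.0]}  # same values, different key order
drifted = {"w": [1.0, 2.0000001], "b": [0.5]}

same = state_fingerprint(baseline) == state_fingerprint(candidate)
changed = state_fingerprint(baseline) != state_fingerprint(drifted)
```

Comparing digests captured before and after a move is stricter than a tolerance check: any single-bit change in any parameter changes the hash.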

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: resolve BC-RNN algos -> flat algo/bc.py + WindowedBC (DESIGN step 5, amended)

Amended step 5 (user override of DESIGN's algo/bc/ package): the active BC
algo stays ONE FLAT FILE -- no package, no algo/bc/ directory. Backbone
(lstm/transformer/hnet) switching stays purely config-side via the existing
core_net knob.

Changes:

git mv egomimic/algo/bc_rnn.py -> egomimic/algo/bc.py (flat module; history
preserved).
Rename classes BCRNN -> WindowedBC, BCRNNPolicy -> WindowedBCPolicy. Old
names kept as module-level aliases (BCRNN = WindowedBC, BCRNNPolicy =
WindowedBCPolicy) -> same class objects, so isinstance/pickle/Hydra resolve
unchanged. Internal instantiation + name-bearing error/doc strings updated.
Add egomimic/algo/bc_rnn.py import shim re-exporting the full public surface
from egomimic.algo.bc (BCRNN/BCRNNPolicy/WindowedBC/WindowedBCPolicy +
_cut_windows/_cut_windows_strided/_pack_to_padded), so the legacy import path
and any _target_: egomimic.algo.bc_rnn.BCRNN config keep resolving.
Repoint the 7 BC-RNN configs' _target_ to egomimic.algo.bc.WindowedBC
(old egomimic.algo.bc_rnn.BCRNN still works via shim + alias).
Quarantine the OTHER, name-colliding duplicate algo to scratch/ (git mv,
history preserved): egomimic/algo/bcrnn/{__init__,algo,outer_stage}.py +
its only config egomimic/hydra_configs/model/bcrnn_pushshapes.yaml ->
scratch/algo_bcrnn/. That dup is a separate robomimic-BC_RNN reimpl on the
Algo/OuterStage spine, not wired into the kept pipeline, with a stale config
(references pre-collapse egomimic.models.hnet_nets.* paths). Logged in
scratch/MANIFEST.md with a REQUEST-DELETE entry (user's call; not auto-deleted).
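
The alias mechanism behind the rename can be sketched in a few lines. The class body is a toy; only the names WindowedBC and BCRNN come from the commit.

```python
class WindowedBC:
    """Toy stand-in for the renamed algo class."""

    def __init__(self, horizon: int):
        self.horizon = horizon


# Legacy name bound to the SAME class object, not a subclass or a copy.
BCRNN = WindowedBC

legacy_instance = BCRNN(horizon=8)
```

Since the alias is the same object, isinstance checks pass in both directions and the class reports its new name; anything that resolves the old name by attribute lookup gets the new class.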

Verification (a40 compute node, sibling .venv):

import egomimic + egomimic.algo.{bc,bc_rnn,dfot,hpt,act,hnet,algo} clean;
WindowedBC is an HNet subclass; aliases are object-identical; shim re-exports
the same objects.
All 7 BC-RNN configs compose (hydra --cfg job) -> _target_:
egomimic.algo.bc.WindowedBC.
LSTM policy built via OLD target (egomimic.algo.bc_rnn.BCRNN, shim+alias)
AND NEW target (egomimic.algo.bc.WindowedBC) under a fixed seed:
state_dict torch.equal across all 137 tensors (23,511,752 params).
No dangling egomimic.algo.bcrnn refs in the active tree; no untracked
non-ignored files.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: relocate bc_rnn_nets to role homes + tests suite (DESIGN steps 6, 6.5)

DESIGN.md step 6: git mv the bc_rnn_nets members to their role homes and keep a
bc_rnn_nets/__init__ FACADE re-exporting everything from the new homes (the old
import paths + yaml _target_ submodule paths stay alive until step 13).

models/stems/ obs_encoder.py, visual_core.py
models/cores/ lstm_core.py, transformer_core.py, hnet_core.py
models/heads/ gmm_head.py, query_decoder.py

All 7 moves are R100 (pure git mv, content byte-identical). The facade aliases
each legacy submodule into sys.modules under egomimic.models.bc_rnn_nets.<sub>
(same mechanism as the step-3 hnet_nets shim) so package-name imports, submodule
imports, and yaml _target_ paths all resolve to the SAME role-home module
objects. Removed the leftover empty _hnet_vendored/ dir from the step-3 collapse.

Flipped the 7 BC-RNN configs' _target_s to the role paths (ObsEncoder ->
stems.obs_encoder, visual_core.VisualCore -> stems.visual_core, LSTMCore ->
cores.lstm_core, TransformerCore -> cores.transformer_core, HNetCore ->
cores.hnet_core, GMMActionHead -> heads.gmm_head, QueryActionDecoder ->
heads.query_decoder).

DESIGN.md amendment 6.5: distill the session proof patterns into a pytest suite
(GPU-alloc runnable; all force CPU for determinism):
tests/test_core_defaults_byte_identical.py -- lstm/tx/hnet construct +
torch.equal across two fixed-seed builds + match committed ref fingerprints.
tests/test_causality.py -- TX + HNet prefix-consistency; TX future-perturb
no-leak; query-decoder future-perturbation EXACT-ZERO (torch.equal).
tests/test_train_rollout_parity.py -- forward vs sequential step() for all 3
cores + the chunk8 query-decoder queue replay.
tests/test_config_compose.py -- all 7 BC-RNN (+ legacy-path assertion) + 13
dfot + 5 vae configs compose through train_zarr_cartesian.
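
The three proof patterns (prefix consistency, future-perturbation no-leak, forward vs sequential step parity) can be sketched on a toy causal recurrence. The model below is an assumption chosen for illustration, not one of the repository's cores.

```python
from typing import List, Tuple


def forward(xs: List[float]) -> List[float]:
    """Toy causal model: h_t = 0.5 * h_{t-1} + x_t; returns every h_t."""
    out, h = [], 0.0
    for x in xs:
        h = 0.5 * h + x
        out.append(h)
    return out


def step(h: float, x: float) -> Tuple[float, float]:
    """Single-tick version of the same recurrence: (new state, output)."""
    h = 0.5 * h + x
    return h, h


def rollout(xs: List[float]) -> List[float]:
    """Sequential inference: feed one input per tick through step()."""
    out, h = [], 0.0
    for x in xs:
        h, y = step(h, x)
        out.append(y)
    return out


xs = [1.0, -2.0, 0.5, 3.0]
perturbed = xs[:3] + [99.0]  # change only the final (future) input

prefix_ok = forward(xs)[:3] == forward(xs[:3])       # prefix consistency
no_leak = forward(xs)[:3] == forward(perturbed)[:3]  # future does not leak back
parity = forward(xs) == rollout(xs)                  # train vs rollout parity
```

Exact equality is a fair bar here because both paths run the same floating-point operations in the same order, mirroring the torch.equal checks in the suite.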

Verification (a40 alloc 3325503): 40/40 new tests GREEN; 57/57 test_hnet_nets
GREEN; all 7 BC-RNN configs --cfg job compose; import egomimic + algo.bc +
models.hnet + hnet_nets shim + algo.dfot clean. The 8 reds in the wider suite
(test_training_recipe::TestAlgoWiring algo.hnet outer_stage signature drift;
test_packed_pipeline missing on-disk zarr) are pre-existing -- reproduced
identically at the step-5 commit ccff845, untouched by this move.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: relocate DFoT pieces -> models/diffusion + algo/diffusion (DESIGN step 7)

Split egomimic/algo/dfot into its model + algo halves via git mv:

model pieces -> egomimic/models/diffusion/
backbones/{backbone,dit3d_backbone,spatial_backbone}
diffusion/{continuous_diffusion,discrete_diffusion,noise_schedule}
embeddings.py, sampling.py
algo pieces -> egomimic/algo/diffusion/
algo.py, outer_stages/{outer_stage + 9 *_outer_stage}
vae_algo.py (was egomimic/algo/vae/algo.py)

Intra-tree imports rewritten to the new role homes (backbones import
models.diffusion.embeddings + models.hnet.isotropic_builder; algo imports
models.diffusion.{backbones,diffusion,sampling} + models.hnet.cond_encoders).

algo/dfot/__init__ and algo/vae/__init__ kept ALIVE as thin facades: every
legacy egomimic.algo.dfot.<sub> / egomimic.algo.vae.algo dotted path is
registered in sys.modules pointing at the real relocated module, so the yaml
_target_s (algo.dfot.DFoT, algo.dfot.outer_stage.DFoTOuterStage,
algo.dfot.{continuous,discrete}_diffusion.*, algo.vae.VAE) all still resolve.
Shim identity verified (algo.dfot.DFoT is algo.diffusion.DFoT).

Verify: 25/25 test_config_compose (13 DFoT + 5 VAE + 7 BC-RNN); import
egomimic + DFoT/VAE relocation imports clean.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: home zoo + curate eval into role buckets (DESIGN step 8)

Zoo (git mv, no behaviour change):

egomimic/algo/{act,hpt,pi}.py -> egomimic/algo/zoo/{act,hpt,pi}.py
flat egomimic/algo/{act,hpt,pi}.py kept as thin re-export shims so the
yaml _target_s (egomimic.algo.act.ACT / .hpt.HPT / .pi.PI) still resolve.
algo/zoo/__init__ lazy-imports PI (optional openpi dep).
co-located algo/test_pi.py moved with pi.py -> algo/zoo/test_pi.py (kept
out of the tests/ suite, as before: it requires openpi).

Eval curated into egomimic/eval/{core,tf,dfot,probes,zoo}/ (git mv):

core/ eval, eval_video, eval_composite, eval_sim, eval_hnet, eval_vae_recon
tf/ eval_dfot_val, eval_dfot_controller_tf
dfot/ the 7 DFoT self/video/policy rollout evaluators
probes/ eval_boundary_strip, eval_pca_tokens
zoo/ eval_act, eval_hpt, eval_pi
Inter-eval imports rewritten to the bucketed paths; the dfot evals' imports of
the DFoT model pieces flipped to canonical egomimic.models.diffusion.* (off
the algo.dfot shim).

EDITED the ~20 evaluator-yaml _target_s DIRECTLY to the bucketed eval paths
(DESIGN warns _target_ resolution is weaker through __init__ shims), e.g.
egomimic.eval.eval_sim.PackedSimEval -> egomimic.eval.core.eval_sim.PackedSimEval.

eval/__init__ kept ALIVE as a facade: every legacy egomimic.eval.eval_*
PYTHON import path is sys.modules-aliased to its bucketed module, so the code
consumers (trainHydra: egomimic.eval.eval.Eval; scripts/ smoke+verify helpers)
keep working until the final flip (step 13).

Verify: 25/25 test_config_compose; 7 BC-RNN compose via hydra --cfg job;
20/20 evaluator yamls compose + every egomimic.eval.* target resolves;
import egomimic + DFoT/zoo/eval-bucket imports clean; full suite unchanged
from baseline (8 pre-existing fails: 7 TestAlgoWiring sig drift + 1 data-missing
packed_pipeline; no new failures).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

restructure: close BC-RNN sim-eval gap (DESIGN steps 9-10)

Step 9 (get_keymap_eval):

Add get_keymap_eval() to egomimic/rldb/embodiment/pushshapes.py: get_keymap()
plus a goal_pose passthrough keyed key_type="goal_keys". goal_keys is NOT in
MultiDataset.NORMALIZE_KEY_TYPES, so goal_pose is read into the packed batch
raw/un-normalized and passed straight through to PackedSimEval, which reads
batch["goal_pose"] in batch_to_env_init to set the env goal. Serves both
circle proxies. The 7 BC-RNN launchers already point KM at this symbol.
Deviation from the EgoVerse2 reference: omit the extra init_action passthrough.
pact-2's eval uses init_mode="replay" and never reads init_action (verified: no
reference in egomimic/eval or egomimic/algo), so it would be dead weight. The
~17-line design target counted EV2's init_action; documented inline.
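
The passthrough mechanism can be sketched as below. Everything structural here is an assumption made for illustration: the keymap layout (batch key -> {"key_type": ...}), the contents of NORMALIZE_KEY_TYPES, the "obs_keys"/"action_keys" names, and the scalar mean/std normalizer. Only goal_pose, the "goal_keys" key type, and its absence from the normalize set come from the commit; the image key is omitted for brevity.

```python
from typing import Dict, List

# Assumed contents; the source only states that "goal_keys" is NOT in this set.
NORMALIZE_KEY_TYPES = {"obs_keys", "action_keys"}


def get_keymap() -> Dict[str, Dict[str, str]]:
    # Hypothetical layout: batch key -> metadata carrying a "key_type".
    return {
        "state_agent_obj": {"key_type": "obs_keys"},
        "actions": {"key_type": "action_keys"},
    }


def get_keymap_eval() -> Dict[str, Dict[str, str]]:
    keymap = dict(get_keymap())
    keymap["goal_pose"] = {"key_type": "goal_keys"}  # raw passthrough entry
    return keymap


def normalize_batch(batch: Dict[str, List[float]],
                    keymap: Dict[str, Dict[str, str]],
                    mean: float, std: float) -> Dict[str, List[float]]:
    out = {}
    for key, values in batch.items():
        if keymap[key]["key_type"] in NORMALIZE_KEY_TYPES:
            out[key] = [(v - mean) / std for v in values]
        else:
            out[key] = list(values)  # left un-normalized, e.g. goal_pose
    return out


batch = {"actions": [2.0, 4.0], "goal_pose": [2.0, 4.0]}
result = normalize_batch(batch, get_keymap_eval(), mean=2.0, std=2.0)
```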

Step 10 (close sim-eval gap):

Add T_max=None kwarg to WindowedBC.inference_step so the PackedSimEval call
inference_step(obs_zarr, t, emb_id, T_max=self.max_steps) (eval_sim.py:251) no
longer TypeErrors. T_max is the sim rollout horizon, a different quantity from
the policy's action-queue length, so the internal init_step_state buffer is
still sized from policy.action_horizon; T_max is accepted/tolerated to match
the eval contract (matches DFoT.inference_step, which already takes T_max).
Strip the unsupported evaluator.rollout_mode=ar override from all 7 BC-RNN
launchers. eval_hnet_sim.yaml (HNetSimEval/PackedSimEval) has no rollout_mode
key and drives AR natively via the per-token inference_step, so the override
raised ConfigAttributeError. No other eval-only override (delta_action/
temporal_ensemble/chunk_k/goal_in_obs) remains in the launchers.
Fix WindowedBC missing train_obs_transforms (the actual blocker on the
headline). WindowedBC.__init__ calls Algo.__init__ (not HNet.__init__), so
the inherited HNet.process_batch_for_training (hnet.py:889) hit
AttributeError: 'WindowedBC' object has no attribute 'train_obs_transforms'
on BOTH the train and the validation paths, before any rollout. Initialize
self.train_obs_transforms = [] in WindowedBC.__init__: the empty list makes
the if self.train_obs_transforms and self.outer_stage.training guard
short-circuit, also sidestepping the (nonexistent) outer_stage. WindowedBC
has no train-only obs augmentation, so [] is the correct value. Pre-existing
latent bug from the H-Net restructure (steps 3-8); surfaced only now because
this is the first time BC-RNN sim eval actually runs.
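
Two of the fixes above, the tolerated T_max kwarg and the short-circuiting guard, can be sketched with a toy class. WindowedPolicy and its method bodies are hypothetical; only the guard expression, the attribute name train_obs_transforms, and the T_max=None default follow the commit text.

```python
class WindowedPolicy:
    """Toy stand-in for an algo whose __init__ skips the parent's setup."""

    def __init__(self, action_horizon: int = 4):
        self.action_horizon = action_horizon
        # Empty list: there is no train-only observation augmentation.
        self.train_obs_transforms = []
        # Note: no `outer_stage` attribute is ever created on this class.

    def process_batch_for_training(self, batch: dict) -> dict:
        # `[]` is falsy, so `and` short-circuits before `self.outer_stage`
        # is evaluated; otherwise that lookup would raise AttributeError.
        if self.train_obs_transforms and self.outer_stage.training:
            for transform in self.train_obs_transforms:
                batch = transform(batch)
        return batch

    def inference_step(self, obs, t: int, emb_id: int, T_max=None):
        # T_max (sim rollout horizon) is accepted to match the evaluator's
        # call; the internal buffer is still sized by action_horizon.
        return [0.0] * self.action_horizon


policy = WindowedPolicy()
batch = policy.process_batch_for_training({"obs": [1, 2, 3]})
action_queue = policy.inference_step(obs=None, t=0, emb_id=15, T_max=8)
```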

Verification:

All 7 BC-RNN configs compose (hydra --cfg job) with evaluator=eval_hnet_sim +
get_keymap_eval KM and no rollout_mode=ar.
import egomimic + DFoT (algo.diffusion) + zoo (algo.zoo.hpt) import clean;
WindowedBC/BCRNN alias intact.
get_keymap_eval() returns keys [front_img_1, state_agent_obj, actions,
goal_pose].
HEADLINE: a 1-batch REAL closed-loop sim eval (mode=eval, trainer.validate,
bc_rnn_pushshapes_paperexact, max_steps=8, 1 val batch, random-init weights)
ran end-to-end on an A40 and LOGGED A COVERAGE NUMBER for the first time:
Valid/emb15_sim_coverage = 0.0
Valid/emb15_sim_success_rate = 0.0
EVAL_EXIT=0. Coverage 0.0 is expected for an untrained model; the point is
the eval stack now runs the full loop through inference_step(...,T_max=) and
the goal_pose passthrough. (mode=eval used to bypass the unrelated training
loop; the WindowedBC fix above was required to get past validation_step.)
tests/: 122 passed, 3 skipped; the 8 failures (TestAlgoWiring x7 +
test_packed_pipeline full-pipeline-stats) are PRE-EXISTING at HEAD (proven by
re-running with these changes stashed) - HNet.__init__ now requires an
outer_stage arg the older fixtures don't pass. Tracked separately.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

structural fixes: input_modules->stems, packed_base rename, flat_fused quarantine

PHASE 1 structural fixes (post-DESIGN step 10), all pure moves + compat shims:

git mv egomimic/algo/input_modules.py -> egomimic/models/stems/input_modules.py
(DESIGN stems role home). Fixed its internal import to the canonical
post-collapse home models.hnet.cond_encoders (was models.hnet_nets.*).
Compat shim left at algo/input_modules.py re-exporting all 3 classes; updated
direct importers (algo/packed_base.py, algo/hnet_outer_stage.py) and the two
obs_ar config target paths to the new home.
git mv egomimic/algo/zoo/test_pi.py -> tests/test_pi.py and guarded it with
pytest.importorskip("openpi") so it SKIPS cleanly (was a collection ERROR; the
PI algo needs the optional openpi pkg, absent in the default venv).
Quarantined dormant B-family flat-fused legacy -> scratch/flat_fused_quarantine/
(flat_fused_outer_stage.py + 3 hnet_pushshapes_fused*.yaml), unreferenced by
pact-2 mission. HNetFused stays as dormant dead code in packed_base.py;
MANIFEST.md + REQUEST-DELETE entry added.
git mv egomimic/algo/hnet.py -> egomimic/algo/packed_base.py (role-clarifying:
per-emb-norm + packed-path base, NOT the models/hnet/ stage tree). Class names
unchanged (HNet stays HNet). Compat shim at algo/hnet.py re-exports the full
surface; updated direct importers (bc.py, test_training_recipe.py). Configs
keep using egomimic.algo.hnet.* via the shim.

bc_rnn.py shim untouched (step 13). Verified: import smoke (shim identity ==
canonical), 7 hnet configs compose, tests/ at baseline (no new failures;
test_pi now skips cleanly).
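
The "shim identity == canonical" smoke check can be sketched roughly as follows. This is a minimal illustration, not the repo's actual check: the module names canonical_home and legacy_shim and the class ObsStem are hypothetical stand-ins built in memory, whereas the real shim is a file that re-exports from the new home.

```python
import sys
import types

# Hypothetical canonical module standing in for the new role home.
canonical = types.ModuleType("canonical_home")


class ObsStem:  # hypothetical class name, for illustration only
    pass


canonical.ObsStem = ObsStem
sys.modules["canonical_home"] = canonical

# Hypothetical compat shim: its whole body is a re-export of the canonical name.
shim = types.ModuleType("legacy_shim")
exec("from canonical_home import ObsStem", shim.__dict__)
sys.modules["legacy_shim"] = shim


def shim_is_canonical(shim_name: str, canonical_name: str, attr: str) -> bool:
    """True when both import paths yield the very same object (not a copy)."""
    old = getattr(sys.modules[shim_name], attr)
    new = getattr(sys.modules[canonical_name], attr)
    return old is new
```

Identity (is) rather than equality is the useful test here: isinstance checks and pickled class paths only keep working if the old path hands back the same class object.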

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

step 13: final flip — shims deleted, configs mirrored, dormant purge, packed_outer_stage rename

DESIGN.md step 13 + amendments A & B. algo/ END-STATE: no shims, no dormant
code, honest names everywhere.

Config mirror: every shim-routed target / import flipped to its real home
across all hydra configs + non-config importers (66 mechanical + 9 manual).
algo.hnet.* -> algo.packed_base.*
algo.bc_rnn.* -> algo.bc.*
algo.{act,hpt,pi}.* -> algo.zoo.*
algo.input_modules.* -> models.stems.input_modules.*
algo.dfot.* -> algo.diffusion.* / models.diffusion.*
algo.vae.* -> algo.diffusion.{VAE,vae_algo}
models.hnet_nets.* -> models.hnet.*
models.bc_rnn_nets.* -> models.{stems,cores,heads}.* (role-routed)
Shims DELETED (grep-proven empty first): algo/{act,hpt,pi,bc_rnn,hnet,
input_modules}.py, algo/dfot/, algo/vae/, models/hnet_nets/__init__.py,
models/bc_rnn_nets/__init__.py.
Amendment A: FlatFusedPolicy + HNetFused purged from packed_base.py
(1239 -> 935 lines, -304) into scratch/flat_fused_quarantine/.
Amendment B: git mv algo/hnet_outer_stage.py -> algo/packed_outer_stage.py;
importers + 7 hnet configs flipped same commit, no shim.
Hygiene: __pycache__/ gitignored; tests/ import real homes directly.

Verify (a40 alloc): config compose 37/37 PASS; tests 122 pass / 8 pre-existing
fail (identical to pre-flip baseline, ZERO new); state_dict parity LSTM+HNet+
chunk8-Q all torch.equal vs pre-flip; SMOKE=1 train_bc_rnn_hnet.sh TRAIN_EXIT=0.
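
The mechanical part of the config mirror amounts to a prefix rewrite of every _target_ string. The sketch below is an assumption about how such a rewrite could look, not the tool that was used: it covers only four of the prefixes listed above, takes the egomimic. root from the dotted paths quoted elsewhere in this PR, and operates on plain nested dicts/lists rather than real hydra configs.

```python
from typing import Any

# Subset of the old -> new prefixes from the list above (illustrative only).
PREFIX_MAP = {
    "egomimic.algo.hnet.": "egomimic.algo.packed_base.",
    "egomimic.algo.bc_rnn.": "egomimic.algo.bc.",
    "egomimic.algo.input_modules.": "egomimic.models.stems.input_modules.",
    "egomimic.models.hnet_nets.": "egomimic.models.hnet.",
}


def flip_target(path: str) -> str:
    # Longest prefix first so a more specific rule wins over a shorter one.
    for old in sorted(PREFIX_MAP, key=len, reverse=True):
        if path.startswith(old):
            return PREFIX_MAP[old] + path[len(old):]
    return path


def flip_config(node: Any) -> Any:
    """Return a copy of a nested dict/list config with every _target_ flipped."""
    if isinstance(node, dict):
        return {
            k: flip_target(v) if k == "_target_" and isinstance(v, str)
            else flip_config(v)
            for k, v in node.items()
        }
    if isinstance(node, list):
        return [flip_config(v) for v in node]
    return node
```

The trailing dot in each key matters: without it, algo.hnet would also match algo.hnet_outer_stage, which is renamed separately.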

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 1: unify 3 pixel-policy DFoT outer stages into one pixel_mode-parameterized stage

Collapse the three near-duplicate pixel-policy outer-stage classes under
egomimic/algo/diffusion/outer_stages/ into ONE parameterized class
PixelObsActionDFoTOuterStage, selected by the pixel_mode config knob:

pixel_mode="policy" <- PixelObsActionPolicyDFoTOuterStage (Design A:
action broadcast into RGB channels, jointly diffused)
pixel_mode="regress" <- PixelObsActionRegressPolicyDFoTOuterStage (Design B:
RGB-only diffusion + conv action_head off pred x0)
pixel_mode="decoupled" <- PixelObsActionDecoupledDFoTOuterStage (DEC: action as
separate DiT3D token with independent noise level)

Each mode reproduces the corresponding old class EXACTLY and preserves the
duck-typed attribute surface the algo inference paths consume (_action_channels
for policy, action_head for regress, decouple_action_noise for decoupled,
plus the mode-correct action_slice). The 3 model configs are mirrored in this
same commit: _target_ -> PixelObsActionDFoTOuterStage + pixel_mode: <mode>.
Old class files moved to scratch/dedup_c1_old_stages/ (gitignored).
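
As a rough picture of what "one class, three modes, same duck-typed surface" means, here is a minimal sketch. UnifiedStageSketch is a hypothetical stand-in, not PixelObsActionDFoTOuterStage itself; the attribute names are the ones listed above, the action_slice values are the ones quoted in the parity report for this commit, and the placeholder attribute value is an assumption.

```python
# Attribute each mode exposes to the inference path, as listed in the text.
MODE_SURFACE = {
    "policy": "_action_channels",
    "regress": "action_head",
    "decoupled": "decouple_action_noise",
}

# action_slice per mode, as quoted in the parity report.
MODE_ACTION_SLICE = {
    "policy": slice(3, 5),
    "regress": slice(0, 0),
    "decoupled": slice(0, 0),
}


class UnifiedStageSketch:
    """Hypothetical stand-in for a pixel_mode-parameterized outer stage."""

    def __init__(self, pixel_mode: str):
        if pixel_mode not in MODE_SURFACE:
            raise ValueError(f"unknown pixel_mode {pixel_mode!r}")
        self.pixel_mode = pixel_mode
        self.action_slice = MODE_ACTION_SLICE[pixel_mode]
        # Placeholder value; the real attribute holds mode-specific state.
        setattr(self, MODE_SURFACE[pixel_mode], True)
```

Only the attribute for the selected mode is set, so callers that probe with hasattr still see the same surface the corresponding old class exposed.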

PROVEN behavioral equality (fixed seed, a40, srun on overcap alloc; harness at
scratch/proof_dedup_c1.py, all old vs new instantiated from the SAME resolved
sub-configs with identical RNG):

(a) Fixed-seed construction parity — state_dict keys identical AND every tensor
torch.equal:
policy: 82 keys, 7,325,460 params, keys_identical=True, all torch.equal
regress: 90 keys, 7,397,134 params, keys_identical=True, all torch.equal
decoupled: 87 keys, 7,322,894 params, keys_identical=True, all torch.equal
action_slice old==new for every mode (policy slice(3,5); regress/decoupled
slice(0,0)); mode attribute surface present on the unified class.

(b) Forward parity on a fixed-seed packed batch (2 episodes, T=9) — every
output tensor torch.equal (EXACT, no allclose fallback needed):
policy: forward_return, pred_v, qstate_x_t -> torch.equal
regress: forward_return, pred_v, pred_action, loss, x_t -> torch.equal
decoupled: forward_return, pred_v, qstate_x_t, loss -> torch.equal

(c) Hydra-compose of all 3 mirrored configs PASS; composed outer_stage._target_
resolves to egomimic.algo.diffusion.PixelObsActionDFoTOuterStage with the
correct pixel_mode each.

Regression: pytest tests/ = 122 passed / 8 failed / 4 skipped — identical to the
step-13 baseline. The 8 failures are pre-existing and unrelated (7 TestAlgoWiring
old-HNet-signature, 1 packed_pipeline missing-zarr-data). All 25 config-compose
tests pass, including the 3 mirrored pixel configs. Zero new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 2: move SimpleConv + CondEncoderModule into models/stems/ (models/hnet now pure chunking machinery)

Image-encoder consolidation. Relocates the two input-side encoder modules out
of egomimic/models/hnet/ (which is meant to hold ONLY H-Net chunking
machinery) into their role home egomimic/models/stems/:

git mv egomimic/models/hnet/image_encoders.py egomimic/models/stems/image_encoders.py (SimpleConv)
git mv egomimic/models/hnet/cond_encoders.py egomimic/models/stems/cond_encoders.py (CondEncoderModule)

These are PURE MOVES — no logic edits. The only in-file content change is a
single docstring line in image_encoders.py whose _target_: example path was
updated hnet->stems. cond_encoders.py is byte-identical to its pre-move source.
All 46 references (13 python import sites + 33 yaml _target_ occurrences
across 20 model configs) flipped to the new stems path in this same commit;
hnet/__init__.py and stems/__init__.py re-exports updated; obs_encoder.py
docstring path corrected. After the move git grep shows models/hnet contains
no encoder/stem code (only a prose cross-reference comment in context.py).

PROVEN BEHAVIORAL EQUALITY (must function identically after the edit):
(a) Import-identity via a temporary shim, checked with Python's is operator;
the shim is then removed in this commit:
OldSimpleConv is NewSimpleConv -> True
OldCondEncoderModule is NewCondEncoderModule -> True
hnet.__init__-resolved CondEncoderModule is NewCondEncoderModule -> True
(b) Byte-identity modulo path lines: cond_encoders.py diff vs pre-move tag is
EMPTY; image_encoders.py diff is exactly ONE line (the _target_ docstring
example path).
(c) Construction state_dict torch.equal (old source extracted from tag
dedup-c2-pre vs new package source, fixed-seed init): all cases equal --
SimpleConv(4ch) 18 keys, SimpleConv(3ch) 14 keys, CondEnc+img 24 keys,
CondEnc+obs 10 keys, CondEnc(empty) 0 keys; plus forward torch.equal=True.
(d) Config mirror in this commit + compose-check: 20/20 affected configs
compose and instantiate the cond_encoder node (20 nodes) via new targets;
broader sweep 56/56 model configs compose, 0 failures.

Regression: pytest tests/ == 122 passed / 8 failed / 4 skipped, the SAME 8
pre-existing TestAlgoWiring + TestInferNormFromPacked failures present on tag
dedup-c2-pre (verified by running the suite on a worktree of the pre-move tag).
ZERO new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 3: factor shared zarr read/decode logic into rldb/zarr/_common.py

The two near-duplicate zarr loader paths — the padded/windowed reader
ZarrDataset.__getitem__ and the packed/span reader ZarrDataset._read_span
(consumed by ZarrEpisodePackedDataset) — each inlined byte-for-byte copies of
the same JPEG window decode, single-frame JPEG decode, JSON-array decode,
float32 tensorization, and embodiment tagging. ZarrActionExpertDataset._load_obs_at
held a third copy of the single-frame JPEG decode.

This collapse extracts that shared logic into the new module
egomimic/rldb/zarr/_common.py as five pure helpers:

decode_jpeg_single(buf) -> CHW float image in [0,1]
decode_jpeg_window(buffers) -> stacked (T,C,H,W), per-frame decode
decode_json_array(arr, fn) -> [fn(v) for v in arr]
tensorize_float32(data, *, skip_object_dtype) (the ONE predicate the two
loaders genuinely differ on: _read_span skips
object-dtype arrays, __getitem__ does not)
tag_embodiment(data, emb) -> stamps embodiment + metadata.robot_name

Each helper is a verbatim extraction of the pre-collapse loop body. The two
loaders now call the helpers and keep ONLY their genuine differences:
__getitem__ keeps its horizon-windowing + repeat-last padding + bounded
JPEG-fail resample loop; _read_span keeps its exact-span read + seq_len /
episode_idx metadata. Dead import simplejpeg removed from both loader files
(decode now lives in _common). No public API / signature changes: _read_span,
__getitem__, _load_obs_at keep identical signatures; the sole _read_span call
site (zarr_dataset_packed.py) and the _load_obs_at call sites are unchanged
(git grep verified — zero call-site edits needed).
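
Two of the five helpers are simple enough to sketch without the zarr/JPEG stack. The version below follows only the one-line contracts given above and is not the extracted code: the dict key names used by tag_embodiment and the use of a plain string for the embodiment value are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, Iterable, List


def decode_json_array(arr: Iterable[Any], fn: Callable[[Any], Any]) -> List[Any]:
    # Matches the stated contract: [fn(v) for v in arr].
    return [fn(v) for v in arr]


def tag_embodiment(data: Dict[str, Any], emb: str) -> Dict[str, Any]:
    # Assumed flat key names; the text only says the helper stamps
    # "embodiment" and "metadata.robot_name" onto the sample.
    data["embodiment"] = emb
    data["metadata.robot_name"] = emb
    return data


# Example: decode a column of JSON strings, then tag the sample.
sample = {"annotations": decode_json_array(['{"a": 1}', "[1, 2]"], json.loads)}
sample = tag_embodiment(sample, "example_robot")  # hypothetical embodiment name
```

Both loaders can call helpers like these unchanged, which leaves the loader bodies holding only the windowing or span logic that actually differs.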

PROVEN BEHAVIORAL EQUALITY (fixed fixture episodes from
/coc/flash7/paphiwetsa3/datasets/new_circle_3, a40 overcap alloc, srun):

(a) New permanent suite tests/test_loader_equality.py (6 tests) PASSES both
BEFORE the refactor (anchoring reference behavior captured from the
pre-collapse code at tag dedup-c3-pre) and AFTER:
- TestReferenceHashes: both loaders reproduce frozen sha256 reference
hashes of the decoded front_img_1 / state_agent_obj / actions tensors
captured from the pre-collapse code. Post-refactor hashes match
exactly:
front_img_1 68f20e3c2c5f72b0 | (290,3,96,96) f32
state_agent_obj 26a652f406d275d2 | (290,5) f32
actions 6f7a2b1ab531506b | (290,2) f32
- TestCrossLoaderEquality: padded full-window vs packed span reads are
torch.equal per-frame across 4 episodes (+ embodiment id identical).
- TestNormalizationPathEquality: MultiDataset.normalize applied to BOTH
loaders' outputs is torch.equal (proves the normalization path is
identical across loaders).
- TestPackMetadata: pack_collate emits the documented seq_lens /
cu_seqlens / max_seq_len / batch_size, and the concatenated per-frame
stream equals the per-span reads in order.
The suite auto-skips off-cluster (fixture-missing guard).
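
A frozen-reference check in the spirit of TestReferenceHashes can be sketched as below. How the real suite serializes tensors before hashing is not stated, so the byte layout here (shape repr followed by little-endian float32 values) is an assumption; only the sha256 digest and the 16-hex-character length match the hashes quoted above.

```python
import hashlib
import struct
from typing import Sequence, Tuple


def fingerprint(values: Sequence[float], shape: Tuple[int, ...]) -> str:
    """Short sha256 fingerprint of float32 data plus its shape."""
    payload = struct.pack(f"<{len(values)}f", *values)  # little-endian float32
    digest = hashlib.sha256(repr(shape).encode() + payload).hexdigest()
    return digest[:16]  # same length as the reference hashes quoted above


# Capture once on the pre-refactor code, then compare after the refactor.
reference = fingerprint([0.0, 0.5, 1.0], (3,))
assert fingerprint([0.0, 0.5, 1.0], (3,)) == reference
```

Including the shape in the hashed payload means a reshape that preserves the raw bytes still changes the fingerprint.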

(b) Direct old-vs-new bit-identity proof (scratch/proof_old_vs_new.py,
gitignored): the PRE-collapse loader modules extracted from tag
dedup-c3-pre via git archive and the POST-collapse live modules are run
on the same 3 episodes through BOTH the padded __getitem__ and packed
_read_span paths. Result: 33 key comparisons across 3 episodes x 2 paths,
every output tensor torch.equal(old, new) == True (front_img_1,
state_agent_obj, actions all exact).
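
For readers unfamiliar with the packing metadata that TestPackMetadata checks, a minimal sketch follows. It is not pack_collate: it works on plain Python lists, and the leading-zero cumulative-offset convention for cu_seqlens is an assumption borrowed from common variable-length attention APIs.

```python
from itertools import accumulate
from typing import Any, Dict, List, Sequence


def pack_metadata(episodes: Sequence[Sequence[Any]]) -> Dict[str, Any]:
    """Concatenate variable-length episodes and record their boundaries."""
    seq_lens = [len(ep) for ep in episodes]
    # cu_seqlens[i] is the offset where episode i starts; the last entry is
    # the total number of frames in the packed stream.
    cu_seqlens = [0] + list(accumulate(seq_lens))
    frames: List[Any] = [frame for ep in episodes for frame in ep]
    return {
        "frames": frames,
        "seq_lens": seq_lens,
        "cu_seqlens": cu_seqlens,
        "max_seq_len": max(seq_lens, default=0),
        "batch_size": len(episodes),
    }
```

Slicing frames[cu_seqlens[i]:cu_seqlens[i + 1]] recovers episode i, which is the "concatenated stream equals the per-span reads in order" property.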

Regression: pytest tests/ == 128 passed / 8 failed / 4 skipped. The 8 failures
are the SAME pre-existing failures present on tag dedup-c3-pre (verified by
running the suite on a worktree of the pre tag: 122 passed / 8 failed / 4
skipped — 7 TestAlgoWiring old-HNet-signature + 1 TestInferNormFromPacked
missing-zarr-data, all in code this commit does not touch). 128 = the baseline
122 + the 6 new equality tests. ZERO new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

docs: record dedup-campaign global acceptance gates (2026-06-06)

Append the dated dedup-campaign record to PORT_NOTES.md after the 3
behavior-preserving collapses (c1 f06330c pixel-DFoT outer-stage unify,
c2 32eb1fc hnet->stems encoder move, c3 c289657 zarr _common factor)
passed the global gates on alloc 3325596 (a40):

FULL compose sweep: 107/109 PASS (2 fails = PI viz configs, pre-existing
MissingConfigException on parent default, untouched by collapses).
pytest tests/: 128 passed / 8 failed / 4 skipped (8 = same pre-existing
TestAlgoWiring+InferNorm fails; +6 new c3 loader-equality tests). Zero new.
BC smoke (job 3325599) TRAIN_EXIT=0; DFoT 1-ep pixel smoke DFOT_TRAIN_EXIT=0.
NLL vs baseline: DFoT bit-identical (0.28878551721572876, delta=0.0);
BC delta ~1e-5, within 1e-3 gate.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 4: dead-code purge (5 of 6 zero-ref symbols; HNetPolicy retained — it is NOT dead)

Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ +
hydra_configs/ + tests/ + scripts/ (excl __pycache__/external/scratch/logs)
IMMEDIATELY before deletion to reconfirm zero LIVE refs.
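
A grep gate of this kind could be scripted roughly as follows. This is an illustrative sketch, not the command that was run: the function name live_refs, the file suffix filter, and the whole-word matching rule are assumptions, while the excluded directory names come from the description above.

```python
import re
from pathlib import Path
from typing import Iterable, List, Tuple

EXCLUDED_DIRS = {"__pycache__", "external", "scratch", "logs"}


def live_refs(symbol: str, roots: Iterable[Path],
              suffixes: Tuple[str, ...] = (".py", ".yaml")) -> List[Tuple[Path, int]]:
    """Return (file, line_number) for every whole-word hit outside excluded dirs."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits = []
    for root in roots:
        for path in sorted(root.rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            if EXCLUDED_DIRS & set(path.relative_to(root).parts):
                continue
            text = path.read_text(errors="ignore")
            for lineno, line in enumerate(text.splitlines(), 1):
                if pattern.search(line):
                    hits.append((path, lineno))
    return hits
```

The definition site itself counts as a hit, so a symbol is a deletion candidate when the only hits are its own definition and self-references. Every root that can import the symbol has to be passed in, scripts/ included.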

Deleted (proven zero live refs):

egomimic/models/diffusion_policy.py (whole file -> scratch; only self-ref)
egomimic/models/ddim_scheduler.py (whole file -> scratch; live DDIM is
diffusion/sampling.ddim_sample)
egomimic/models/hnet/_smoke_stages.py (whole file -> scratch; only its own
__main__ self-invoke)
algo/loss.py CompositeLoss + MSELoss (no _target_, no code ctor; live losses
are HNetLoss()/DFoTLoss() built in code)
algo/packed_base.py HNet._ar_rollout_packed (no caller; live eval is
forward_eval -> _teacher_forced_packed)
algo/packed_outer_stage.py HNetOuterStage.generate (dead; only ref was a docstring.
step/init_step_state KEPT — they ARE the
live closed-loop path, called at
packed_base.py policy.init_step_state/step)
pl_utils/pl_data_utils.py RLDBModule, DualDataModuleWrapper, DataModuleWrapper
(deprecated; live wrapper is
MultiDataModuleWrapper, ref'd by 21 files)

Also dropped now-unused imports (typing.Optional in packed_outer_stage.py;
typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors
that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml).
Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed.

SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed
HNetPolicy was zero-ref, but that grep excluded scripts/. The required
grep over scripts/ found 3 LIVE importers+instantiators:
scripts/smoke_packed_training.py (documented live tooling in CLAUDE.md L538)
scripts/test_mamba_regression.py (old-vs-new equivalence regression)
scripts/test_hnet_outer_stage.py (HNetPolicy-vs-HNetOuterStage equivalence smoke)
The grep proof-gate fails for HNetPolicy, so it is retained (its .generate
method stays with the class). All other 5 targets pass cleanly.

Proofs (run on own a40 alloc, repo's symlinked .venv):

import-smoke: import egomimic.algo, egomimic.models, egomimic.pl_utils -> IMPORT OK
retained symbols import: Loss/HNetLoss/DFoTLoss, MultiDataModuleWrapper,
HNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False)
deleted symbols confirmed gone (CompositeLoss/MSELoss absent; dead model files absent)
py_compile all edited .py: OK
pytest tests/: 128 passed / 8 failed / 4 skipped — IDENTICAL to baseline.
The 8 failures are the documented pre-existing set (7x TestAlgoWiring
old-HNet-signature: "HNet.__init__() missing 1 required positional argument:
'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr).
Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 5: hoist shared embodiment-key resolution + _build_obs + log_info onto base Algo

Three blocks of code were byte-identical (verified via diff, rc=0) across
the policy algos and are collapsed into a single home on the base Algo:

the per-embodiment key-resolution loop (for emb in self.domains: ...
resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via
norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194
== WindowedBC bc.py:699-724. Now Algo._resolve_embodiment_keys(norm_stats);
each subclass calls it from __init__.
_build_obs -- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255
(WindowedBC already inherited HNet's). Now defined once on Algo; the HNet
and DFoT overrides are deleted (inherited).
log_info -- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425.
The base Algo.log_info (formerly a NotImplementedError stub) now carries
this exact body as the shared default; the HNet/DFoT overrides are deleted.

Net -116/+83 lines. Behaviour is preserved by construction: the moved text is
identical, so every subclass resolves the same function object from Algo
(no subclass re-introduces a private copy). bc.py drops its now-unused
get_embodiment_id import; packed_base/DFoT keep theirs (still used elsewhere).
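
The "same function object" property can be checked directly with is. The sketch below uses hypothetical subclasses (PackedBase, DiffusionAlgo, PrivateCopy) and a stand-in _build_obs body; it only illustrates the kind of import-identity guard described for this commit.

```python
class Algo:
    def _build_obs(self, batch):
        # Stand-in body; the real method selects observation keys.
        return {k: v for k, v in batch.items() if k != "actions"}


class PackedBase(Algo):  # hypothetical stand-in for a subclass that inherits
    pass


class DiffusionAlgo(Algo):  # hypothetical stand-in for a second subclass
    pass


class PrivateCopy(Algo):
    def _build_obs(self, batch):  # a re-introduced duplicate, textually identical
        return {k: v for k, v in batch.items() if k != "actions"}


def inherits_from_base(cls, name: str, base=Algo) -> bool:
    """True when cls resolves `name` to the base class's own function object."""
    return getattr(cls, name) is getattr(base, name)
```

A textually identical override still fails the check, which is what lets a guard like this catch a private copy creeping back into a subclass.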

Proofs (run on a40 alloc 3325792, fixed seeds):

DFoT 1-epoch pixel smoke Train/Loss = 0.28878551721572876 -- BIT-IDENTICAL
to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter
path per the dedup_baseline manifest).
BC SMOKE=1 train: c5 Train/Loss = [1.3453816, 0.1752842]. A clean pre-c5
baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC
smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/
sim-eval RNG), and the c5 run lands CLOSER to the manifest values
[1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest|
=1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor's deltas fall within
the tree's own run-to-run replay band.
New permanent guard tests/test_embodiment_key_resolution_shared.py (3 tests):
asserts HNet/WindowedBC/DFoT all resolve the SAME Algo function objects
for _resolve_embodiment_keys / _build_obs / log_info (import-identity) and
produce byte-equal key sets + obs selection on a fixed norm_stats fixture.
pytest tests/ = 131 passed / 8 failed / 4 skipped. The 8 failures are the
documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature +
1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW
failures vs the 128/8/4 profile.
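
The BC smoke comparison above is plain arithmetic on the three reported loss pairs and can be reproduced as a worked check. The 1e-3 tolerance is the gate recorded in the campaign notes; the variable names are illustrative.

```python
manifest = [1.3453673, 0.1749004]
c5_run = [1.3453816, 0.1752842]
baseline_rerun = [1.3452528, 0.1739942]


def abs_deltas(run, reference):
    return [abs(a - b) for a, b in zip(run, reference)]


c5_delta = abs_deltas(c5_run, manifest)             # about [1.4e-5, 3.8e-4]
rerun_delta = abs_deltas(baseline_rerun, manifest)  # about [1.1e-4, 9.1e-4]

GATE = 1e-3  # tolerance gate from the campaign record
within_gate = all(d < GATE for d in c5_delta + rerun_delta)
```

On both rows the c5 run sits closer to the manifest than the clean-tree rerun does, so the refactor cannot be distinguished from ordinary run-to-run jitter.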

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 6: hoist 6 identical eval uint8 helpers -> eval/core/img_utils.img_chw_to_uint8

Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] ->
(H,W,C) uint8-in-[0,255] converter under three private names:

_img_chw_to_uint8 : video_rollout, pixel_video_rollout,
spatial_video_rollout, policy_action
_u8 : bundle_anchored
_to_uint8_hwc : eval/core/eval_vae_recon
Four were the 4-line clip->*255->transpose form; two (policy_action, _u8)
were the same logic as a 2-liner. All six now delegate to a single canonical
egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and
each call site renamed.
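
The clip -> *255 -> transpose logic can be illustrated without any array library. This sketch operates on nested Python lists purely to show the index mapping; the real helper works on arrays/tensors, and the truncating integer cast used here is an assumption about how the float-to-uint8 conversion rounds.

```python
from typing import List

ImageCHW = List[List[List[float]]]
ImageHWC = List[List[List[int]]]


def img_chw_to_uint8_sketch(img: ImageCHW) -> ImageHWC:
    """(C,H,W) floats in [0,1] -> (H,W,C) ints in [0,255], on nested lists."""
    channels, height, width = len(img), len(img[0]), len(img[0][0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # clip to [0,1], scale by 255, truncate; channels become the last axis
            row.append([int(min(max(img[c][y][x], 0.0), 1.0) * 255)
                        for c in range(channels)])
        out.append(row)
    return out
```

Out-of-range inputs are clipped before scaling, which is why the equality test described for this commit deliberately spans values outside [0,1].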

EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout
._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255).

PROOFS (a40 job 3325794, fixed seeds):

tests/test_eval_img_utils.py (NEW, permanent gate):
test_canonical_matches_every_original_body PASSED
np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed
torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2).
test_touched_eval_modules_import PASSED (import-smoke of all 7 modules)
full tests/: pre-c6 (stashed) = 8 failed / 131 passed / 4 skipped;
post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures
(7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked
missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW
failures. Behaviour-preserving: the 6 bodies were already identical.

Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 7: delegate image-only frame sampler to action-aware superset + hoist loss-reducer skeleton onto Algo

Two structural near-duplicates removed:

Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image
only) duplicated the fixed_window/start_to_end/random_subsample branch
logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+
action). The action version is a strict superset: its image cropping and
cu_seqlens are byte-identical whether or not actions are supplied (the
action branch performs zero extra RNG draws). Hoisted the superset onto the
parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced
_sample_frames_packed to a thin delegate
(_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from
the PixelObsAction subclass (now inherited).
Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment
{emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as
the default compute_losses and deleted DFoT's byte-identical override.
VAE / HNet / HPT / PI keep their own overrides because they are genuine
SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded.
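
The sum-per-embodiment skeleton is small enough to show in full. The sketch below works on plain floats rather than tensors and returns the per-embodiment entries alongside the total, both of which are assumptions; the key pattern follows the emb15_action_loss metric name that appears in the smoke logs.

```python
from typing import Dict


def compute_losses_sketch(per_emb: Dict[str, float]) -> Dict[str, float]:
    """Sum every '<emb>_action_loss' entry into a single 'action_loss'."""
    suffix = "_action_loss"
    total = sum(v for k, v in per_emb.items() if k.endswith(suffix))
    return {**per_emb, "action_loss": total}
```

An override that adds terms (for example a ratio loss) or divides by the number of domains no longer matches this default, which is why those algos keep their own compute_losses.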

PROOFS (a40 alloc 3325796, fixed seeds):

BEFORE-state probe: current image-only sampler vs current superset produce
torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True,
cu_equal=True for fixed_window / start_to_end / random_subsample).
Permanent test tests/test_c7_sampler_reducer_equality.py (3 tests, all pass):
- image-only sampler == superset(actions=None) torch.equal across all 3
  modes and all episode-length regimes (<n, ==n, >n);
- hoisted Algo.compute_losses torch.equal to legacy DFoT.compute_losses on
  fixed predictions;
- guard: HNet ratio_loss reducer is NOT reproduced by the base default
  (catches a future wrong fold of the superset overrides).
DFoT 1-epoch pixel-policy smoke (exercises the rewritten packed sampler,
frame_sampling=fixed_window, + the inherited reducer): Train/Loss =
Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT-
IDENTICAL (all 17 digits) to the pre-collapse baseline.
pytest tests/ = 136 passed / 8 failed / 4 skipped: the 8 failures are the
documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x
missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new
permanent equality tests.
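
The delegate pattern for the frame sampler can be sketched as below. The cropping rule (one random start per episode, up to n consecutive frames) is a hypothetical stand-in for the real sampling modes; what the sketch is meant to show is that the action branch reuses the already-drawn start index, so image crops and cu_seqlens do not depend on whether actions are passed.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_windows(images: Sequence[Sequence], actions: Optional[Sequence[Sequence]],
                   n: int, rng: random.Random) -> Tuple[List, Optional[List], List[int]]:
    """Crop up to n consecutive frames per episode; actions follow the same crop."""
    img_out: List = []
    act_out: Optional[List] = [] if actions is not None else None
    cu = [0]
    for i, ep in enumerate(images):
        # One RNG draw per episode, made whether or not actions are present.
        start = rng.randint(0, max(len(ep) - n, 0))
        img_out.extend(ep[start:start + n])
        if actions is not None:
            act_out.extend(actions[i][start:start + n])  # reuses start: no extra draw
        cu.append(len(img_out))
    return img_out, act_out, cu


def sample_frames(images: Sequence[Sequence], n: int, rng: random.Random):
    # Image-only entry point as a thin delegate to the superset.
    img_out, _, cu = sample_windows(images, None, n, rng)
    return img_out, cu
```

With identically seeded generators, sample_frames and sample_windows return the same image crops and offsets, which is the property the equality test in this commit asserts across modes and episode lengths.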

No hydra configs reference the touched methods (no config mirror needed).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

models/ hierarchy pass: relocate 7 loose files to role homes (cores/heads/stems/diffusion) + utils

End state: ls egomimic/models/*.py shows ONLY __init__.py. Every move is a
git mv (R-status); the 2 splits git-mv the file to its primary home first
(history-preserving) then extract the other-role classes into new files in this
same commit. All moved class bodies are byte-identical modulo import lines
(verified by class-body diff vs HEAD). Every importer + config target updated
in this commit; grep confirms zero remaining old-module dotted refs.

WHOLE-FILE MOVES (git mv):
fm_policy.py -> heads/fm_policy.py (policy output head)
denoising_policy.py -> heads/denoising_policy.py (diffusion policy head base)
denoising_nets.py -> diffusion/denoising_nets.py (legacy diffusion nets)
image_vae.py -> diffusion/image_vae.py (DFoT pixel<->latent codec)
preprocess_pi_obs.py-> utils/preprocess_pi_obs.py (data preprocessing, OUT of models/)

SPLITS (git mv to primary home + extract):
act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv)
+ cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder)
hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet)
+ cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/
MultiheadAttention/SimpleTransformer)
+ heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/
MultiBlockTransformerDecoder)

DEAD-CODE PRUNE (grep-proven 0 external refs):
hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper,
T5Encoder, L2Norm (also drops the heavy timm/transformers T5/ViT imports)
denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj

VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11):

import egomimic OK; every touched module imports OK (pi/rollout fail only on
pre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines).
py_compile OK for all 10 new/moved files.
Hydra compose-check passes for every model config whose target moved.
OLD-ckpt path map appended to scratch/hierarchy_path_map.txt (gates phase folds
it into PORT_NOTES).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

algo/ hierarchy pass: rename packed_base.HNet -> PackedAlgoBase

In-place class rename (no file moved): the packed-sequence policy Algo base
was misleadingly named HNet while it is actually the shared base for the
packed-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree
is supplied via outer_stage and is a separate, correctly-named concept in
models/hnet). Renamed the class to PackedAlgoBase to reflect its role.

Changes (all in one commit):

packed_base.py: class HNet(Algo) -> class PackedAlgoBase(Algo);
docstring updated; kept HNet = PackedAlgoBase compat alias at module
bottom (commented) so OLD ckpts/configs whose resolved target still names
egomimic.algo.packed_base.HNet keep resolving.
bc.py: import + class WindowedBC(PackedAlgoBase) + base-referring
docstring/comments updated.
algo/__init__.py: import-example comment updated.
7 model configs (hnet_pushshapes*.yaml): target ->
egomimic.algo.packed_base.PackedAlgoBase.
scripts/smoke_packed_{training_e2e,validation}.py: import (as HNetAlgo) +
Algo-method docstrings updated.
tests/{test_c7_sampler_reducer_equality, test_embodiment_key_resolution_shared,
test_training_recipe}.py: import + class refs updated.

OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore /
HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy
(landmine: proven alive), algo/obs_transforms.py (landmine: designed extension
point).

Verified on A40 alloc: import egomimic OK; PackedAlgoBase + HNet-alias
identity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet
configs compose with new target; old dotted path resolves via compat alias.
Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose
all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring
failures (old constructor signature, missing outer_stage) and ZERO new
failures vs the 136/8/4 baseline.
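
How a module-level alias keeps an old dotted path resolving can be sketched with an in-memory module. The module name packed_base_sketch and the resolve helper are hypothetical; the helper only mimics, in the simplest way, how a dotted target string is turned into a class.

```python
import importlib
import sys
import types

# Hypothetical in-memory module standing in for packed_base.py.
mod = types.ModuleType("packed_base_sketch")


class PackedAlgoBase:
    pass


mod.PackedAlgoBase = PackedAlgoBase
mod.HNet = PackedAlgoBase  # compat alias: the old name points at the same class
sys.modules["packed_base_sketch"] = mod


def resolve(dotted: str):
    """Resolve 'pkg.module.Name' to the object it names."""
    module_name, _, attr = dotted.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)
```

Because the alias is a second name for one class object, resolve("packed_base_sketch.HNet") and resolve("packed_base_sketch.PackedAlgoBase") return the identical class, and its __name__ reports the new name.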

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

utils/ hierarchy pass: junk-drawer split into pl_utils/vendored/models + dead-file purge

Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete
grep-proven dead files. One commit per the hierarchy-pass group rule; revertible
via tag pre-utils-hier.

RENAMES (git mv, R-status):
utils/timing_callback.py -> pl_utils/callbacks/timing_callback.py
utils/instantiators.py -> pl_utils/instantiators.py
utils/logging_utils.py -> pl_utils/logging_utils.py
utils/rich_utils.py -> pl_utils/rich_utils.py
utils/utils.py -> pl_utils/utils.py
utils/tensor_utils.py -> vendored/robomimic_tensor_utils.py (988-line verbatim
robomimic vendor; +vendored/README.md provenance note)

egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder —
constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*,
CameraTransforms, download_from_huggingface, STD_SCALE):
model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table,
reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK
drawing fns merged into utils/viz_utils.py (dependency FLIPPED — viz_utils now
OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/
ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame,
draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper).
All 11 moved bodies are byte-identical to originals (verified via AST diff
vs HEAD); only import lines differ.

model_utils.py placed in models/cores/ (not loose in models/) to keep the
models-group gate "models/ has only role dirs + __init__.py" intact.

DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/):
utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys,
0 refs), egomimic/__init__.pyc, egomimic/keypoints.jpeg.

Importers + config target updated in this same commit (grep-exhaustive over
egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/):
callbacks/defaults.yaml wandb_profiler._target_ -> pl_utils.callbacks.timing_callback
trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils),
pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py
(model helpers), pushshapes.py / eval_act.py / robot/rollout.py /
data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped.

Added pl_utils/__init__.py (was implicit ns pkg) so find_packages discovers it.

VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched
modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed
(all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped —
ZERO new failures vs baseline; test_config_compose 25/25; parent config composes
and moved callbacks target resolves to the new class.

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

rldb/ hierarchy pass: strays out + dead-file purge

Move (R-status rename, byte-identical modulo 2 fixed import lines):

egomimic/rldb/zarr/zarr_write_test.py -> egomimic/scripts/eva_process/zarr_write_test.py
(HDF5->zarr conversion CLI, not a test; already targets eva_process.
Fixed two stale imports as part of the move:
egomimic.rldb.zarr.ZarrWriter -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty __init__.py)
egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils (file renamed earlier))

Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/):

egomimic/rldb/compression_utils.py av/jpeg video codec, no importers
egomimic/rldb/data_utils.py slerp/ypr quat math, superseded by egomimic.utils.pose_utils
egomimic/rldb/zarr/benchmark_forward_pass.py dead benchmark script
egomimic/rldb/zarr/test_zarr.py broken: imports nonexistent egomimic.rldb.utils.S3RLDBDataset
egomimic/rldb/scripts/ (whole subpackage) nds_pq/str2bool/etc already live in egomimic.utils.egomimicUtils

Verified on a40 alloc: import egomimic OK, moved module imports OK,
deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed
(pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed.
No config target referenced any touched path.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

eval+pl_utils hierarchy pass: test_model_wrapper -> tests/, pushshapes sim-glue -> embodiment

Group: eval + pl_utils strays (hierarchy pass).

git mv egomimic/pl_utils/test_model_wrapper.py -> tests/test_model_wrapper.py
(R-status rename; 94% similar). tests/ has no __init__.py so pytest imp

Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20.
Includes:

- algo: input_modules.py + obs_transforms.py (new modules)
- callbacks: chunker_residual_scheduler, ckpt_chunker(+dropout), random_attn_dropout
- data configs: tsimulation_400ep, tsimulation_allep + tweaks to existing tsim configs
- model configs: hnet_pushshapes_mamba_encdec, hnet_pushshapes_obs_ar + tweaks
- eval/* edits, models/hnet_nets/* edits, schedulers, uv.lock
- removes scripts/install_cuda_kernels.sh and egomimic/eval/eval_hnet_sim.py

Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).

…screte diffusion + DDPM/DDIM samplers

…eys fallback)

… per-frame obs cond)

…rollout helper

…l); drop 400ep config

…ens defensive fix

…ckedSimEval rename, composite eval

…r); supports growing-T AR

- algo.py: add ar_inference_step_size (sub-steps per env tick at closed-loop AR); thread cfg_scale through _inference_step_ar/_inference_step_chunk; remove unused shape var
- backbone.py: force_uncond branch for CFG two-pass blending; wire attn/resid dropout into Isotropic trunk
- sampling.py: _CFGBackbone wrapper for cfg_scale > 1 sampling; schedule-matrix CFG plumbing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
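The two-pass blend that cfg_scale controls can be sketched as follows. This is a minimal illustration of textbook classifier-free guidance, not the code in backbone.py or sampling.py: `cfg_blend` and `toy_backbone` are hypothetical names, and the linear extrapolation formula is the standard one, assumed here rather than confirmed by the commit.

```python
from typing import Callable, List, Optional

Vector = List[float]


def cfg_blend(
    backbone: Callable[[Vector, Optional[Vector]], Vector],
    x_t: Vector,
    cond: Vector,
    cfg_scale: float,
) -> Vector:
    """Blend a conditional and an unconditional pass (textbook CFG, assumed)."""
    cond_out = backbone(x_t, cond)
    if cfg_scale == 1.0:
        # Scale 1 reduces to the plain conditional prediction; skip the second pass.
        return cond_out
    # Second pass with the conditioning dropped (the "force_uncond" idea).
    uncond_out = backbone(x_t, None)
    return [u + cfg_scale * (c - u) for c, u in zip(cond_out, uncond_out)]


def toy_backbone(x_t: Vector, cond: Optional[Vector]) -> Vector:
    """Hypothetical stand-in: conditioning just shifts the output."""
    shift = 0.0 if cond is None else sum(cond)
    return [x + shift for x in x_t]


blended = cfg_blend(toy_backbone, [1.0, 2.0], [0.5], cfg_scale=2.0)
```

With cfg_scale above 1 the result is pushed past the conditional output, away from the unconditional one, which is why a wrapper only needs to be active for cfg_scale > 1.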

…val viz

- model/dfot_pushshapes.yaml: scale to 67M (d_model=512, T=12, num_heads=8, d_intermediate=2048); attn dropout=0.1, resid dropout=0.1, cond_dropout_prob=0.1; cfg_scale field; causal=true
- data/tsimulation_full.yaml: 750-episode circle_750 dataset, batch_size=16
- evaluator/eval_dfot_val.yaml + eval_dfot_full.yaml: cfg_scale + ar_chunk_size + ar_step_size knobs
- callbacks/ckpt_attn_dropout.yaml: composed callback (checkpoints + random_attn_dropout with values [0.1, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98])
- eval/eval_dfot_val.py: 96x96 -> 512x512 nearest-neighbor upscale, palette (gt=green, chunk=red, ar=yellow), world-coord pixel scaling, threaded cfg_scale

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- sbatch_train_dfot_200ep_full750_pace.sh: minor cleanup
- sbatch_train_dfot_400ep_full750_pace.sh: 400ep H200 launch with scheduler.max_steps=18800 (fixes the 200ep cosine-not-decaying bug)
- sbatch_train_dfot_400ep_attndrop.sh: 400ep + random_attn_dropout (the 5.6x sim_coverage win)
- scripts/eval_cfg_latest.py: post-hoc DFoT eval CLI with --cfg-scale, --ar-chunk-size, --ar-step-size, --ar-inference-chunk-size, --ar-inference-step-size, --skip-val/--skip-sim
- scripts/eval_fsd_latest.py: convenience wrapper for inference_mode=chunk
- scripts/sbatch_fsd_eval.sh + sbatch_sim_sweep.sh: sbatch templates for closed-loop sim sweeps

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Introduces the abstract bases for the upcoming refactor:
- egomimic/algo/outer_stage.py: OuterStage base. Owns an inner_stage field (the trunk). Subclasses implement encode (raw batch -> trunk-input tensor; can sample noise / record state on ctx) and decode (trunk-output -> per-modality prediction keys on batch).
- egomimic/algo/loss.py: Loss base + CompositeLoss (weighted sum of terms) + MSELoss (per-modality MSE between pred_key and target_key). Loss policy becomes data (a hydra config block) rather than inheritance.

No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits on this branch.
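A rough sketch of how these bases might fit together, using plain Python lists instead of tensors so it runs without torch. Only the names OuterStage, Loss, CompositeLoss, MSELoss, inner_stage, encode, decode, pred_key and target_key come from the commit message; the signatures, the dict-based batch/ctx, and the `ToyStage` subclass are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Tuple

Batch = Dict[str, Any]
Ctx = Dict[str, Any]


class OuterStage(ABC):
    """Owns the trunk (inner_stage) and wraps it with encode/decode."""

    def __init__(self, inner_stage: Callable[[Any], Any]) -> None:
        self.inner_stage = inner_stage

    @abstractmethod
    def encode(self, batch: Batch, ctx: Ctx) -> Any:
        """Raw batch -> trunk input; may record state on ctx."""

    @abstractmethod
    def decode(self, trunk_out: Any, batch: Batch, ctx: Ctx) -> Batch:
        """Trunk output -> prediction keys written onto batch."""

    def __call__(self, batch: Batch, ctx: Ctx) -> Batch:
        return self.decode(self.inner_stage(self.encode(batch, ctx)), batch, ctx)


class Loss(ABC):
    @abstractmethod
    def __call__(self, batch: Batch, ctx: Ctx) -> float:
        """Reduce predictions on batch (and state on ctx) to a scalar."""


class MSELoss(Loss):
    def __init__(self, pred_key: str, target_key: str) -> None:
        self.pred_key, self.target_key = pred_key, target_key

    def __call__(self, batch: Batch, ctx: Ctx) -> float:
        pred, target = batch[self.pred_key], batch[self.target_key]
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class CompositeLoss(Loss):
    """Weighted sum of terms; the (weight, term) list could come from config."""

    def __init__(self, terms: List[Tuple[float, Loss]]) -> None:
        self.terms = terms

    def __call__(self, batch: Batch, ctx: Ctx) -> float:
        return sum(weight * term(batch, ctx) for weight, term in self.terms)


class ToyStage(OuterStage):  # hypothetical subclass, for illustration only
    def encode(self, batch: Batch, ctx: Ctx) -> Any:
        ctx["n_tokens"] = len(batch["actions"])
        return batch["actions"]

    def decode(self, trunk_out: Any, batch: Batch, ctx: Ctx) -> Batch:
        batch["pred_action"] = trunk_out
        return batch


stage = ToyStage(inner_stage=lambda xs: [2.0 * x for x in xs])
batch, ctx = {"actions": [1.0, 3.0]}, {}
stage(batch, ctx)
total = CompositeLoss([(0.5, MSELoss("pred_action", "actions"))])(batch, ctx)
```

The point of the split is that swapping the loss becomes a matter of changing the `(weight, term)` list rather than subclassing the algorithm.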

Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss classes:
- q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise, alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond the backbone consumes. No backbone call inside.
- compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the dict from q_sample plus the backbone's v_pred and computes the SNR-weighted epsilon-MSE.
- forward(backbone, x, t, cond): kept as a back-compat wrapper that calls q_sample, runs the backbone, then compute_loss. Existing callers (DFoT.forward_training) are unaffected.

Bitwise verified equivalent to the prior single-method path via the included /tmp/test_diffusion_split.py smoke (loss + x_pred match exactly, same random seed). discrete_diffusion.py is unsplit for now; current configs use continuous.
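The three-way split can be sketched in pure Python as below. This is an illustration under stated assumptions, not the repository's diffusion module: the cosine noise schedule, the default precond_scale of 1.0, the uniform (rather than SNR-based) loss weight, and the omission of the cond argument are all simplifications chosen here. The v-to-epsilon conversion relies on alpha_t^2 + sigma_t^2 = 1, which holds for the assumed schedule.

```python
import math
import random
from typing import Callable, Dict, List


def q_sample(x: List[float], t: float, rng: random.Random,
             precond_scale: float = 1.0) -> Dict[str, object]:
    """Forward-noising step only; no backbone call happens here. Needs 0 < t < 1."""
    alpha_t = math.cos(0.5 * math.pi * t)  # assumed cosine schedule
    sigma_t = math.sin(0.5 * math.pi * t)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    x_t = [alpha_t * xi + sigma_t * ni for xi, ni in zip(x, noise)]
    logsnr = 2.0 * math.log(alpha_t / sigma_t)
    return {"x_t": x_t, "noise": noise, "alpha_t": alpha_t, "sigma_t": sigma_t,
            "logsnr": logsnr, "time_cond": precond_scale * logsnr}


def compute_loss(v_pred: List[float], q_state: Dict[str, object]) -> float:
    """Epsilon-MSE from a v-prediction; uniform weight stands in for the SNR weight."""
    a, s = q_state["alpha_t"], q_state["sigma_t"]
    # With alpha^2 + sigma^2 = 1: eps_hat = alpha * v + sigma * x_t.
    eps_hat = [a * v + s * xt for v, xt in zip(v_pred, q_state["x_t"])]
    sq = [(e - n) ** 2 for e, n in zip(eps_hat, q_state["noise"])]
    return sum(sq) / len(sq)


def forward(backbone: Callable[[List[float], float], List[float]],
            x: List[float], t: float, rng: random.Random) -> float:
    """Back-compat style wrapper: q_sample -> backbone -> compute_loss."""
    q_state = q_sample(x, t, rng)
    return compute_loss(backbone(q_state["x_t"], q_state["time_cond"]), q_state)


x = [0.5, -1.0]
q_state = q_sample(x, 0.3, random.Random(0))
# An oracle v-target (v = alpha * eps - sigma * x) should give ~zero loss.
v_true = [q_state["alpha_t"] * n - q_state["sigma_t"] * xi
          for n, xi in zip(q_state["noise"], x)]
oracle_loss = compute_loss(v_true, q_state)

# Split path and wrapper agree exactly when both start from the same seed.
zero_backbone = lambda x_t, time_cond: [0.0] * len(x_t)
split_loss = compute_loss(zero_backbone(q_state["x_t"], q_state["time_cond"]), q_state)
wrapped_loss = forward(zero_backbone, x, 0.3, random.Random(0))
```

Because q_sample is the only place that draws noise, the split and wrapped paths consume the RNG identically, which is what makes an exact-equality check like the one described possible.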

…verified) - egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass. encode: encode obs to per-token cond, sample noise levels, run diffusion.q_sample, store q_state + external_cond on ctx, return noisy x_t. decode: write batch[pred_v] for the loss to read. forward override threads cu_seqlens/max_seqlen (packed mode) and time_cond into the backbone call. - egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE), reduces to scalar. Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to 0.0e+00 difference at fixed seed. Real CondEncoderModule + DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math path. Algo class (DFoT.forward_training) is NOT yet wired to use these — that comes in the next commit on this branch. Inference paths (closed-loop AR sample_step, chunk plan-execute) also deferred.

Algo class:
- __init__ now takes outer_stage: DFoTOuterStage and optional loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None).
- Removes legacy cond_encoder, backbone, diffusion_type, diffusion_kwargs, cond_output_key args; they now live on the outer_stage subblock.
- Adds @property accessors for cond_encoder, backbone, diffusion, outer_stage, loss so existing inference paths (_inference_step_ar, _inference_step_chunk, _sample_chunk, forward_eval) keep working unchanged via property forwarding.
- forward_training shrinks ~40 LOC -> ~20 LOC: build ctx, call outer_stage(batch, ctx), call loss(batch, ctx). No more inline diffusion math; no more cu_seqlens threading at this level.
- Adds ar_inference_step_size knob.

dfot_pushshapes.yaml:
- New outer_stage: block wraps cond_encoder + backbone + diffusion.
- Removes top-level diffusion_type / diffusion_kwargs; the diffusion module is now its own _target_ inside outer_stage.
- loss: omitted (uses default DFoTLoss(outer_stage.diffusion)).

End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the config instantiates via hydra and forward_training emits a finite scalar loss. Bitwise loss-equivalence was already shown in the prior commit (test_dfot_outer_stage.py).

Old checkpoints WILL NOT load: state_dict keys moved from nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.* (intentional, per the agreed clean-break refactor).
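The property-forwarding pattern might look roughly like this. `ToyOuterStage` and `ToyDFoT` are hypothetical stand-ins, and mapping `backbone` to `outer_stage.inner_stage` is inferred from the state_dict key move described in the commit rather than read from the code.

```python
class ToyOuterStage:
    """Stand-in for the outer stage: holds the submodules the algo used to own."""

    def __init__(self, cond_encoder, inner_stage, diffusion):
        self.cond_encoder = cond_encoder
        self.inner_stage = inner_stage  # the trunk, formerly called backbone
        self.diffusion = diffusion


class ToyDFoT:
    """Stand-in algo: stores only the outer stage, forwards legacy names."""

    def __init__(self, outer_stage, loss=None):
        # Default loss is built from the outer stage's diffusion module.
        default_loss = ("DFoTLoss", outer_stage.diffusion)
        self.nets = {"outer_stage": outer_stage,
                     "loss": loss if loss is not None else default_loss}

    @property
    def outer_stage(self):
        return self.nets["outer_stage"]

    @property
    def cond_encoder(self):
        return self.outer_stage.cond_encoder

    @property
    def backbone(self):
        return self.outer_stage.inner_stage

    @property
    def diffusion(self):
        return self.outer_stage.diffusion


encoder, trunk, diffusion = object(), object(), object()
algo = ToyDFoT(ToyOuterStage(encoder, trunk, diffusion))
```

Legacy call sites that read `self.backbone` or `self.diffusion` keep working because the properties return the very same objects now stored on the outer stage.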

Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT from dfot_pushshapes.yaml, runs inference_step in both ar and chunk modes, asserts action is (action_dim,) and finite. Verifies the @property accessors (self.backbone, self.cond_encoder, self.diffusion) forward correctly to outer_stage submodules so the closed-loop AR and chunk-mode inference paths keep working after the refactor.

Passing on compute node 8997316:
[ar] action @ t=0: [0.30 0.47]
[ar] action @ t=1: [0.44 1.06]
[chunk] action @ t=0: [-0.78 -1.89]

…ied) - egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits from OuterStage with inner_stage = HNetCore (stage tree). Owns cond_encoder, input_modules (summed per-token contributions), action_out head. Three forward paths inherited from the old HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed), generate (offline AR), init_step_state + step (online single-tick). - egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] + batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux. - scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates the existing hnet_pushshapes.yaml subcomponents, wraps the SAME instances in HNetPolicy and HNetOuterStage, ties their action_out heads, runs identical padded forward. Verified bitwise (max diff 0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests the step inference path (shape + finite). The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet removed. Algo-class refactor and yaml updates come in follow-up commits on this branch. Old checkpoints will NOT load — state_dict keys move from policy.* to outer_stage.* (or wherever the algo class places it). Per clean- break policy.

Algo class:
- HNet.__init__ now takes outer_stage: HNetOuterStage + loss: Optional[Loss] instead of cond_encoder + hnet + action_dim + action_horizon + d_model + action_head_type + input_modules. action_horizon read from outer_stage.
- Loss defaults to HNetLoss() if not provided.
- self.nets is now ModuleDict({outer_stage, loss}); old keys (self.nets[policy], self.nets[cond_encoder], ...) are exposed via @property forwarding to outer_stage submodules so legacy callsites in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step inference keep working.
- forward_training builds (batch, ctx), calls outer_stage(batch, ctx) + loss(batch, ctx), unpacks per-term breakdown (ctx.action_loss, ctx.ratio_loss) into the predictions dict for logging.

HNetLoss:
- Computes action MSE + ratio_loss_from_aux(ctx.aux) and stashes the per-term split on ctx for the algo to log separately.

HNetOuterStage:
- Adds back-compat bridge methods forward_padded(actions, obs) and forward_packed(actions, obs, cu, msl) returning (pred, aux). The forward_eval / _teacher_forced_packed paths use these; .generate / .step / .init_step_state already had matching signatures.

hnet_pushshapes.yaml:
- New outer_stage: block wraps cond_encoder + hnet stage tree + input_modules + action_head_type. Top-level keeps training-recipe knobs (init_weights_range, lr_multipliers, ...) and embodiment wiring.
- loss: block omitted (defaults to HNetLoss()).

scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke. Verified passing on H200 alloc 8989249: produces action_loss 1.51 + ratio_loss 0.032 + chunker stats for a 2-episode packed batch (T=12+20). Padded mode hits a pre-existing torch SDPA error (Explicit attn_mask should not be set when is_causal=True) in train mode; this is in the trunk code, not introduced by the refactor (the production training uses packed mode and never hits the padded-train path).

Old HNetPolicy class is still in algo/hnet.py for now (no longer used by HNet algo); will be removed in a cleanup commit once all stage-based + flat yamls are migrated.

…hema Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied to the variant configs. Each yaml moves cond_encoder + hnet stage tree + (optional) input_modules + action_head_type under outer_stage; keeps training-recipe knobs + embodiment wiring at the top level. Configs migrated: - hnet_pushshapes_big.yaml (d_model 256, 21M params) - hnet_pushshapes_crossattn.yaml (cond_mode: cross_attn) - hnet_pushshapes_mamba_encdec.yaml (M8 encoder/decoder) - hnet_pushshapes_obs_ar.yaml (ObsToken input module) - hnet_pushshapes_obs_ar_large.yaml (ObsToken + d_model 256 + T8) - hnet_pushshapes_recipe.yaml (H-Net paper recipe) scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants) instantiate from hydra config and produce sensible param counts (5.5M baseline up to 42.8M obs_ar_large).

- egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class. Structurally a rename of FlatFusedPolicy with OuterStage inheritance plus an OuterStage forward(batch, ctx) dispatcher delegating to the existing forward_padded / forward_packed. Legacy generate / step / init_step_state preserved verbatim. encode / decode raise NotImplementedError since the interleaved 2T-token flow does not cleanly split along encode -> trunk -> decode. - egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass of HNet, kept as a separate _target_ for the existing flat yamls. All flat-fused behavior moved into FlatFusedOuterStage; HNet.__init__ already tolerates outer_stage.inner_stage=None. - 3 flat yamls migrated to outer_stage schema: hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml, hnet_pushshapes_fused_pusher.yaml. - scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net configs. Verified on H200 alloc 8989249: 10/10 instantiate successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param counts sensible. Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.

…y retained — it is NOT dead) Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ + hydra_configs/ + tests/ + scripts/ (excl __pycache__/external/scratch/logs) IMMEDIATELY before deletion to reconfirm zero LIVE refs. Deleted (proven zero live refs): - egomimic/models/diffusion_policy.py (whole file -> scratch; only self-ref) - egomimic/models/ddim_scheduler.py (whole file -> scratch; live DDIM is diffusion/sampling.ddim_sample) - egomimic/models/hnet/_smoke_stages.py (whole file -> scratch; only its own __main__ self-invoke) - algo/loss.py CompositeLoss + MSELoss (no _target_, no code ctor; live losses are HNetLoss()/DFoTLoss() built in code) - algo/packed_base.py HNet._ar_rollout_packed (no caller; live eval is forward_eval -> _teacher_forced_packed) - algo/packed_outer_stage.py HNetOuterStage.generate (dead; only ref was a docstring. step/init_step_state KEPT — they ARE the live closed-loop path, called at packed_base.py policy.init_step_state/step) - pl_utils/pl_data_utils.py RLDBModule, DualDataModuleWrapper, DataModuleWrapper (deprecated; live wrapper is MultiDataModuleWrapper, ref'd by 21 files) Also dropped now-unused imports (typing.Optional in packed_outer_stage.py; typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml). Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed. SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed HNetPolicy was zero-ref, but that grep excluded scripts/. The required grep over scripts/ found 3 LIVE importers+instantiators: scripts/smoke_packed_training.py (documented live tooling in CLAUDE.md L538) scripts/test_mamba_regression.py (old-vs-new equivalence regression) scripts/test_hnet_outer_stage.py (HNetPolicy-vs-HNetOuterStage equivalence smoke) The grep proof-gate fails for HNetPolicy, so it is retained (its .generate method stays with the class). 
All other 5 targets pass cleanly. Proofs (run on own a40 alloc, repo's symlinked .venv): - import-smoke: `import egomimic.algo, egomimic.models, egomimic.pl_utils` -> IMPORT OK - retained symbols import: Loss/HNetLoss/DFoTLoss, MultiDataModuleWrapper, HNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False) - deleted symbols confirmed gone (CompositeLoss/MSELoss absent; dead model files absent) - py_compile all edited .py: OK - pytest tests/: 128 passed / 8 failed / 4 skipped — IDENTICAL to baseline. The 8 failures are the documented pre-existing set (7x TestAlgoWiring old-HNet-signature: "HNet.__init__() missing 1 required positional argument: 'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr). Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
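The grep proof-gate, and why leaving scripts/ out of the search roots produced a false zero-ref result, can be illustrated with a small stand-alone checker. This is a sketch only: `find_live_refs`, its parameters, and the temporary file tree are hypothetical, and the real gate was a grep over the repository rather than this function.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional, Sequence, Tuple

EXCLUDED_DIRS = {"__pycache__", "external", "scratch", "logs"}


def find_live_refs(
    symbol: str,
    roots: Sequence[Path],
    defined_in: Optional[Path] = None,
) -> List[Tuple[str, int]]:
    """Return (relative path, line number) for each whole-word hit on `symbol`."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits: List[Tuple[str, int]] = []
    for root in roots:
        for path in sorted(root.rglob("*.py")):
            rel = path.relative_to(root)
            # Skip excluded directories and the defining file (self-references).
            if EXCLUDED_DIRS & set(rel.parts) or path == defined_in:
                continue
            for lineno, line in enumerate(path.read_text().splitlines(), 1):
                if pattern.search(line):
                    hits.append((f"{root.name}/{rel.as_posix()}", lineno))
    return hits


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    files = {
        "egomimic/algo/hnet.py": "class HNetPolicy:\n    pass\n",
        "egomimic/scratch/old.py": "from egomimic.algo.hnet import HNetPolicy\n",
        "scripts/smoke.py": "from egomimic.algo.hnet import HNetPolicy\n",
    }
    for name, text in files.items():
        target = repo / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text)
    definition = repo / "egomimic/algo/hnet.py"
    # Narrow search (no scripts/): looks dead.
    narrow = find_live_refs("HNetPolicy", [repo / "egomimic"], definition)
    # Wide search (with scripts/): a live importer shows up, so the gate fails.
    wide = find_live_refs("HNetPolicy", [repo / "egomimic", repo / "scripts"], definition)
```

A symbol is safe to delete only when the wide search returns an empty list; in this toy tree the narrow search is empty but the wide one is not, mirroring the HNetPolicy scope correction.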

… + log_info onto base Algo Three blocks of code were byte-identical (verified via `diff`, rc=0) across the policy algos and are collapsed into a single home on the base `Algo`: * the per-embodiment key-resolution loop (`for emb in self.domains: ...` resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194 == WindowedBC bc.py:699-724. Now `Algo._resolve_embodiment_keys(norm_stats)`; each subclass calls it from __init__. * `_build_obs` -- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255 (WindowedBC already inherited HNet's). Now defined once on `Algo`; the HNet and DFoT overrides are deleted (inherited). * `log_info` -- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425. The base `Algo.log_info` (formerly a NotImplementedError stub) now carries this exact body as the shared default; the HNet/DFoT overrides are deleted. Net -116/+83 lines. Behaviour is preserved by construction: the moved text is identical, so every subclass resolves the same function object from `Algo` (no subclass re-introduces a private copy). bc.py drops its now-unused `get_embodiment_id` import; packed_base/DFoT keep theirs (still used elsewhere). Proofs (run on a40 alloc 3325792, fixed seeds): * DFoT 1-epoch pixel smoke Train/Loss = 0.28878551721572876 -- BIT-IDENTICAL to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter path per the dedup_baseline manifest). * BC SMOKE=1 train: c5 Train/Loss = [1.3453816, 0.1752842]. A clean pre-c5 baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/ sim-eval RNG), and the c5 run lands CLOSER to the manifest values [1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest| =1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor is within the tree's own deterministic-replay band. 
* New permanent guard tests/test_embodiment_key_resolution_shared.py (3 tests): asserts HNet/WindowedBC/DFoT all resolve the SAME `Algo` function objects for _resolve_embodiment_keys / _build_obs / log_info (import-identity) and produce byte-equal key sets + obs selection on a fixed norm_stats fixture. * pytest tests/ = 131 passed / 8 failed / 4 skipped. The 8 failures are the documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature + 1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW failures vs the 128/8/4 profile. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
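The import-identity guard can be sketched without any of the real classes. The class names below mirror the commit message but are empty stand-ins, the method bodies are placeholders, and `private_copies` and `Drifted` are hypothetical names; the only technique shown is checking that each subclass resolves the same function object as the base.

```python
from typing import List, Sequence, Tuple


class Algo:
    def _resolve_embodiment_keys(self, norm_stats):
        return {emb: sorted(keys) for emb, keys in norm_stats.items()}

    def _build_obs(self, batch, keys):
        return {k: batch[k] for k in keys}

    def log_info(self):
        return "shared default"


class HNet(Algo):
    pass


class DFoT(Algo):
    pass


class WindowedBC(HNet):
    pass


class Drifted(Algo):
    """Counter-example: re-introduces a private copy of a shared method."""

    def log_info(self):
        return "private copy"


SHARED = ("_resolve_embodiment_keys", "_build_obs", "log_info")


def private_copies(base: type, subclasses: Sequence[type],
                   names: Sequence[str]) -> List[Tuple[str, str]]:
    """List (class, method) pairs that do not resolve to the base's function."""
    return [(cls.__name__, name)
            for cls in subclasses
            for name in names
            if getattr(cls, name) is not getattr(base, name)]


clean = private_copies(Algo, [HNet, DFoT, WindowedBC], SHARED)
drifted = private_copies(Algo, [HNet, DFoT, WindowedBC, Drifted], SHARED)
```

Looking a method up on the class (not an instance) returns the plain function in Python 3, so an `is` comparison detects any subclass that shadows the shared implementation, even if the shadowing body is textually identical.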

…mg_utils.img_chw_to_uint8 Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] -> (H,W,C) uint8-in-[0,255] converter under three private names: - _img_chw_to_uint8 : video_rollout, pixel_video_rollout, spatial_video_rollout, policy_action - _u8 : bundle_anchored - _to_uint8_hwc : eval/core/eval_vae_recon Four were the 4-line clip->*255->transpose form; two (policy_action, _u8) were the same logic as a 2-liner. All six now delegate to a single canonical egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and each call site renamed. EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout ._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255). PROOFS (a40 job 3325794, fixed seeds): - tests/test_eval_img_utils.py (NEW, permanent gate): test_canonical_matches_every_original_body PASSED np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2). test_touched_eval_modules_import PASSED (import-smoke of all 7 modules) - full tests/: pre-c6 (stashed) = 8 failed / 131 passed / 4 skipped; post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures (7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW failures. Behaviour-preserving: the 6 bodies were already identical. Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…uperset + hoist loss-reducer skeleton onto Algo Two structural near-duplicates removed: 1. Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image only) duplicated the fixed_window/start_to_end/random_subsample branch logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+ action). The action version is a strict superset: its image cropping and cu_seqlens are byte-identical whether or not actions are supplied (the action branch performs zero extra RNG draws). Hoisted the superset onto the parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced _sample_frames_packed to a thin delegate (_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from the PixelObsAction subclass (now inherited). 2. Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment {emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as the default compute_losses and deleted DFoT's byte-identical override. VAE / HNet / HPT / PI keep their own overrides because they are genuine SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded. PROOFS (a40 alloc 3325796, fixed seeds): - BEFORE-state probe: current image-only sampler vs current superset produce torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True, cu_equal=True for fixed_window / start_to_end / random_subsample). - Permanent test tests/test_c7_sampler_reducer_equality.py (3 tests, all pass): * image-only sampler == superset(actions=None) torch.equal across all 3 modes and all episode-length regimes (<n, ==n, >n); * hoisted Algo.compute_losses torch.equal to legacy DFoT.compute_losses on fixed predictions; * guard: HNet ratio_loss reducer is NOT reproduced by the base default (catches a future wrong fold of the superset overrides). 
- DFoT 1-epoch pixel-policy smoke (exercises the rewritten packed sampler, frame_sampling=fixed_window, + the inherited reducer): Train/Loss = Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT- IDENTICAL (all 17 digits) to the pre-collapse baseline. - pytest tests/ = 136 passed / 8 failed / 4 skipped: the 8 failures are the documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new permanent equality tests. No hydra configs reference the touched methods (no config mirror needed). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
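The superset-plus-thin-delegate shape of the sampler can be sketched with flat Python lists standing in for packed tensors. Everything here is illustrative: the function names drop the leading underscore, only a fixed_window-style mode is shown, and taking the whole episode when it is shorter than the window is an assumption. The property being demonstrated is that the action branch makes no extra RNG draws, so image crops and cu_seqlens are the same with or without actions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_windows_packed(
    images: Sequence[str],
    actions: Optional[Sequence[str]],
    cu: Sequence[int],
    n: int,
    rng: random.Random,
) -> Tuple[List[str], Optional[List[str]], List[int]]:
    """Crop one window of up to n frames per episode from a packed sequence."""
    out_img: List[str] = []
    out_act: Optional[List[str]] = [] if actions is not None else None
    new_cu = [0]
    for start, end in zip(cu[:-1], cu[1:]):
        length = end - start
        win = min(n, length)  # short episodes are taken whole (assumption)
        offset = rng.randint(0, length - win)  # exactly one draw per episode
        lo = start + offset
        out_img.extend(images[lo:lo + win])
        if actions is not None:
            out_act.extend(actions[lo:lo + win])  # same indices, no RNG use
        new_cu.append(new_cu[-1] + win)
    return out_img, out_act, new_cu


def sample_frames_packed(images, cu, n, rng):
    """Image-only entry point, reduced to a thin delegate."""
    imgs, _, new_cu = sample_windows_packed(images, None, cu, n, rng)
    return imgs, new_cu


images = [f"img{i}" for i in range(12)]
actions = [f"act{i}" for i in range(12)]
cu = [0, 2, 5, 12]  # episode lengths 2, 3, 7 cover <n, ==n, >n for n=3
img_a, act_a, cu_a = sample_windows_packed(images, actions, cu, 3, random.Random(0))
img_b, cu_b = sample_frames_packed(images, cu, 3, random.Random(0))
```

Seeding both calls identically and comparing the outputs is the same style of check as the equality test described above, just on toy data.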

…eads/stems/diffusion) + utils End state: `ls egomimic/models/*.py` shows ONLY __init__.py. Every move is a git mv (R-status); the 2 splits git-mv the file to its primary home first (history-preserving) then extract the other-role classes into new files in this same commit. All moved class bodies are byte-identical modulo import lines (verified by class-body diff vs HEAD). Every importer + config _target_ updated in this commit; grep confirms zero remaining old-module dotted refs. WHOLE-FILE MOVES (git mv): fm_policy.py -> heads/fm_policy.py (policy output head) denoising_policy.py -> heads/denoising_policy.py (diffusion policy head base) denoising_nets.py -> diffusion/denoising_nets.py (legacy diffusion nets) image_vae.py -> diffusion/image_vae.py (DFoT pixel<->latent codec) preprocess_pi_obs.py-> utils/preprocess_pi_obs.py (data preprocessing, OUT of models/) SPLITS (git mv to primary home + extract): act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv) + cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder) hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet) + cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/ MultiheadAttention/SimpleTransformer) + heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/ MultiBlockTransformerDecoder) DEAD-CODE PRUNE (grep-proven 0 external refs): hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper, T5Encoder, L2Norm (also drops the heavy timm/transformers T5/ViT imports) denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11): - `import egomimic` OK; every touched module imports OK (pi/rollout fail only on pre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines). - py_compile OK for all 10 new/moved files. 
- Hydra compose-check passes for every model config whose _target_ moved. - OLD-ckpt path map appended to scratch/hierarchy_path_map.txt (gates phase folds it into PORT_NOTES). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

In-place class rename (no file moved): the packed-sequence policy Algo base was misleadingly named ``HNet`` while it is actually the shared base for the packed-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree is supplied via ``outer_stage`` and is a separate, correctly-named concept in models/hnet). Renamed the class to ``PackedAlgoBase`` to reflect its role. Changes (all in one commit): - packed_base.py: ``class HNet(Algo)`` -> ``class PackedAlgoBase(Algo)``; docstring updated; kept ``HNet = PackedAlgoBase`` compat alias at module bottom (commented) so OLD ckpts/configs whose resolved _target_ still names ``egomimic.algo.packed_base.HNet`` keep resolving. - bc.py: import + ``class WindowedBC(PackedAlgoBase)`` + base-referring docstring/comments updated. - algo/__init__.py: import-example comment updated. - 7 model configs (hnet_pushshapes*.yaml): _target_ -> egomimic.algo.packed_base.PackedAlgoBase. - scripts/smoke_packed_{training_e2e,validation}.py: import (as HNetAlgo) + Algo-method docstrings updated. - tests/{test_c7_sampler_reducer_equality, test_embodiment_key_resolution_shared, test_training_recipe}.py: import + class refs updated. OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore / HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy (landmine: proven alive), algo/obs_transforms.py (landmine: designed extension point). Verified on A40 alloc: `import egomimic` OK; PackedAlgoBase + HNet-alias identity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet configs compose with new _target_; old dotted path resolves via compat alias. Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring failures (old constructor signature, missing ``outer_stage``) and ZERO new failures vs the 136/8/4 baseline. 
Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the gates phase to fold into PORT_NOTES. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
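How a module-level compat alias keeps an old dotted path resolving can be shown with a throwaway module. The module name `toy_packed_base` and the `get_class` helper are hypothetical (the helper only imitates a dotted-path lookup and is not hydra's implementation); the class names follow the commit message.

```python
import importlib
import sys
import types


class Algo:
    pass


class PackedAlgoBase(Algo):
    """Shared base for the packed-sequence algos (toy stand-in)."""


HNet = PackedAlgoBase  # compat alias: old name, same class object

# Register a throwaway module so both names are reachable by dotted path.
module = types.ModuleType("toy_packed_base")
module.PackedAlgoBase = PackedAlgoBase
module.HNet = HNet
sys.modules["toy_packed_base"] = module


def get_class(dotted: str) -> type:
    """Resolve 'pkg.module.Name' to the object it names."""
    mod_name, _, attr = dotted.rpartition(".")
    return getattr(importlib.import_module(mod_name), attr)


old = get_class("toy_packed_base.HNet")
new = get_class("toy_packed_base.PackedAlgoBase")
```

Note that the alias shares the class's `__name__`, so anything that re-serializes the resolved class will record the new name, while existing references to the old dotted path still resolve.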

…s + dead-file purge Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete grep-proven dead files. One commit per the hierarchy-pass group rule; revertible via tag pre-utils-hier. RENAMES (git mv, R-status): utils/timing_callback.py -> pl_utils/callbacks/timing_callback.py utils/instantiators.py -> pl_utils/instantiators.py utils/logging_utils.py -> pl_utils/logging_utils.py utils/rich_utils.py -> pl_utils/rich_utils.py utils/utils.py -> pl_utils/utils.py utils/tensor_utils.py -> vendored/robomimic_tensor_utils.py (988-line verbatim robomimic vendor; +vendored/README.md provenance note) egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder — constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*, CameraTransforms, download_from_huggingface, STD_SCALE): model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table, reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK drawing fns merged into utils/viz_utils.py (dependency FLIPPED — viz_utils now OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/ ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame, draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper). All 11 moved bodies are byte-identical to originals (verified via AST diff vs HEAD); only import lines differ. model_utils.py placed in models/cores/ (not loose in models/) to keep the models-group gate "models/ has only role dirs + __init__.py" intact. DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/): utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys, 0 refs), egomimic/__init__.pyc, egomimic/keypoints.jpeg. 
Importers + config _target_ updated in this same commit (grep-exhaustive over egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/): callbacks/defaults.yaml wandb_profiler._target_ -> pl_utils.callbacks.timing_callback trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils), pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py (model helpers), pushshapes.py / eval_act.py / robot/rollout.py / data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped. Added pl_utils/__init__.py (was implicit ns pkg) so find_packages discovers it. VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed (all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped — ZERO new failures vs baseline; test_config_compose 25/25; parent config composes and moved callbacks _target_ resolves to the new class. Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the gates phase to fold into PORT_NOTES. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Move (R-status rename, byte-identical modulo 2 fixed import lines): - egomimic/rldb/zarr/zarr_write_test.py -> egomimic/scripts/eva_process/zarr_write_test.py (HDF5->zarr conversion CLI, not a test; already targets eva_process. Fixed two stale imports as part of the move: egomimic.rldb.zarr.ZarrWriter -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty __init__) egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils (file renamed earlier)) Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/): - egomimic/rldb/compression_utils.py av/jpeg video codec, no importers - egomimic/rldb/data_utils.py slerp/ypr quat math, superseded by egomimic.utils.pose_utils - egomimic/rldb/zarr/benchmark_forward_pass.py dead benchmark script - egomimic/rldb/zarr/test_zarr.py broken: imports nonexistent egomimic.rldb.utils.S3RLDBDataset - egomimic/rldb/scripts/ (whole subpackage) nds_pq/str2bool/etc already live in egomimic.utils.egomimicUtils Verified on a40 alloc: `import egomimic` OK, moved module imports OK, deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed (pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed. No config _target_ referenced any touched path. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…s sim-glue -> embodiment Group: eval + pl_utils strays (hierarchy pass). 1) git mv egomimic/pl_utils/test_model_wrapper.py -> tests/test_model_wrapper.py (R-status rename; 94% similar). tests/ has no __init__.py so pytest imports it as top-level module test_model_wrapper; updated the in-test _target_ (egomimic.pl_utils.test_model_wrapper.DummyAlgo -> test_model_wrapper.DummyAlgo) and the two __module__ assertions to match. Behaviour identical: 2 passed + 1 pre-existing fail (lr_scheduler dict-wrap) both before and after the move. 2) SPLIT: extracted the PushShapes sim-eval glue (_env_to_zarr_pushshapes, _ENV_TO_ZARR, _state_to_init) out of egomimic/eval/core/eval_sim.py (the algo-agnostic evaluator, which stays as the primary home of the eval classes) into a NEW embodiment helper module egomimic/rldb/embodiment/pushshapes_sim.py. The 3 symbols are byte-identical (verified via AST extraction). eval_sim.py re-imports them so the legacy names (incl. the facade path egomimic.eval.eval_sim._env_to_zarr_pushshapes / ._state_to_init used by scripts/verify_*) keep resolving to the SAME objects. Importers repointed to the canonical home in this same commit: - egomimic/eval/dfot/eval_dfot_self_rollout.py (_state_to_init) - scripts/verify_normalization.py / verify_normalization2.py / verify_image.py (_env_to_zarr_pushshapes) Verification (a40 alloc): import egomimic OK; pushshapes_sim / eval_sim re-export / eval_dfot_self_rollout / legacy facade eval_sim all import and resolve to identical objects; all 4 eval_sim class _target_s resolve via hydra get_class; verify_* scripts py_compile clean; pytest tests/test_config_compose.py 25/25; pytest tests/ = 138 passed / 9 failed / 4 skipped (baseline 136/8/4 + the now-collected test_model_wrapper 2-pass +1-pre-existing-fail; zero NEW failures, 8 pre-existing TestAlgoWiring + 1 missing-zarr unchanged). 
Per-folder .py counts: pl_utils 8->7, tests 12->13, rldb/embodiment 4->5, eval/core 8->8 (SPLIT: eval_sim.py stays; glue fragment extracted to embodiment). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…lib + ops homes, dead launchers Folder-group of the pact-2 hierarchy pass. Every move is a git-mv R-rename (history preserved); moved code is byte-identical modulo import lines. Moves (all git mv, R-status): - 6 regression smokes scripts/test_*.py -> tests/regression/ . Each got a module-level pytest.skip(allow_module_level=True) guard (the +9 lines) so pytest collection stays clean -- they hardcode an EgoVerse-clone-3 path and load configs removed from this repo + need GPU/checkpoints; run manually. - 7 packed/composite/teacher smokes scripts/smoke_*.py -> tests/regression/ (byte-identical, no test_ prefix so pytest never collects them). - scripts/smoke_sim_eval.py -> egomimic/eval/core/ckpt_loading.py: it is a LIBRARY (load_algo_from_ckpt + _MockTrainer, the CLI main() rides along). Its sibling import rebased to egomimic.eval.core.eval_sim (off the legacy facade). All 5 importers updated in this commit: scripts/{tf_dec_overlay,tf_decoupled_eval,tf_chunk_eval, eval_cfg_latest,eval_fsd_latest}.py - root setup_nvm.sh, run_eva_docker.sh, pull_models.sh -> scripts/ops/ . Deletes (with proof): - scripts/sbatch_train_hnet_fused_{50,80}ep_cosine.sh -- both pass model=hnet_pushshapes_fused, a config removed from hydra_configs/ (survives only quarantined under scratch/flat_fused_quarantine/). Dead launchers. - scripts/__pycache__/ . Docs touched for accuracy (not load-bearing): egomimic/robot/eva/eva.md run_eva_docker.sh path; CLAUDE.md smoke-script paths + section heading. Verify (on a40 alloc 3325804): import egomimic OK; new ckpt_loading path exposes load_algo_from_ckpt + _MockTrainer; all 5 importers parse; pytest tests/ --collect-only stays 150 collected, 0 errors (regression dir = 0 collected / 6 skipped); full tests/ = 138 passed / 9 failed / 10 skipped, identical pass+fail to the pre-change HEAD (the 1 test_model_wrapper failure + 7 TestAlgoWiring + 1 missing-zarr all pre-exist), +6 = the guarded smokes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The hierarchy pass (10b2398) moved test_model_wrapper.py from egomimic/pl_utils/ into tests/, newly subjecting it to `pytest tests/` collection. One assertion was stale: it asserted optimizers["lr_scheduler"] *is* a StepLR, but ModelWrapper.configure_optimizers() returns the Lightning scheduler-config dict {"scheduler": <StepLR>, "interval", "frequency"}. The production contract is correct; the test was written against an older bare-scheduler return. Target the nested ["scheduler"] key. No production behavior changed (debug-the-assertion only). Restores ZERO-new-failures gate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
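The assertion fix described above can be illustrated with a small stand-in. This is a sketch only: `FakeStepLR` and this `configure_optimizers` are hypothetical placeholders, not the real torch scheduler or the repo's ModelWrapper. The three keys `scheduler` / `interval` / `frequency` come from the commit message; the `optimizer` key and the concrete values are assumptions.

```python
class FakeStepLR:
    """Hypothetical stand-in for a StepLR scheduler instance."""


def configure_optimizers():
    # Shape described in the commit: "lr_scheduler" maps to a config dict,
    # not to the bare scheduler. Values are illustrative assumptions.
    return {
        "optimizer": object(),
        "lr_scheduler": {
            "scheduler": FakeStepLR(),
            "interval": "epoch",
            "frequency": 1,
        },
    }


optimizers = configure_optimizers()

# Stale check: treats the config dict itself as the scheduler.
stale_ok = isinstance(optimizers["lr_scheduler"], FakeStepLR)

# Fixed check: target the nested "scheduler" key.
fixed_ok = isinstance(optimizers["lr_scheduler"]["scheduler"], FakeStepLR)
```

With the dict-shaped return, only the nested lookup finds the scheduler, which is why the test rather than the production code was changed.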

…ap, deep-clean log

DESIGN.md §9: hierarchy contract -- role-dirs (cores/heads/stems) vs subsystem-dirs (hnet/diffusion) asymmetry is intentional; root scripts/ (launchers, run-as-script) vs egomimic/scripts/ (importable data CLIs) split; final egomimic/models/ tree (zero loose .py).

PORT_NOTES.md: hierarchy-pass record (6 group commits + tags + per-group R-rename counts), the gate-fix for the newly-collected stale test_model_wrapper assertion, final-gate results, and the FULL old->new OLD-ckpt _target_ path map. Also added the deep-clean dead-code-purge record (collapses c4-c7) that the gate-check found missing from the dedup-gate section (which only covered c1-c3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…oc-gap fix)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…-outer-stage)

COMBINE A. Unify the three DFoT video-rollout evaluators into ONE family-agnostic DFoTVideoRolloutEval. Each outer stage now owns a ``rollout_video_episode`` hook (the family-specific sampler + decode); the eval owns the family-INVARIANT skeleton.

Decode-on-outer-stage. The family-specific code is MOVED byte-for-byte from the evaluators INTO the outer stages:
* ObsActionImageDFoTOuterStage: unconditional chunk/AR bundle sampler, slice the flat VAE-latent portion, frozen-VAE decode. Single-panel (t=0 GT prepended). Metric prefix "video".
* ImageSpatialDFoTOuterStage: per-step (state,action) cond, conditional chunk/AR spatial-latent sampler (optional GT-context anchor), frozen-VAE decode. Side-by-side [GT|pred]. Prefix "spatial".
* PixelSpatialDFoTOuterStage: sliding-window pixel rollout anchored on the first GT frame(s), no VAE, + PSNR/SSIM/LPIPS. Side-by-side. Prefix "pixel".

The unified eval dispatches on three class attrs the outer stage advertises (video_metric_prefix / video_panel / video_has_extra_metrics) and owns: packed/padded episode indexing, per-step recon-MSE accumulation, panel assembly, perceptual-metric averaging, mp4 emission.

eval_dfot_spatial_video_rollout.py + eval_dfot_pixel_video_rollout.py deleted (scratch backup kept). DFoTSpatialVideoRolloutEval / DFoTPixelVideoRolloutEval kept as compat aliases of the unified class -- the __init__ is a true superset and every old config now passes its knobs explicitly, so the aliases are pure _target_ redirects. _target_ mirrored SAME commit in the live configs that named the old classes (eval_dfot_image_spatial: +explicit n_context_frames=0; eval_dfot_pixel: +explicit n_context_frames=1, rollout_window=9 -- all were the old class defaults). eval_dfot_obs_action_image already named the kept class. The eval/__init__ _MODULE_HOMES legacy-import names + dfot/__init__ re-exports now point at the unified module.

PROOF (per the universal bar, all 3 families). Fixed-seed eval OUTPUT equality old-vs-new in separate processes: the metrics dict (torch.equal, incl. PSNR/SSIM/LPIPS) AND the rendered frame tensors (torch.equal) on a random-weight algo + one real packed batch. PASS for image_spatial, obs_action_image (BOTH chunk + AR sub-evals), and pixel -- bit-identical (A40 deterministic; verified by an independent rerun).

tests/ back to the 8-failed baseline (the 9th, test_touched_eval_modules_import, updated for the collapse), all 20 evaluator yamls compose+resolve. Net -222 LOC (940 -> 269 eval LOC; family code now lives on the stages).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
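A toy sketch of the decode-on-outer-stage split described in this commit: the stage owns the rollout hook and advertises three class attributes, and one evaluator dispatches on them. The attribute and hook names come from the commit message; everything else (`ToyPixelStage`, `run_unified_eval`, the string values of `video_panel`, the metric key format) is a hypothetical illustration, not the repo's DFoTVideoRolloutEval.

```python
class OuterStageBase:
    # Class attributes the unified eval dispatches on (names from the commit).
    video_metric_prefix = "video"
    video_panel = "single"            # assumed encoding: "single" or "side_by_side"
    video_has_extra_metrics = False

    def rollout_video_episode(self, episode):
        raise NotImplementedError


class ToyPixelStage(OuterStageBase):
    video_metric_prefix = "pixel"
    video_panel = "side_by_side"
    video_has_extra_metrics = True

    def rollout_video_episode(self, episode):
        # Placeholder "prediction": echo the ground-truth frames.
        return list(episode)


def run_unified_eval(stage, episode):
    """Family-invariant skeleton: metrics + panel assembly."""
    pred = stage.rollout_video_episode(episode)   # family-specific hook
    mse = sum((p - g) ** 2 for p, g in zip(pred, episode)) / len(episode)
    prefix = stage.video_metric_prefix
    metrics = {f"{prefix}/recon_mse": mse}
    if stage.video_has_extra_metrics:
        # Placeholder slot where PSNR/SSIM/LPIPS averages would go.
        metrics[f"{prefix}/extra"] = 0.0
    if stage.video_panel == "side_by_side":
        panel = list(zip(episode, pred))          # [GT | pred] pairs
    else:
        panel = pred
    return metrics, panel


metrics, panel = run_unified_eval(ToyPixelStage(), [1.0, 2.0])
```

Because the evaluator reads only the three attributes and calls one hook, adding a new family in this sketch means adding a stage class, not another evaluator.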

…b base

Three structural dedups on the DFoT evaluators (egomimic/eval/dfot/), all behaviour-preserving (proven by fixed-seed torch.equal old-vs-new + textual identity of the moved method bodies):

1. POLICY PAIR MERGE. git mv eval_dfot_policy_action.py -> eval_dfot_policy.py and fold DFoTPolicyRecedingHorizonEval (which subclasses DFoTPolicyActionEval and reuses _rollout/_ddim_from_v verbatim) into the same module; delete eval_dfot_policy_receding_horizon.py. _ddim_from_v + _rollout bodies are byte-identical to the pre-combineB sources; the RH compute_metrics_and_viz body is verbatim. Mirrors _target_ in the two policy evaluator configs and the eval/__init__ legacy-import facade + dfot/__init__ exports.

2. SHARED ANCHORED-DDIM HELPER. New eval/dfot/_sampling.py::anchored_ddim_rollout factors the single-tensor sched[:,:n]=clean anchored loop shared by DFoTBundleAnchoredEval._rollout (1D bundle) and ImageSpatialDFoTOuterStage._rollout_latent (5D spatial latent). Both call sites adopt it with shape adapters; proven torch.equal to the verbatim inline loop for both the 1D and 5D shapes. The 2D-policy dual-stream _rollout (co-denoises x_lat + x_act through a dual-output backbone with a hand-rolled v-pred step) is genuinely different and is left duplicated (documented in the helper docstring).

3. KNOB BASE. New eval/dfot/_base.py::DFoTVideoEvalMixin hoists the common knob storage (embodiment_name, image_key, video_subdir/_video_subdir, recon_loss_n_frames, upscale_to, n_chunk_steps) + the video_dir() override shared by bundle_anchored / policy / video_rollout. Each __init__ keeps its own per-class defaults and passes them through store_dfot_knobs explicitly; resolved attributes verified identical to the per-class expected defaults.

Proof: scratch/proof_combineB.py (removed post-proof) -- 14/14 PASS, all torch.equal with maxdiff 0.00e+00 on A40. Full tests/ = 139 passed / 8 failed (pre-existing, unrelated: test_packed_pipeline + test_training_recipe wiring) / 10 skipped -- zero new failures. All three touched evaluator configs compose.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
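The anchored loop from item 2 can be sketched in plain Python. This is a deliberately simplified, hypothetical version: it works on a flat list instead of a batched tensor, starts from zeros instead of sampled noise, and takes the denoising update as a caller-supplied placeholder rather than a real DDIM step. The function name and signature are assumptions and do not mirror `anchored_ddim_rollout`; only the "re-pin the first n positions to the clean values" idea is taken from the commit.

```python
def toy_anchored_rollout(clean, n_anchor, total_len, n_steps, denoise_step):
    """Iteratively denoise a sequence while pinning [:n_anchor] to `clean`."""
    # Assumption: start from zeros; a real sampler would start from noise.
    sched = [0.0] * total_len
    for step in range(n_steps):
        # The anchor: overwrite the context positions with clean values
        # before every update (the sched[:, :n] = clean pattern).
        sched[:n_anchor] = clean[:n_anchor]
        sched = denoise_step(sched, step)
    # Assumption: re-pin once more so the returned context is exactly clean.
    sched[:n_anchor] = clean[:n_anchor]
    return sched


# Placeholder update: shift every position by +1 per step.
out = toy_anchored_rollout(
    clean=[5.0, 6.0],
    n_anchor=2,
    total_len=4,
    n_steps=3,
    denoise_step=lambda x, step: [v + 1.0 for v in x],
)
```

The anchored positions come back unchanged while the free positions accumulate the three updates, which is the property a shape adapter around a shared helper would need to preserve.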

Records the dedup-campaign DFoT-evaluator combine (combines A c73a554 + B 7fde862): rollout trio -> 1 family-agnostic DFoTVideoRolloutEval (decode-on-outer-stage), policy pair merge (eval_dfot_policy, RH subclasses Action, shared _rollout), shared anchored-DDIM helper (_sampling), knob/path mixin (_base). self_rollout untouched (genuinely-different uint8 variant). eval/dfot/ 7-eval-file set: 1784 -> 1278 lines (-506, -28%); 4 per-family modules deleted, 2 reusable helpers added; class count consolidated. Per-file before/after table + the full _target_ map (live + dead-on-disk configs, all resolve via __init__ compat aliases) recorded in PORT_NOTES.

Final gates (alloc 3326107, a40 megabot, pact-2 symlinked .venv):
- pytest tests/: 139 passed / 8 failed / 10 skipped (8 = pre-existing; ZERO new)
- compose sweep: TOTAL_PASS=107 / TOTAL_FAIL=2 (2 = pre-broken viz/pi_cartesian_lang*, NOT DFoT); all 11 DFoT evaluator yamls compose
- real eval forward: evaluator=eval_dfot_image_spatial + eval_dfot_pixel each built a REAL DFoT algo (random weights, fixed seed) and ran one compute_metrics_and_viz end-to-end through the unified eval + outer-stage decode hook -> finite metrics (11 / 13), mp4 written (599954 / 288794 B)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

HPTEvalVideo and PIEvalVideo each carried a byte-for-byte equivalent "apply the revert transform once, then reuse it for both the cam-frame paired/final MSE and the viz video" block (HPT named the prediction key main_pred_key, PI named it pred_key -- same f"{embodiment}_{ac_key}" value). Hoist it into one helper, eval/core/_viz_shared.py cam_frame_mse_and_viz_batches(...), and have both evaluators delegate. The per-evaluator-specific metrics (HPT's Frechet / Reverse-KL / aux+shared heads, PI's native-frame MSE) stay in place; only the genuinely-common cam block moves. Net -38 lines across the two zoo files.

Proven output-identical: for each evaluator, built from its composed evaluator config (real viz_func, hydra-composed) with a stub algo + fixed-seed batch and a deterministic injected transform, compute_metrics_and_viz run with the pre-combineC class vs the refactored class produces torch.equal cam_paired_mse_avg / cam_final_mse_avg (both 0.5509150624) and torch.equal preds_for_viz / gt_batch_viz tensors, with identical metric-key sets. tests/test_pi.py + tests/test_config_compose.py: 25 passed, 1 skipped (zero new failures).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
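The "transform once, reuse for metrics and viz" pattern hoisted here might look roughly like the sketch below. It uses scalars in place of tensors; `toy_cam_frame_mse_and_viz` and its signature are hypothetical, and defining the "final" MSE as the error at the last step is an assumption. Only the two metric names are taken from the commit message.

```python
def toy_cam_frame_mse_and_viz(preds, gt, revert_transform):
    """Apply the revert transform once, then reuse it for metrics and viz."""
    # One transform call per element; nothing is re-transformed later.
    preds_cam = [revert_transform(p) for p in preds]
    gt_cam = [revert_transform(g) for g in gt]

    sq_err = [(p - g) ** 2 for p, g in zip(preds_cam, gt_cam)]
    metrics = {
        "cam_paired_mse_avg": sum(sq_err) / len(sq_err),
        # Assumption: "final" means the error at the last step.
        "cam_final_mse_avg": sq_err[-1],
    }
    # The same transformed values are handed to the viz path.
    return metrics, preds_cam, gt_cam


calls = []


def counting_transform(x):
    # Deterministic injected transform that also records each call.
    calls.append(x)
    return x * 2.0


metrics, preds_for_viz, gt_for_viz = toy_cam_frame_mse_and_viz(
    [1.0, 2.0], [1.0, 3.0], counting_transform
)
```

Counting the transform calls is one cheap way to check that a refactor like this did not quietly reintroduce a second application of the transform.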

… peers

The zoo/ grouping was wrong: HPT and PI are actively-developed algorithms benchmarked against the H-Net line, not a pen of frozen third-party baselines. They are first-class peers of bc. Dissolve algo/zoo/ and the mirrored eval/zoo/ into per-algo folders; algo/ and eval/ mirror each other.

Moves (git mv, R-status renames):
- egomimic/algo/zoo/hpt.py -> egomimic/algo/hpt/hpt.py
- egomimic/algo/zoo/pi.py -> egomimic/algo/pi/pi.py
- egomimic/algo/zoo/act.py -> egomimic/algo/act/act.py
- egomimic/eval/zoo/eval_hpt.py -> egomimic/eval/hpt/eval_hpt.py
- egomimic/eval/zoo/eval_pi.py -> egomimic/eval/pi/eval_pi.py
- egomimic/eval/zoo/eval_act.py -> egomimic/eval/act/eval_act.py

Each new folder gets an __init__.py re-exporting its public class (egomimic.algo.hpt.HPT/HPTModel, egomimic.algo.pi.PI, egomimic.algo.act.ACT/ACTModel; egomimic.eval.{hpt,pi,act}.<...>EvalVideo). The two zoo/__init__.py are git rm'd; the algo/zoo PEP-562 lazy-PI shim is gone but PI laziness is preserved (top-level egomimic.algo never imports egomimic.algo.pi eagerly).

Mirrored in the SAME commit: 16 yaml _target_s (12 hpt model configs, act.yaml, pi0.5_base.yaml, eval_hpt.yaml, eval_pi.yaml) + every importer (eval/__init__.py _MODULE_HOMES facade, algo/__init__.py doc comments, tests/test_pi.py).

Principle (DESIGN.md §9.4): all algorithms are first-class peers under algo/, each its own folder when it may grow; bc stays a single flat module-home (left untouched here); eval mirrors algo; the shared spine (algo.py/PackedAlgoBase, packed_base, loss, outer_stage, obs_transforms, packed_outer_stage) stays flat at algo/ top -- not zoo, not bc-specific. Shared eval helper eval/core/_viz_shared.cam_frame_mse_and_viz_batches stays in eval/core/ (import path unchanged).

Old-ckpt _target_ resolution: scratch/hierarchy_path_map.txt (ZOO DISSOLVE block) + PORT_NOTES.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
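For old checkpoints whose saved configs still name the zoo paths, a remap along these lines is one way the old-to-new `_target_` resolution could be applied. This is a hedged sketch: the module pairs are derived from the file moves listed in this commit, but `remap_target` and `remap_config` are hypothetical helpers, and the actual map recorded in scratch/hierarchy_path_map.txt may contain more entries or a different format.

```python
# Old module path -> new module path, derived from the git mv list above.
OLD_TO_NEW_MODULES = {
    "egomimic.algo.zoo.hpt": "egomimic.algo.hpt.hpt",
    "egomimic.algo.zoo.pi": "egomimic.algo.pi.pi",
    "egomimic.algo.zoo.act": "egomimic.algo.act.act",
    "egomimic.eval.zoo.eval_hpt": "egomimic.eval.hpt.eval_hpt",
    "egomimic.eval.zoo.eval_pi": "egomimic.eval.pi.eval_pi",
    "egomimic.eval.zoo.eval_act": "egomimic.eval.act.eval_act",
}


def remap_target(target):
    """Rewrite 'old.module.Class' to 'new.module.Class' when the module moved."""
    module, _, attr = target.rpartition(".")
    if not module:
        return target  # no dotted module part; nothing to remap
    return f"{OLD_TO_NEW_MODULES.get(module, module)}.{attr}"


def remap_config(node):
    """Recursively rewrite every string-valued '_target_' in a nested config."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "_target_" and isinstance(value, str):
                out[key] = remap_target(value)
            else:
                out[key] = remap_config(value)
        return out
    if isinstance(node, list):
        return [remap_config(item) for item in node]
    return node


old_cfg = {
    "_target_": "egomimic.eval.zoo.eval_pi.PIEvalVideo",
    "children": [{"_target_": "egomimic.algo.zoo.act.ACT"}],
}
new_cfg = remap_config(old_cfg)
```

Unmapped targets pass through unchanged, so the same function can be run over configs that mix moved and unmoved classes.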

… -> */algo.py, drop loss/packed_base/outer_stage shims) — safety snapshot before cotrain+eval port

…ndEncoder + inner_working_dim polymorphism

- models/hnet/per_embodiment_stage.py: per-emb outer stages, shared inner trunk; dispatch by ctx.embodiment_id (ported from elmo/hnet-cotrain-circle-stick, import remapped to models.hnet)
- models/hnet/stages.py: inner_working_dim property (base=input_dim, ChunkerStage=output_dim) for polymorphic dim-handoff
- models/hnet/hnet.py: replace isinstance(ChunkerStage) dim-check with stages[i].inner_working_dim (-9 lines)
- models/hnet/context.py: add embodiment_id field
- models/stems/cond_encoders.py: MultiEmbodimentCondEncoder + ignored embodiment_id kwarg on CondEncoderModule.encode
- algo/hnet/algo.py: thread embodiment_id + domain_by_id reverse map through HNetOuterStage/PackedAlgoBase forward paths

Gate: hnet regression 57 passed/2 skipped; PerEmbodimentStage construct+dispatch+guard smoke green.
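The `inner_working_dim` change replaces a type check with a property each stage overrides. A minimal sketch, assuming stages carry `input_dim` and `output_dim` attributes (the class bodies and `handoff_dims` are hypothetical; only the property name and the base/ChunkerStage rule come from the commit):

```python
class Stage:
    def __init__(self, input_dim, output_dim):
        self.input_dim = input_dim
        self.output_dim = output_dim

    @property
    def inner_working_dim(self):
        # Base rule from the commit: inner dim equals the input dim.
        return self.input_dim


class ChunkerStage(Stage):
    @property
    def inner_working_dim(self):
        # Chunker rule from the commit: inner dim equals the output dim.
        return self.output_dim


def handoff_dims(stages):
    # No isinstance(ChunkerStage) branch: each stage answers for itself.
    return [stage.inner_working_dim for stage in stages]


dims = handoff_dims([Stage(64, 128), ChunkerStage(64, 128)])
```

A new stage type with its own hand-off rule then only needs to override the property; the assembly code stays untouched.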

…override resolver

- embodiment.py: PUSHSHAPES_SIM_STICK = 16 (resolves by name; no per-emb handler needed)
- zarr_dataset_multi.py: LocalEpisodeResolverWithEmbodimentOverride -- re-tags shared-metadata zarrs to a config-supplied embodiment so circle/stick dispatch apart
- schedulers.py.piecewise_linear already present in pact-2 (skipped)

Gate: stick id=16, circle id=15, resolver imports, viz_gt_preds present.

… skynet paths)

- model/hnet_pushshapes_cotrain.yaml: PackedAlgoBase + HNetOuterStage with MultiEmbodimentCondEncoder (per-emb circle/stick) and 2-level PerEmbodimentStage chain (per-emb EncDec+Chunker r=8 outer, shared EncDec+Chunker r=4+Compute inner). _target_ paths remapped to models.hnet/models.stems; algo.hnet.HNet alias preserved. No recipe knobs (single LR, source-faithful); comment points to warmup_cosine for fresh launches.
- data/tsimulation_cotrain.yaml: circle_750 + stick_312 on skynet; stick via LocalEpisodeResolverWithEmbodimentOverride. chunking=sequential (stick has a 1068-frame episode > max_seq_len=1024).

Gate: full trainHydra smoke on L40S -- compose + load both datasets + norm_stats + construct 15.6M-param model + 2-step forward_training, clean exit.

… cross-emb composite

- probes/eval_boundary_strip.py: horizontal multi-stage render (Stage k labels, time on x-axis, per-emb keys, frame-level upsample); threads embodiment_id via domain_by_id
- probes/eval_pca_tokens.py: embodiment threading + n_extra_train_episodes train mix-in
- core/eval_composite.py: EvalVideoList below_indices (vstack boundary strip below traj+PCA row)
- core/eval_combined_rows.py (new): CombinedRowsEval -- vstack per-emb composites into one 2-row video
- evaluator/eval_hnet_pairs.yaml + _combined.yaml: wire the cross-emb composite (eval_hnet/pca/boundary), paths remapped to core/probes
- eval_hnet.py unchanged (already had per-emb viz_func)

Gate: mode=eval on cotrain model rendered per-emb composites (traj + PCA + horizontal Stage0/Stage1 boundary strip below), verified visually.

…ndowedBC mimic, full-episode TF)

Mimics WindowedBC on the HNet algo with FULL-EPISODE teacher forcing:
- algo/hnet/algo.py: HNetOuterStage action_head_type=gmm (pre-instantiated GMMActionHead); decode() stashes head on ctx.extras. New GMMLoss (peer of HNetLoss): builds per-obs-step 8-action chunked targets (repeat-pad at episode ends, packed+padded) and computes GMM-NLL via head.nll + ratio loss.
- embodiment.py: PUSHSHAPES_SIM_SMALL_CIRCLE = 17.
- model/hnet_cotrain_gmm_obs.yaml: ObsToken (ResNet VisualCore + proprio) obs-as-input, no-AdaLN (cond_key=null), per-emb outer ChunkerStage -> shared inner ChunkerStage -> ComputeStage, GMM head chunk_len=8, GMMLoss, cotrain circle+small_circle.
- data/tsimulation_cotrain_small_circle.yaml: full-episode TF (chunking=none); small_circle PLACEHOLDER at circle path via embodiment_override.

Gate: trainHydra smoke on L40S -- construct 15.4M-param model + 2-step forward_training with GMMLoss, clean exit.
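The per-obs-step chunked targets with repeat-padding can be sketched without tensors. This is an illustrative version for a single episode; `chunked_targets` is a hypothetical helper, and the packed/padded batching that the real GMMLoss handles is left out.

```python
def chunked_targets(actions, chunk_len=8):
    """For each step t, return actions[t:t+chunk_len], repeat-padded at the end."""
    if not actions:
        return []
    last = actions[-1]
    targets = []
    for t in range(len(actions)):
        chunk = list(actions[t:t + chunk_len])
        # Near the episode end the slice runs short: repeat the final action.
        chunk += [last] * (chunk_len - len(chunk))
        targets.append(chunk)
    return targets


# Small chunk_len so the padding is easy to see.
targets = chunked_targets([0, 1, 2], chunk_len=2)
```

Every step gets a full-length target, so the head can be trained on all observation steps without masking the tail of each episode.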

…it by filter)

Found the small-circle data: /coc/flash7/paphiwetsa3/datasets/circle_co_big_small holds 953 big-circle + 955 small-circle episodes, BOTH tagged embodiment=pushshapes_sim, distinguished only by task_description.env_args.pusher_shape (circle vs circle_small). Split by episode-name DatasetFilter:
- pushshapes_sim: folder=circle_co_big_small, filter "circle_small not in episode_hash" (905 big)
- pushshapes_sim_small_circle: same folder, filter "circle_small in episode_hash" + embodiment_override (955 small)

No more placeholder. max_seq_len 1024->2560 (longest small episode = 2255 frames; full-episode TF), batch_size 1 (long episodes + ResNet), model action_horizon 1024->2560.

Gate: trainHydra smoke on L40S -- both filtered splits load, norm_stats, 15.4M-param construct, 2-step forward_training with GMMLoss, clean exit.
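The episode-name split amounts to a substring partition plus an embodiment re-tag. A minimal sketch, assuming episode hashes are plain strings that contain `circle_small` for the small-pusher episodes; `split_by_pusher` and the example hash strings are hypothetical, and the repo's DatasetFilter is not reproduced here.

```python
def split_by_pusher(episode_hashes):
    """Partition one shared folder into two embodiment-tagged splits."""
    small = [h for h in episode_hashes if "circle_small" in h]
    big = [h for h in episode_hashes if "circle_small" not in h]
    return {
        "pushshapes_sim": big,
        # Re-tagged split (the embodiment_override case in the commit).
        "pushshapes_sim_small_circle": small,
    }


# Hypothetical episode names, for illustration only.
splits = split_by_pusher(
    ["ep_circle_000", "ep_circle_small_000", "ep_circle_001"]
)
```

Note that the substring `circle` alone matches both kinds, which is why both filters key on `circle_small` and its negation.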

Make val-overlay + closed-loop sim work with the chunked GMM head:
- HNetOuterStage._decode_pred_actions: eval bridges (forward_padded/packed) decode GMM params -> sample (low-noise) -> chunk pos-0 point action per frame for the teacher-forced overlay.
- inference_step: open-loop action-chunk buffer (mirrors WindowedBCPolicy.step queue) -- on an obs-step decode the chunk_len actions, dispense one/frame with NO model call between, re-observe every chunk_len frames (obs_stride==chunk_len). Non-GMM unchanged (one action/frame).
- HNetOuterStage.step: thread embodiment_id onto the per-step ctx so PerEmbodimentStage dispatches in the AR cache.
- PerEmbodimentStage._allocate: pact-2 AR-cache scheme (per-emb sub-stage cache; only active emb advanced).

Gate (untrained, L40S): val overlay renders both embs + actions_paired_mse computed; sim eval rolls out both embs (coverage 0.0 untrained) -- both paths clean.
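The open-loop action-chunk buffer in `inference_step` follows a familiar queue pattern. The sketch below is a hypothetical stand-alone version: `ChunkedPolicy` and its `decode_chunk` callable are illustrative names, not the repo's classes, and the model call is replaced by a toy function.

```python
from collections import deque


class ChunkedPolicy:
    """Decode chunk_len actions on an obs-step, then dispense one per frame."""

    def __init__(self, decode_chunk, chunk_len):
        self.decode_chunk = decode_chunk  # obs -> list of chunk_len actions
        self.chunk_len = chunk_len
        self.queue = deque()
        self.model_calls = 0

    def step(self, obs):
        if not self.queue:
            # Obs-step: the only place the model is called.
            actions = self.decode_chunk(obs)
            if len(actions) != self.chunk_len:
                raise ValueError("decode_chunk must return chunk_len actions")
            self.queue.extend(actions)
            self.model_calls += 1
        # Between obs-steps the observation is ignored (open loop).
        return self.queue.popleft()


# Toy decoder: actions encode which observation produced them.
policy = ChunkedPolicy(lambda obs: [obs * 10 + i for i in range(3)], chunk_len=3)
outs = [policy.step(t) for t in range(6)]
```

Six frames trigger only two model calls here, matching the "re-observe every chunk_len frames" behaviour described in the commit.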

Obs-input H-Net GMM cotrain model for the PushShapes circle/small-circle campaign:
- first_token causal chunker with grab_prev_end (concat[first-token, prev-chunk-end] -> MLP) + duplicate upsampler + residual mixer (stages.py, residual_mixer.py)
- per-embodiment GMM heads (gmm_heads dict) + per-emb obs encoder (embodiment_id threaded through ObsToken; MultiEmbodimentCondEncoder d_cond) so only the high-level trunk is shared (algo.py, stems/)
- obs_stride arch (strided token stream + tiled chunk targets) + fp32-matched sim rollout dtype (algo.py)
- ratio-loss scheduler callback + GMM / first-end model+data configs
- scan_interface + circle_small sim-env support

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV

Eval and visualization tooling for the GMM cotrain models:
- boundary-strip / PCA-token probes, keyframe + boundary viz, overlay loader, sim-eval + ckpt-loading updates (eval/core, eval/probes)
- chunkviz explorer: export.py (per-frame chunk ids + cross-emb shared PCA with embodiment-removal variants raw / mean-diff / LDA), build_html.py (self-contained HTML), viewer_template.html (multi-model dropdown + single/compare + PCA-removal toggle), serve / nocache_server
- GMM evaluator configs (circle / cotrain / smallcircle / firstend)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV

ElmoPA · 2026-06-25T03:23:45Z

dual-stream cross-emb H-Net two-trunk: agnostic+specific separate-weight trunks, per-layer channel-aware cross-attn, partitioned GMM head, 2trunk d16/d18 200M-active configs, grad/mode probes + launchers. Built on fab815f3; base classes intact at gmm-cotr #513
tsim: sim env + viz/scripts + physics tuning #512 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

ElmoPA and others added 30 commits May 20, 2026 02:21

tsim: sim env + viz/scripts + physics tuning

c92bba9

Squash of:
- a31d8ab4 tshape sim environment
- ac272e8a Add Tsimulation viz/scripted/stats tools + physics tuning

dataset: tsim training configs + HPT keymap/viz/eval + packed dataloading

e0d3517

Squash of:
- f261ff68 tsim training configs/embodiment
- b38264ab Add pushshapes_sim HPT training: keymap, viz, eval fixes
- 1e174131 Add episode-level packed dataloading

hnet: flexible stage refactor + packed training + viz/infra

fc2f262

Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021

dfot: v1 algo + Isotropic backbone w/ per-token AdaLN + continuous/discrete diffusion + DDPM/DDIM samplers

410dcbf

dfot: fix inference_step obs shaping (don't unsqueeze; drop dead ac_keys fallback)

69e4889

dfot: packed-mode training + eval (cu_seqlens-aware backbone forward, per-frame obs cond)

b38db1a

dfot: causal-AR staircase sampler + schedule-matrix sampler + online rollout helper

68fc270

dfot: PACE sbatch for 80ep packed training on PushShapes circle/basic

e05e77d

dfot: 20x training budget (12800 steps over 80 ep), --time 8h

9a24b2e

dfot: full 750-ep training (200 ep, 32000 steps, tsimulation_full.yaml); drop 400ep config

703c5e7

dfot eval: val-data evaluator with full-chunk + staircase-AR overlay viz

1033559

dfot: relaunch sbatch with eval_dfot_val (teacher-forced viz) + seq_lens defensive fix

48b88a2

dfot refactor: unify sampling, fix AR _ar_pred, AR inference_step, PackedSimEval rename, composite eval

7779868

dfot eval: fix HNetSimEval->PackedSimEval comment reference

eb0dc76

dfot sampling: split sample_step (primitive) from sample (loop wrapper); supports growing-T AR

b708980

dfot: inline AR rollout state into DFoT; delete CausalARRollout class

b2a1578

ElmoPA and others added 28 commits June 7, 2026 04:39

docs: PORT_NOTES hierarchy-pass record + path-map pointer (verifier doc-gap fix)

fe64f81

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

checkpoint: in-flight algo subpackage restructure (act/bc/hnet/hpt/pi -> */algo.py, drop loss/packed_base/outer_stage shims) -- safety snapshot before cotrain+eval port

569ef6b

tsim: sim env + viz/scripts + physics tuning#512
ElmoPA wants to merge 78 commits into main from elmo/gmm-cotrain-eval-viz

ElmoPA commented Jun 25, 2026

1 participant