tsim: sim env + viz/scripts + physics tuning#512

Draft
ElmoPA wants to merge 78 commits into main from elmo/gmm-cotrain-eval-viz

ElmoPA (Contributor) commented Jun 25, 2026

tsim: sim env + viz/scripts + physics tuning

Squash of:

  • a31d8ab4 tshape sim environment
  • ac272e8a Add Tsimulation viz/scripted/stats tools + physics tuning

dataset: tsim training configs + HPT keymap/viz/eval + packed dataloading

Squash of:

  • f261ff68 tsim training configs/embodiment
  • b38264ab Add pushshapes_sim HPT training: keymap, viz, eval fixes
  • 1e174131 Add episode-level packed dataloading

hnet: flexible stage refactor + packed training + viz/infra

Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021

HNet packed training: test suite (86 tests)

  • test_hnet_nets.py (57): routing, chunk, dechunk, isotropic,
    stages (padded + packed), HNet assembly, ratio_loss, chunk_stats,
    STE, RMSNorm, AdaLN.
  • test_packed_pipeline.py (9): normalize broadcast on padded vs
    packed; _iter_leaves descent; multi-frame JPEG decode; end-to-end
    packed stats collection.
  • test_training_recipe.py (20): apply_optimization_params,
    init_weights height-scaled init, apply_lr_multiplier per-stage
    stamping, parameter_groups (default, with bias/norm WD=0,
    per-stage groups, AdamW-consumable). Plus algo wiring tests for
    the opt-in init_weights_range / lr_multipliers /
    use_parameter_groups / weight_decay kwargs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wip: uncommitted edits from EgoVerse7 (active work-in-progress)

Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20.
Includes:

  • algo: input_modules.py + obs_transforms.py (new modules)
  • callbacks: chunker_residual_scheduler, ckpt_chunker(+dropout), random_attn_dropout
  • data configs: tsimulation_400ep, tsimulation_allep + tweaks to existing tsim configs
  • model configs: hnet_pushshapes_mamba_encdec, hnet_pushshapes_obs_ar + tweaks
  • eval/* edits, models/hnet_nets/* edits, schedulers, uv.lock
  • removes scripts/install_cuda_kernels.sh and egomimic/eval/eval_hnet_sim.py

Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).

dfot: v1 algo + Isotropic backbone w/ per-token AdaLN + continuous/discrete diffusion + DDPM/DDIM samplers

dfot: fix inference_step obs shaping (don't unsqueeze; drop dead ac_keys fallback)

dfot: packed-mode training + eval (cu_seqlens-aware backbone forward, per-frame obs cond)

dfot: causal-AR staircase sampler + schedule-matrix sampler + online rollout helper

dfot: PACE sbatch for 80ep packed training on PushShapes circle/basic

dfot: 20x training budget (12800 steps over 80 ep), --time 8h

dfot: full 750-ep training (200 ep, 32000 steps, tsimulation_full.yaml); drop 400ep config

dfot eval: val-data evaluator with full-chunk + staircase-AR overlay viz

dfot: relaunch sbatch with eval_dfot_val (teacher-forced viz) + seq_lens defensive fix

dfot refactor: unify sampling, fix AR _ar_pred, AR inference_step, PackedSimEval rename, composite eval

dfot eval: fix HNetSimEval->PackedSimEval comment reference

dfot sampling: split sample_step (primitive) from sample (loop wrapper); supports growing-T AR

dfot: inline AR rollout state into DFoT; delete CausalARRollout class

dfot algo: ar_inference_step_size knob, CFG plumbing, AdaLN dropouts

  • algo.py: add ar_inference_step_size (sub-steps per env tick at closed-loop AR); thread cfg_scale through _inference_step_ar/_inference_step_chunk; remove unused shape var
  • backbone.py: force_uncond branch for CFG two-pass blending; wire attn/resid dropout into Isotropic trunk
  • sampling.py: _CFGBackbone wrapper for cfg_scale > 1 sampling; schedule-matrix CFG plumbing
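
The two-pass CFG blending mentioned above can be sketched roughly as follows. This is a plain-Python illustration that assumes the standard classifier-free guidance formula (unconditional output plus cfg_scale times the conditional/unconditional difference); `CFGBackboneSketch`, `cfg_blend`, and `toy_backbone` are hypothetical names, and the actual `_CFGBackbone` in sampling.py may differ.

```python
from typing import Callable, List

Vector = List[float]


def cfg_blend(cond_out: Vector, uncond_out: Vector, cfg_scale: float) -> Vector:
    """Standard CFG blend (assumed): uncond + scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond_out, uncond_out)]


class CFGBackboneSketch:
    """Hypothetical wrapper: runs the backbone twice when cfg_scale != 1."""

    def __init__(self, backbone: Callable[..., Vector], cfg_scale: float) -> None:
        self.backbone = backbone
        self.cfg_scale = cfg_scale

    def __call__(self, x: Vector, cond: float) -> Vector:
        cond_out = self.backbone(x, cond, force_uncond=False)
        if self.cfg_scale == 1.0:
            # A scale of 1 reduces to the conditional pass; skip the second call.
            return cond_out
        uncond_out = self.backbone(x, cond, force_uncond=True)
        return cfg_blend(cond_out, uncond_out, self.cfg_scale)


def toy_backbone(x: Vector, cond: float, force_uncond: bool = False) -> Vector:
    """Toy stand-in: conditioning shifts the output unless force_uncond is set."""
    shift = 0.0 if force_uncond else cond
    return [xi + shift for xi in x]


guided = CFGBackboneSketch(toy_backbone, cfg_scale=2.0)([1.0, 2.0], cond=0.5)
plain = CFGBackboneSketch(toy_backbone, cfg_scale=1.0)([1.0, 2.0], cond=0.5)
```

With cfg_scale above 1 the blend extrapolates past the conditional prediction, which is why the wrapper only pays for the second pass in that regime.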

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dfot configs+viz: model scaling, CFG, attn_dropout callback, 512x512 val viz

  • model/dfot_pushshapes.yaml: scale to 67M (d_model=512, T=12, num_heads=8, d_intermediate=2048); attn dropout=0.1, resid dropout=0.1, cond_dropout_prob=0.1; cfg_scale field; causal=true
  • data/tsimulation_full.yaml: 750-episode circle_750 dataset, batch_size=16
  • evaluator/eval_dfot_val.yaml + eval_dfot_full.yaml: cfg_scale + ar_chunk_size + ar_step_size knobs
  • callbacks/ckpt_attn_dropout.yaml: composed callback (checkpoints + random_attn_dropout with values [0.1, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98])
  • eval/eval_dfot_val.py: 96x96 -> 512x512 nearest-neighbor upscale, palette (gt=green, chunk=red, ar=yellow), world-coord pixel scaling, threaded cfg_scale

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dfot scripts: 200/400ep sbatch variants + eval CLI tools

  • sbatch_train_dfot_200ep_full750_pace.sh: minor cleanup
  • sbatch_train_dfot_400ep_full750_pace.sh: 400ep H200 launch with scheduler.max_steps=18800 (fixes the 200ep cosine-not-decaying bug)
  • sbatch_train_dfot_400ep_attndrop.sh: 400ep + random_attn_dropout (the 5.6x sim_coverage win)
  • scripts/eval_cfg_latest.py: post-hoc DFoT eval CLI with --cfg-scale, --ar-chunk-size, --ar-step-size, --ar-inference-chunk-size, --ar-inference-step-size, --skip-val/--skip-sim
  • scripts/eval_fsd_latest.py: convenience wrapper for inference_mode=chunk
  • scripts/sbatch_fsd_eval.sh + sbatch_sim_sweep.sh: sbatch templates for closed-loop sim sweeps

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

outer_stage + loss: abstract base classes for OuterStage/Loss refactor

Introduces the abstract bases for the upcoming refactor:

  • egomimic/algo/outer_stage.py: OuterStage base. Owns an
    inner_stage field (the trunk). Subclasses implement encode
    (raw batch -> trunk-input tensor; can sample noise / record state
    on ctx) and decode (trunk-output -> per-modality prediction keys
    on batch).

  • egomimic/algo/loss.py: Loss base + CompositeLoss (weighted
    sum of terms) + MSELoss (per-modality MSE between pred_key and
    target_key). Loss policy becomes data — a hydra config block —
    rather than inheritance.

No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and
the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits
on this branch.
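
For orientation, the encode -> trunk -> decode contract and the "loss as data" idea described above might look roughly like this. It is a torch-free sketch under stated assumptions: lists of floats stand in for tensors, `ToyStage` is hypothetical, and the method signatures are guesses rather than the contents of outer_stage.py / loss.py.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import Callable, Dict, List, Tuple

Batch = Dict[str, List[float]]


class OuterStage(ABC):
    """Owns inner_stage (the trunk); subclasses supply encode and decode."""

    def __init__(self, inner_stage: Callable[[List[float]], List[float]]) -> None:
        self.inner_stage = inner_stage

    @abstractmethod
    def encode(self, batch: Batch, ctx: SimpleNamespace) -> List[float]:
        """Raw batch -> trunk input; may record state on ctx."""

    @abstractmethod
    def decode(self, trunk_out: List[float], batch: Batch, ctx: SimpleNamespace) -> Batch:
        """Trunk output -> prediction keys written onto the batch."""

    def __call__(self, batch: Batch, ctx: SimpleNamespace) -> Batch:
        return self.decode(self.inner_stage(self.encode(batch, ctx)), batch, ctx)


class Loss(ABC):
    @abstractmethod
    def __call__(self, batch: Batch, ctx: SimpleNamespace) -> float: ...


class MSELoss(Loss):
    """MSE between batch[pred_key] and batch[target_key]."""

    def __init__(self, pred_key: str, target_key: str) -> None:
        self.pred_key, self.target_key = pred_key, target_key

    def __call__(self, batch: Batch, ctx: SimpleNamespace) -> float:
        pred, target = batch[self.pred_key], batch[self.target_key]
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class CompositeLoss(Loss):
    """Weighted sum of loss terms; the term list can come from config."""

    def __init__(self, terms: List[Tuple[float, Loss]]) -> None:
        self.terms = terms

    def __call__(self, batch: Batch, ctx: SimpleNamespace) -> float:
        return sum(weight * term(batch, ctx) for weight, term in self.terms)


class ToyStage(OuterStage):  # hypothetical subclass, for illustration only
    def encode(self, batch, ctx):
        ctx.n_tokens = len(batch["obs"])
        return batch["obs"]

    def decode(self, trunk_out, batch, ctx):
        batch["pred_action"] = trunk_out
        return batch


ctx = SimpleNamespace()
batch = {"obs": [1.0, 2.0], "actions": [2.0, 5.0]}
ToyStage(inner_stage=lambda xs: [2.0 * x for x in xs])(batch, ctx)
loss_value = CompositeLoss([(0.5, MSELoss("pred_action", "actions"))])(batch, ctx)
```

Because the loss is assembled from a list of (weight, term) pairs, swapping or re-weighting terms is a config change rather than a new subclass.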

continuous_diffusion: split forward into q_sample + compute_loss

Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss
classes:

  • q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise,
    alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond
    the backbone consumes. No backbone call inside.

  • compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the
    dict from q_sample plus the backbone's v_pred and computes the
    SNR-weighted epsilon-MSE.

  • forward(backbone, x, t, cond): kept as a back-compat wrapper that
    calls q_sample, runs the backbone, then compute_loss. Existing
    callers (DFoT.forward_training) are unaffected.

Bitwise verified equivalent to the prior single-method path via
the included /tmp/test_diffusion_split.py smoke (loss + x_pred match
exactly, same random seed).

discrete_diffusion.py is unsplit for now; current configs use
continuous.
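
The q_sample / compute_loss split can be illustrated with a small torch-free sketch. Everything here is an assumption made for illustration: a cosine schedule with alpha^2 + sigma^2 = 1, v-parameterization (so eps can be recovered as sigma * x_t + alpha * v), a placeholder `weight_fn` standing in for the SNR weighting, and a `precond_scale` default of 1.0. The real continuous_diffusion.py may use a different schedule and weighting.

```python
import math
import random
from typing import Any, Callable, Dict, List


def q_sample(x: List[float], t: float, rng: random.Random,
             precond_scale: float = 1.0) -> Dict[str, Any]:
    """Forward-noising only; no backbone call. Requires 0 < t < 1."""
    alpha_t = math.cos(0.5 * math.pi * t)  # assumed cosine schedule
    sigma_t = math.sin(0.5 * math.pi * t)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    x_t = [alpha_t * xi + sigma_t * ni for xi, ni in zip(x, noise)]
    logsnr = 2.0 * math.log(alpha_t / sigma_t)
    return {"x_t": x_t, "noise": noise, "alpha_t": alpha_t, "sigma_t": sigma_t,
            "logsnr": logsnr, "time_cond": precond_scale * logsnr}


def compute_loss(v_pred: List[float], q_state: Dict[str, Any],
                 weight_fn: Callable[[float], float] = lambda logsnr: 1.0) -> List[float]:
    """Per-token weighted eps-MSE from a v-prediction and the q_sample state."""
    a, s = q_state["alpha_t"], q_state["sigma_t"]
    eps_pred = [s * xt + a * v for xt, v in zip(q_state["x_t"], v_pred)]
    w = weight_fn(q_state["logsnr"])  # placeholder for the SNR weighting
    return [w * (e - n) ** 2 for e, n in zip(eps_pred, q_state["noise"])]


def forward(backbone: Callable[[List[float], float], List[float]],
            x: List[float], t: float, rng: random.Random) -> List[float]:
    """Back-compat wrapper: q_sample, then backbone, then compute_loss."""
    q_state = q_sample(x, t, rng)
    v_pred = backbone(q_state["x_t"], q_state["time_cond"])
    return compute_loss(v_pred, q_state)


x0 = [0.5, -1.0]
q_state = q_sample(x0, t=0.3, rng=random.Random(0))
# An oracle v-prediction should drive the eps-MSE to (numerically) zero.
v_true = [q_state["alpha_t"] * n - q_state["sigma_t"] * xi
          for n, xi in zip(q_state["noise"], x0)]
oracle_loss = compute_loss(v_true, q_state)
```

Keeping the backbone call out of q_sample is what lets an outer stage own the noising step while a separate loss object owns the weighting.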

dfot outer_stage + DFoTLoss: training-path classes (loss equivalence verified)

  • egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass.
    encode: encode obs to per-token cond, sample noise levels, run
    diffusion.q_sample, store q_state + external_cond on ctx, return
    noisy x_t. decode: write batch[pred_v] for the loss to read.
    forward override threads cu_seqlens/max_seqlen (packed mode) and
    time_cond into the backbone call.

  • egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and
    ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE),
    reduces to scalar.

Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss
through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to
0.0e+00 difference at fixed seed. Real CondEncoderModule +
DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math
path.

Algo class (DFoT.forward_training) is NOT yet wired to use these —
that comes in the next commit on this branch. Inference paths
(closed-loop AR sample_step, chunk plan-execute) also deferred.

dfot algo + yaml: refactor to outer_stage + loss

Algo class:

  • __init__ now takes outer_stage: DFoTOuterStage and optional
    loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None).
  • Removes legacy cond_encoder, backbone, diffusion_type,
    diffusion_kwargs, cond_output_key args — they now live on the
    outer_stage subblock.
  • Adds @property accessors for cond_encoder, backbone,
    diffusion, outer_stage, loss so existing inference paths
    (_inference_step_ar, _inference_step_chunk, _sample_chunk,
    forward_eval) keep working unchanged via property forwarding.
  • forward_training shrinks ~40 LOC -> ~20 LOC: build ctx, call
    outer_stage(batch, ctx), call loss(batch, ctx). No more inline
    diffusion math; no more cu_seqlens threading at this level.
  • Adds ar_inference_step_size knob.
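
The property-forwarding pattern in the list above can be sketched as follows. `OuterStageStub` and `AlgoStub` are hypothetical stand-ins (plain objects instead of nn.Module); the only detail taken from this PR is that the backbone now lives under outer_stage.inner_stage.

```python
class OuterStageStub:
    """Stand-in for DFoTOuterStage: owns the submodules after the refactor."""

    def __init__(self, cond_encoder, backbone, diffusion):
        self.cond_encoder = cond_encoder
        self.inner_stage = backbone  # the trunk is stored as inner_stage
        self.diffusion = diffusion


class AlgoStub:
    """Stand-in for the algo class: keeps old attribute names readable."""

    def __init__(self, outer_stage):
        self.nets = {"outer_stage": outer_stage}

    @property
    def outer_stage(self):
        return self.nets["outer_stage"]

    @property
    def cond_encoder(self):
        return self.outer_stage.cond_encoder

    @property
    def backbone(self):
        # Legacy name forwards to the relocated trunk.
        return self.outer_stage.inner_stage

    @property
    def diffusion(self):
        return self.outer_stage.diffusion


encoder, trunk, diffusion = object(), object(), object()
algo = AlgoStub(OuterStageStub(encoder, trunk, diffusion))
```

Read-only properties keep call sites such as `self.backbone(...)` working without registering the submodules a second time, so the state_dict contains each parameter only under the outer_stage prefix.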

dfot_pushshapes.yaml:

  • New outer_stage: block wraps cond_encoder + backbone + diffusion.
  • Removes top-level diffusion_type / diffusion_kwargs; the
    diffusion module is now its own _target_ inside outer_stage.
  • loss: omitted (uses default DFoTLoss(outer_stage.diffusion)).

End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the
config instantiates via hydra and forward_training emits a finite
scalar loss. Bitwise loss-equivalence was already shown in the prior
commit (test_dfot_outer_stage.py).

Old checkpoints WILL NOT load — state_dict keys moved from
nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.*
This is intentional per the agreed clean-break refactor.

dfot inference smoke: verify AR + chunk paths after outer_stage refactor

Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT
from dfot_pushshapes.yaml, runs inference_step in both ar and chunk
modes, asserts action is (action_dim,) and finite. Verifies the
@property accessors (self.backbone, self.cond_encoder, self.diffusion)
forward correctly to outer_stage submodules so the closed-loop AR
and chunk-mode inference paths keep working after the refactor.

Passing on compute node 8997316:
[ar] action @ t=0: [0.30 0.47]
[ar] action @ t=1: [0.44 1.06]
[chunk] action @ t=0: [-0.78 -1.89]

hnet_outer_stage + HNetLoss: H-Net OuterStage subclass (bitwise verified)

  • egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits
    from OuterStage with inner_stage = HNetCore (stage tree). Owns
    cond_encoder, input_modules (summed per-token contributions),
    action_out head. Three forward paths inherited from the old
    HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed),
    generate (offline AR), init_step_state + step (online single-tick).

  • egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] +
    batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux.

  • scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates
    the existing hnet_pushshapes.yaml subcomponents, wraps the SAME
    instances in HNetPolicy and HNetOuterStage, ties their action_out
    heads, runs identical padded forward. Verified bitwise (max diff
    0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests
    the step inference path (shape + finite).

The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet
removed. Algo-class refactor and yaml updates come in follow-up
commits on this branch.

Old checkpoints will NOT load — state_dict keys move from policy.*
to outer_stage.* (or wherever the algo class places it). Per clean-
break policy.

hnet algo + yaml: refactor to outer_stage + loss (base config)

Algo class:

  • HNet.__init__ now takes outer_stage: HNetOuterStage + loss: Optional[Loss]
    instead of cond_encoder + hnet + action_dim + action_horizon +
    d_model + action_head_type + input_modules. action_horizon read from
    outer_stage.
  • Loss defaults to HNetLoss() if not provided.
  • self.nets is now ModuleDict({outer_stage, loss}); old keys
    (self.nets[policy], self.nets[cond_encoder], ...) are exposed via
    @property forwarding to outer_stage submodules so legacy callsites
    in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step
    inference keep working.
  • forward_training builds (batch, ctx), calls outer_stage(batch, ctx)
    + loss(batch, ctx), unpacks per-term breakdown (ctx.action_loss,
    ctx.ratio_loss) into the predictions dict for logging.

HNetLoss:

  • Computes action MSE + ratio_loss_from_aux(ctx.aux) and stashes the
    per-term split on ctx for the algo to log separately.

HNetOuterStage:

  • Adds back-compat bridge methods forward_padded(actions, obs) and
    forward_packed(actions, obs, cu, msl) returning (pred, aux). The
    forward_eval / _teacher_forced_packed paths use these; .generate /
    .step / .init_step_state already had matching signatures.

hnet_pushshapes.yaml:

  • New outer_stage: block wraps cond_encoder + hnet stage tree +
    input_modules + action_head_type. Top-level keeps training-recipe
    knobs (init_weights_range, lr_multipliers, ...) and embodiment
    wiring.
  • loss: block omitted (defaults to HNetLoss()).

scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke.
Verified passing on H200 alloc 8989249 — produces action_loss 1.51 +
ratio_loss 0.032 + chunker stats for a 2-episode packed batch
(T=12+20). Padded mode hits a pre-existing torch SDPA error
(Explicit attn_mask should not be set when is_causal=True) in
train mode — this is in the trunk code, not introduced by the
refactor (the production training uses packed mode and never hits
the padded-train path).

Old HNetPolicy class is still in algo/hnet.py for now (no longer
used by HNet algo); will be removed in a cleanup commit once all
stage-based + flat yamls are migrated.

hnet yamls: migrate remaining 6 stage-based configs to outer_stage schema

Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied
to the variant configs. Each yaml moves cond_encoder + hnet stage tree
+ (optional) input_modules + action_head_type under outer_stage; keeps
training-recipe knobs + embodiment wiring at the top level.

Configs migrated:

  • hnet_pushshapes_big.yaml (d_model 256, 21M params)
  • hnet_pushshapes_crossattn.yaml (cond_mode: cross_attn)
  • hnet_pushshapes_mamba_encdec.yaml (M8 encoder/decoder)
  • hnet_pushshapes_obs_ar.yaml (ObsToken input module)
  • hnet_pushshapes_obs_ar_large.yaml (ObsToken + d_model 256 + T8)
  • hnet_pushshapes_recipe.yaml (H-Net paper recipe)

scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified
on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants)
instantiate from hydra config and produce sensible param counts
(5.5M baseline up to 42.8M obs_ar_large).

flat_fused_outer_stage + HNetFused thin alias + 3 flat yamls migrated

  • egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class.
    Structurally a rename of FlatFusedPolicy with OuterStage inheritance
    plus an OuterStage forward(batch, ctx) dispatcher delegating to the
    existing forward_padded / forward_packed. Legacy generate / step /
    init_step_state preserved verbatim. encode / decode raise
    NotImplementedError since the interleaved 2T-token flow does not
    cleanly split along encode -> trunk -> decode.

  • egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass
    of HNet, kept as a separate _target_ for the existing flat yamls.
    All flat-fused behavior moved into FlatFusedOuterStage; HNet.__init__
    already tolerates outer_stage.inner_stage=None.

  • 3 flat yamls migrated to outer_stage schema:
    hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml,
    hnet_pushshapes_fused_pusher.yaml.

  • scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net
    configs. Verified on H200 alloc 8989249: 10/10 instantiate
    successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param
    counts sensible.

Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of
unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.

forward_training smoke for all 10 H-Net yamls + mamba regression check

eval: migrate algo.nets[] accesses to property accessors (refactor follow-up)

PACT: VAE + obs+action+image DFoT + bundle-aware evals + bcrnn

Joint state+vae_latent+action diffusion forcing.

DiT3D video diffusion: all fixes + VAE v6 config

  • DiT3D backbone with AdaLN-Zero, RoPE 3D (fixed rotate_half)
  • Latent normalization for VAE bias correction
  • No-padding per-episode backbone processing
  • Additive conditioning fusion (matches reference)
  • patch_size=2, noise-MSE loss, matched DiTBlock residual
  • VAE v6 config with KL beta=0.001

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WIP checkpoint: obs-action DFoT 2D policy + spatial_rh sim controller + new evals (pre-hnet-variants merge)

snapshot_pre-restack_2026-05-31_decouple_spatial_tf_evals

pre-restructure snapshot (DESIGN.md step 0)

restructure: sweep root debris -> scratch/ (DESIGN.md step 1)

Move WIP-session / sibling-repo dead weight out of repo root into a
gitignored scratch/ archive (MOVES not deletes; tracked files via git mv
so history is preserved at old paths). Adds scratch/MANIFEST.md and
ignores /scratch/.

Swept (67 files):

  • 37 .sh experiment runners (eval_/train_/smoke_/sim_/launch_/etc.)
  • 11 patch_*.py monkeypatch scripts
  • 12 root debug_*/test_*.py ad-hoc scripts
  • 7 png/mp4 render dumps (4 mp4 were untracked->plain mv)

Deviation from DESIGN literal counts (40 .sh / 9 png/mp4): 5 of those are
original-repo files, not sibling debris, so KEPT in root:
pull_models.sh, run_eva_docker.sh, setup_nvm.sh (infra, git-added 2025),
convention.png, mano_keypoints.png (embedded in CONTRIBUTING_DATA.md).
Design's stated intent ("sibling-repo dead weight") preserved exactly.
See scratch/MANIFEST.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: collapse H-Net to models/hnet + flip hnet_core import (DESIGN steps 3-4)

Step 3 (collapse H-Net):

  • git mv egomimic/models/hnet_nets -> egomimic/models/hnet (the pact SUPERSET
    tree: cross-attn + residual_scale + causal_conv1d + adaln_per_token).
  • DELETE egomimic/models/bc_rnn_nets/_hnet_vendored/ entirely (inferior subset
    dup; its config/context/routing were byte-identical to the superset, so git
    attributes those 3 as renames into models/hnet).
  • Rewrite intra-tree imports inside models/hnet/ from
    egomimic.models.hnet_nets.X -> egomimic.models.hnet.X.
  • Leave egomimic/models/hnet_nets/ as a thin facade shim: a new __init__.py
    that aliases each models.hnet.* into sys.modules under the legacy
    hnet_nets.* key (so both top-level and submodule-path imports -- incl.
    private symbols the tests import -- resolve to the SAME live module object)
    and re-exports the top-level names. Keeps all legacy import paths alive
    until the step-13 flip.

Step 4 (flip hnet_core import):

  • egomimic/models/bc_rnn_nets/hnet_core.py: imports flipped from
    bc_rnn_nets._hnet_vendored.{context,hnet,stages} -> egomimic.models.hnet.*.
    Superset extra flags (cross-attn / AdaLN / window) all default OFF, so the
    obs-only HNetCore never touches the diverged paths.
  • Update bc_rnn_nets/__init__.py docstring to reflect the collapse.

Verification (A40, fixed-seed HNetCore forward, baseline captured in a
worktree at the pre-step-3 commit using _hnet_vendored vs post-flip using
models.hnet):

  • BIT-IDENTICAL: state_dict_sha256 560955..a197dc0 and output_sha256
    772f6f..f6fe005 match exactly pre/post; 12,163,840 params / 121 tensors.
  • tests/test_hnet_nets.py: 57 passed (imports via shim).
  • All 7 BC-RNN paperexact configs compose (hydra --cfg job).
  • 20/20 hnet_nets consumers import clean (DFoT family, algo/hnet family,
    bc_rnn, act/hpt, both callbacks) via the shim.
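
A pre/post fingerprint comparison of this kind can be sketched with the standard library alone. The hashing scheme below (sorted names, little-endian float64 bytes) and the `build_params` helper are assumptions for illustration; the commit does not state how state_dict_sha256 was actually computed.

```python
import hashlib
import random
import struct
from typing import Dict, List


def fingerprint(tensors: Dict[str, List[float]]) -> str:
    """SHA-256 over parameter names and values in a fixed, sorted order."""
    digest = hashlib.sha256()
    for name in sorted(tensors):
        values = tensors[name]
        digest.update(name.encode("utf-8"))
        digest.update(struct.pack(f"<{len(values)}d", *values))
    return digest.hexdigest()


def build_params(seed: int) -> Dict[str, List[float]]:
    """Hypothetical fixed-seed 'model build' returning named flat parameters."""
    rng = random.Random(seed)
    return {
        "core.weight": [rng.gauss(0.0, 1.0) for _ in range(8)],
        "core.bias": [0.0] * 4,
    }


before = fingerprint(build_params(seed=0))  # e.g. captured at the old commit
after = fingerprint(build_params(seed=0))   # e.g. recomputed after the move
other = fingerprint(build_params(seed=1))
```

Sorting by name makes the digest independent of dict insertion order, so a pure relocation that reorders construction does not produce a false mismatch.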

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: resolve BC-RNN algos -> flat algo/bc.py + WindowedBC (DESIGN step 5, amended)

Amended step 5 (user override of DESIGN's algo/bc/ package): the active BC
algo stays ONE FLAT FILE -- no package, no algo/bc/ directory. Backbone
(lstm/transformer/hnet) switching stays purely config-side via the existing
core_net knob.

Changes:

  • git mv egomimic/algo/bc_rnn.py -> egomimic/algo/bc.py (flat module; history
    preserved).
  • Rename classes BCRNN -> WindowedBC, BCRNNPolicy -> WindowedBCPolicy. Old
    names kept as module-level aliases (BCRNN = WindowedBC, BCRNNPolicy =
    WindowedBCPolicy) -> same class objects, so isinstance/pickle/Hydra resolve
    unchanged. Internal instantiation + name-bearing error/doc strings updated.
  • Add egomimic/algo/bc_rnn.py import shim re-exporting the full public surface
    from egomimic.algo.bc (BCRNN/BCRNNPolicy/WindowedBC/WindowedBCPolicy +
    _cut_windows/_cut_windows_strided/_pack_to_padded), so the legacy import path
    and any _target_: egomimic.algo.bc_rnn.BCRNN config keep resolving.
  • Repoint the 7 BC-RNN configs' _target_ to egomimic.algo.bc.WindowedBC
    (old egomimic.algo.bc_rnn.BCRNN still works via shim + alias).
  • Quarantine the OTHER, name-colliding duplicate algo to scratch/ (git mv,
    history preserved): egomimic/algo/bcrnn/{__init__,algo,outer_stage}.py +
    its only config egomimic/hydra_configs/model/bcrnn_pushshapes.yaml ->
    scratch/algo_bcrnn/. That dup is a separate robomimic-BC_RNN reimpl on the
    Algo/OuterStage spine, not wired into the kept pipeline, with a stale config
    (references pre-collapse egomimic.models.hnet_nets.* paths). Logged in
    scratch/MANIFEST.md with a REQUEST-DELETE entry (user's call; not auto-deleted).

Verification (a40 compute node, sibling .venv):

  • import egomimic + egomimic.algo.{bc,bc_rnn,dfot,hpt,act,hnet,algo} clean;
    WindowedBC is an HNet subclass; aliases are object-identical; shim re-exports
    the same objects.
  • All 7 BC-RNN configs compose (hydra --cfg job) -> _target_:
    egomimic.algo.bc.WindowedBC.
  • LSTM policy built via OLD target (egomimic.algo.bc_rnn.BCRNN, shim+alias)
    AND NEW target (egomimic.algo.bc.WindowedBC) under a fixed seed:
    state_dict torch.equal across all 137 tensors (23,511,752 params).
  • No dangling egomimic.algo.bcrnn refs in the active tree; no untracked
    non-ignored files.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: relocate bc_rnn_nets to role homes + tests suite (DESIGN steps 6, 6.5)

DESIGN.md step 6: git mv the bc_rnn_nets members to their role homes and keep a
bc_rnn_nets/__init__ FACADE re-exporting everything from the new homes (the old
import paths + yaml _target_ submodule paths stay alive until step 13).

models/stems/ obs_encoder.py, visual_core.py
models/cores/ lstm_core.py, transformer_core.py, hnet_core.py
models/heads/ gmm_head.py, query_decoder.py

All 7 moves are R100 (pure git mv, content byte-identical). The facade aliases
each legacy submodule into sys.modules under egomimic.models.bc_rnn_nets.*
(same mechanism as the step-3 hnet_nets shim) so package-name imports, submodule
imports, and yaml _target_ paths all resolve to the SAME role-home module
objects. Removed the leftover empty _hnet_vendored/ dir from the step-3 collapse.

Flipped the 7 BC-RNN configs' _target_s to the role paths (ObsEncoder ->
stems.obs_encoder, visual_core.VisualCore -> stems.visual_core, LSTMCore ->
cores.lstm_core, TransformerCore -> cores.transformer_core, HNetCore ->
cores.hnet_core, GMMActionHead -> heads.gmm_head, QueryActionDecoder ->
heads.query_decoder).

DESIGN.md amendment 6.5: distill the session proof patterns into a pytest suite
(GPU-alloc runnable; all force CPU for determinism):

  • tests/test_core_defaults_byte_identical.py -- lstm/tx/hnet construct +
    torch.equal across two fixed-seed builds + match committed ref fingerprints.
  • tests/test_causality.py -- TX + HNet prefix-consistency; TX future-perturb
    no-leak; query-decoder future-perturbation EXACT-ZERO (torch.equal).
  • tests/test_train_rollout_parity.py -- forward vs sequential step() for all 3
    cores + the chunk8 query-decoder queue replay.
  • tests/test_config_compose.py -- all 7 BC-RNN (+ legacy-path assertion) + 13
    dfot + 5 vae configs compose through train_zarr_cartesian.
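
The causality and train/rollout parity patterns named above can be shown on a toy model. This sketch swaps the real cores for a hypothetical exponential-moving-average "core" in plain Python; the three check functions mirror the test ideas (prefix consistency, future-perturbation no-leak, forward vs sequential step) but are not the suite's code.

```python
from typing import List, Tuple


def forward(xs: List[float], decay: float = 0.5) -> List[float]:
    """Whole-sequence pass: output t is a weighted sum over inputs 0..t only."""
    return [
        sum((1.0 - decay) * decay ** (t - k) * xs[k] for k in range(t + 1))
        for t in range(len(xs))
    ]


def step(state: float, x: float, decay: float = 0.5) -> Tuple[float, float]:
    """Single-tick recurrent update; returns (new_state, output)."""
    new_state = decay * state + (1.0 - decay) * x
    return new_state, new_state


def prefix_consistent(xs: List[float]) -> bool:
    """Running on a prefix must reproduce the prefix of the full-run output."""
    full = forward(xs)
    return all(forward(xs[:k]) == full[:k] for k in range(1, len(xs) + 1))


def no_future_leak(xs: List[float], t: int, delta: float = 100.0) -> bool:
    """Perturbing inputs after t must leave outputs up to t exactly unchanged."""
    perturbed = xs[: t + 1] + [x + delta for x in xs[t + 1:]]
    return forward(perturbed)[: t + 1] == forward(xs)[: t + 1]


def forward_matches_step(xs: List[float], tol: float = 1e-12) -> bool:
    """Sequential step() rollout must agree with the whole-sequence forward."""
    state, outs = 0.0, []
    for x in xs:
        state, y = step(state, x)
        outs.append(y)
    return all(abs(a - b) <= tol for a, b in zip(forward(xs), outs))


sequence = [1.0, -2.0, 3.0, 0.5]
```

The no-leak check uses exact equality on purpose: a causal model should not change its earlier outputs at all, whereas forward/step parity is compared within a tolerance because the two paths accumulate floating-point error differently.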

Verification (a40 alloc 3325503): 40/40 new tests GREEN; 57/57 test_hnet_nets
GREEN; all 7 BC-RNN configs --cfg job compose; import egomimic + algo.bc +
models.hnet + hnet_nets shim + algo.dfot clean. The 8 reds in the wider suite
(test_training_recipe::TestAlgoWiring algo.hnet outer_stage signature drift;
test_packed_pipeline missing on-disk zarr) are pre-existing -- reproduced
identically at the step-5 commit ccff845, untouched by this move.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: relocate DFoT pieces -> models/diffusion + algo/diffusion (DESIGN step 7)

Split egomimic/algo/dfot into its model + algo halves via git mv:

  • model pieces -> egomimic/models/diffusion/
    backbones/{backbone,dit3d_backbone,spatial_backbone}
    diffusion/{continuous_diffusion,discrete_diffusion,noise_schedule}
    embeddings.py, sampling.py
  • algo pieces -> egomimic/algo/diffusion/
    algo.py, outer_stages/{outer_stage + 9 *_outer_stage}
    vae_algo.py (was egomimic/algo/vae/algo.py)

Intra-tree imports rewritten to the new role homes (backbones import
models.diffusion.embeddings + models.hnet.isotropic_builder; algo imports
models.diffusion.{backbones,diffusion,sampling} + models.hnet.cond_encoders).

algo/dfot/__init__ and algo/vae/__init__ kept ALIVE as thin facades: every
legacy egomimic.algo.dfot.* / egomimic.algo.vae.algo dotted path is
registered in sys.modules pointing at the real relocated module, so the yaml
_target_s (algo.dfot.DFoT, algo.dfot.outer_stage.DFoTOuterStage,
algo.dfot.{continuous,discrete}_diffusion.*, algo.vae.VAE) all still resolve.
Shim identity verified (algo.dfot.DFoT is algo.diffusion.DFoT).

Verify: 25/25 test_config_compose (13 DFoT + 5 VAE + 7 BC-RNN); import
egomimic + DFoT/VAE relocation imports clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: home zoo + curate eval into role buckets (DESIGN step 8)

Zoo (git mv, no behaviour change):

  • egomimic/algo/{act,hpt,pi}.py -> egomimic/algo/zoo/{act,hpt,pi}.py
  • flat egomimic/algo/{act,hpt,pi}.py kept as thin re-export shims so the
    yaml _target_s (egomimic.algo.act.ACT / .hpt.HPT / .pi.PI) still resolve.
  • algo/zoo/__init__ lazy-imports PI (optional openpi dep).
  • co-located algo/test_pi.py moved with pi.py -> algo/zoo/test_pi.py (kept
    out of the tests/ suite, as before: it requires openpi).

Eval curated into egomimic/eval/{core,tf,dfot,probes,zoo}/ (git mv):

  • core/ eval, eval_video, eval_composite, eval_sim, eval_hnet, eval_vae_recon
  • tf/ eval_dfot_val, eval_dfot_controller_tf
  • dfot/ the 7 DFoT self/video/policy rollout evaluators
  • probes/ eval_boundary_strip, eval_pca_tokens
  • zoo/ eval_act, eval_hpt, eval_pi

Inter-eval imports rewritten to the bucketed paths; the dfot evals' imports of
the DFoT model pieces flipped to canonical egomimic.models.diffusion.* (off
the algo.dfot shim).

EDITED the ~20 evaluator-yaml _target_s DIRECTLY to the bucketed eval paths
(DESIGN warns _target_ resolution is weaker through __init__ shims), e.g.
egomimic.eval.eval_sim.PackedSimEval -> egomimic.eval.core.eval_sim.PackedSimEval.

eval/__init__ kept ALIVE as a facade: every legacy egomimic.eval.eval_*
PYTHON import path is sys.modules-aliased to its bucketed module, so the code
consumers (trainHydra: egomimic.eval.eval.Eval; scripts/ smoke+verify helpers)
keep working until the final flip (step 13).

Verify: 25/25 test_config_compose; 7 BC-RNN compose via hydra --cfg job;
20/20 evaluator yamls compose + every egomimic.eval.* _target_ resolves;
import egomimic + DFoT/zoo/eval-bucket imports clean; full suite unchanged
from baseline (8 pre-existing fails: 7 TestAlgoWiring sig drift + 1 data-missing
packed_pipeline; no new failures).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

restructure: close BC-RNN sim-eval gap (DESIGN steps 9-10)

Step 9 (get_keymap_eval):

  • Add get_keymap_eval() to egomimic/rldb/embodiment/pushshapes.py: get_keymap()
    plus a goal_pose passthrough keyed key_type="goal_keys". goal_keys is NOT in
    MultiDataset.NORMALIZE_KEY_TYPES, so goal_pose is read into the packed batch
    raw/un-normalized and passed straight through to PackedSimEval, which reads
    batch["goal_pose"] in batch_to_env_init to set the env goal. Serves both
    circle proxies. The 7 BC-RNN launchers already point KM at this symbol.
  • Deviation from the EgoVerse2 reference: omit the extra init_action passthrough.
    pact-2's eval uses init_mode="replay" and never reads init_action (verified: no
    reference in egomimic/eval or egomimic/algo), so it would be dead weight. The
    ~17-line design target counted EV2's init_action; documented inline.
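
The goal_pose passthrough can be pictured with a small dictionary-based sketch. Only the key names (front_img_1, state_agent_obj, actions, goal_pose) and the goal_keys type come from this commit; the keymap layout, the other key_type names, and the contents of `NORMALIZE_KEY_TYPES` below are hypothetical and do not reflect the real MultiDataset or pushshapes.py.

```python
from typing import Dict, List, Set

# Hypothetical contents; the point is only that "goal_keys" is NOT a member.
NORMALIZE_KEY_TYPES: Set[str] = {"proprio_keys", "action_keys"}


def get_keymap() -> Dict[str, Dict[str, str]]:
    """Training keymap (key_type names here are illustrative guesses)."""
    return {
        "front_img_1": {"key_type": "camera_keys"},
        "state_agent_obj": {"key_type": "proprio_keys"},
        "actions": {"key_type": "action_keys"},
    }


def get_keymap_eval() -> Dict[str, Dict[str, str]]:
    """Training keymap plus a raw goal_pose passthrough for sim eval."""
    keymap = dict(get_keymap())
    keymap["goal_pose"] = {"key_type": "goal_keys"}
    return keymap


def keys_to_normalize(keymap: Dict[str, Dict[str, str]]) -> List[str]:
    """Keys whose key_type opts them into normalization."""
    return [name for name, spec in keymap.items()
            if spec["key_type"] in NORMALIZE_KEY_TYPES]


eval_keys = list(get_keymap_eval())
normalized = keys_to_normalize(get_keymap_eval())
```

Routing goal_pose through a key type that is outside the normalization set is what lets the evaluator read it back in world units when it initializes the environment.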

Step 10 (close sim-eval gap):

  • Add T_max=None kwarg to WindowedBC.inference_step so the PackedSimEval call
    inference_step(obs_zarr, t, emb_id, T_max=self.max_steps) (eval_sim.py:251) no
    longer TypeErrors. T_max is the sim rollout horizon, a different quantity from
    the policy's action-queue length, so the internal init_step_state buffer is
    still sized from policy.action_horizon; T_max is accepted/tolerated to match
    the eval contract (matches DFoT.inference_step, which already takes T_max).
  • Strip the unsupported evaluator.rollout_mode=ar override from all 7 BC-RNN
    launchers. eval_hnet_sim.yaml (HNetSimEval/PackedSimEval) has no rollout_mode
    key and drives AR natively via the per-token inference_step, so the override
    raised ConfigAttributeError. No other eval-only override (delta_action/
    temporal_ensemble/chunk_k/goal_in_obs) remains in the launchers.
  • Fix WindowedBC missing train_obs_transforms (the actual blocker on the
    headline). WindowedBC.__init__ calls Algo.__init__ (not HNet.__init__), so
    the inherited HNet.process_batch_for_training (hnet.py:889) hit
    AttributeError: 'WindowedBC' object has no attribute 'train_obs_transforms'
    on BOTH the train and the validation paths, before any rollout. Initialize
    self.train_obs_transforms = [] in WindowedBC.__init__: the empty list makes
    the if self.train_obs_transforms and self.outer_stage.training guard
    short-circuit, also sidestepping the (nonexistent) outer_stage. WindowedBC
    has no train-only obs augmentation, so [] is the correct value. Pre-existing
    latent bug from the H-Net restructure (steps 3-8); surfaced only now because
    this is the first time BC-RNN sim eval actually runs.
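
The failure mode and the fix described in the last bullet can be reproduced with stub classes. The class bodies below are simplified stand-ins (the `Broken`/`Fixed` names and the dummy return value are hypothetical); only the shape of the bug, a grandparent `__init__` call that skips the parent's attribute setup, and the tolerated `T_max` kwarg follow the commit text.

```python
class Algo:
    def __init__(self) -> None:
        self.nets = {}


class HNet(Algo):
    def __init__(self, outer_stage) -> None:
        super().__init__()
        self.outer_stage = outer_stage
        self.train_obs_transforms = [lambda batch: batch]

    def process_batch_for_training(self, batch):
        # Guard reads train_obs_transforms first, then outer_stage.training.
        if self.train_obs_transforms and self.outer_stage.training:
            for transform in self.train_obs_transforms:
                batch = transform(batch)
        return batch


class BrokenWindowedBC(HNet):
    def __init__(self) -> None:
        Algo.__init__(self)  # skips HNet.__init__: attribute never set


class FixedWindowedBC(HNet):
    def __init__(self) -> None:
        Algo.__init__(self)
        # Empty list short-circuits the guard before outer_stage is touched.
        self.train_obs_transforms = []

    def inference_step(self, obs, t, emb_id, T_max=None):
        # T_max (sim rollout horizon) is accepted to match the eval call
        # signature but is not used to size any internal buffer here.
        return [0.0, 0.0]  # dummy action for the sketch


def raises_attribute_error(algo) -> bool:
    try:
        algo.process_batch_for_training({"obs": 1})
    except AttributeError:
        return True
    return False
```

Because `and` short-circuits, the fixed class never evaluates `self.outer_stage.training`, which is why an empty list is enough even though this subclass has no outer_stage.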

Verification:

  • All 7 BC-RNN configs compose (hydra --cfg job) with evaluator=eval_hnet_sim +
    get_keymap_eval KM and no rollout_mode=ar.
  • import egomimic + DFoT (algo.diffusion) + zoo (algo.zoo.hpt) clean;
    WindowedBC/BCRNN alias intact.
  • get_keymap_eval() returns keys [front_img_1, state_agent_obj, actions,
    goal_pose].
  • HEADLINE: a 1-batch REAL closed-loop sim eval (mode=eval, trainer.validate,
    bc_rnn_pushshapes_paperexact, max_steps=8, 1 val batch, random-init weights)
    ran end-to-end on an A40 and LOGGED A COVERAGE NUMBER for the first time:
    Valid/emb15_sim_coverage = 0.0
    Valid/emb15_sim_success_rate = 0.0
    EVAL_EXIT=0. Coverage 0.0 is expected for an untrained model; the point is
    the eval stack now runs the full loop through inference_step(...,T_max=) and
    the goal_pose passthrough. (mode=eval used to bypass the unrelated training
    loop; the WindowedBC fix above was required to get past validation_step.)
  • tests/: 122 passed, 3 skipped; the 8 failures (TestAlgoWiring x7 +
    test_packed_pipeline full-pipeline-stats) are PRE-EXISTING at HEAD (proven by
    re-running with these changes stashed) - HNet.__init__ now requires an
    outer_stage arg the older fixtures don't pass. Tracked separately.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

structural fixes: input_modules->stems, packed_base rename, flat_fused quarantine

PHASE 1 structural fixes (post-DESIGN step 10), all pure moves + compat shims:

  1. git mv egomimic/algo/input_modules.py -> egomimic/models/stems/input_modules.py
    (DESIGN stems role home). Fixed its internal import to the canonical
    post-collapse home models.hnet.cond_encoders (was models.hnet_nets.*).
    Compat shim left at algo/input_modules.py re-exporting all 3 classes; updated
    direct importers (algo/packed_base.py, algo/hnet_outer_stage.py) and the two
    obs_ar config target paths to the new home.

  2. git mv egomimic/algo/zoo/test_pi.py -> tests/test_pi.py and guarded it with
    pytest.importorskip("openpi") so it SKIPS cleanly (was a collection ERROR; the
    PI algo needs the optional openpi pkg, absent in the default venv).

  3. Quarantined dormant B-family flat-fused legacy -> scratch/flat_fused_quarantine/
    (flat_fused_outer_stage.py + 3 hnet_pushshapes_fused*.yaml), unreferenced by
    pact-2 mission. HNetFused stays as dormant dead code in packed_base.py;
    MANIFEST.md + REQUEST-DELETE entry added.

  4. git mv egomimic/algo/hnet.py -> egomimic/algo/packed_base.py (role-clarifying:
    per-emb-norm + packed-path base, NOT the models/hnet/ stage tree). Class names
    unchanged (HNet stays HNet). Compat shim at algo/hnet.py re-exports the full
    surface; updated direct importers (bc.py, test_training_recipe.py). Configs
    keep using egomimic.algo.hnet.* via the shim.

bc_rnn.py shim untouched (step 13). Verified: import smoke (shim identity ==
canonical), 7 hnet configs compose, tests/ at baseline (no new failures;
test_pi now skips cleanly).
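
A rough sketch of what "shim identity == canonical" checks, assuming the shim re-exports rather than redefines. The module names here are invented for the demo (built in memory with types.ModuleType); they are not the real egomimic modules.

```python
import sys
import types

# Hypothetical canonical module, standing in for the post-move home.
canonical = types.ModuleType("demo_pkg_packed_base")


class HNet:
    """Placeholder class; only the name is borrowed from the commit."""


canonical.HNet = HNet
sys.modules[canonical.__name__] = canonical

# Hypothetical compat shim, standing in for the pre-move import path.
# It binds the same object under the old module name instead of copying it.
shim = types.ModuleType("demo_pkg_hnet")
shim.HNet = canonical.HNet
sys.modules[shim.__name__] = shim


def shim_is_canonical(name):
    """Import-smoke check: old and new paths must yield the very same object."""
    old = getattr(sys.modules["demo_pkg_hnet"], name)
    new = getattr(sys.modules["demo_pkg_packed_base"], name)
    return old is new
```

Identity (is) is stricter than equality: it rules out a shim that accidentally defines a second, diverging copy of the class.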

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

step 13: final flip — shims deleted, configs mirrored, dormant purge, packed_outer_stage rename

DESIGN.md step 13 + amendments A & B. algo/ END-STATE: no shims, no dormant
code, honest names everywhere.

  • Config mirror: every shim-routed _target_ / import flipped to its real home
    across all hydra configs + non-config importers (66 mechanical + 9 manual).
    algo.hnet.* -> algo.packed_base.*
    algo.bc_rnn.* -> algo.bc.*
    algo.{act,hpt,pi}.* -> algo.zoo.*
    algo.input_modules.* -> models.stems.input_modules.*
    algo.dfot.* -> algo.diffusion.* / models.diffusion.*
    algo.vae.* -> algo.diffusion.{VAE,vae_algo}
    models.hnet_nets.* -> models.hnet.*
    models.bc_rnn_nets.* -> models.{stems,cores,heads}.* (role-routed)
  • Shims DELETED (grep-proven empty first): algo/{act,hpt,pi,bc_rnn,hnet,
    input_modules}.py, algo/dfot/, algo/vae/, models/hnet_nets/__init__.py,
    models/bc_rnn_nets/__init__.py.
  • Amendment A: FlatFusedPolicy + HNetFused purged from packed_base.py
    (1239 -> 935 lines, -304) into scratch/flat_fused_quarantine/.
  • Amendment B: git mv algo/hnet_outer_stage.py -> algo/packed_outer_stage.py;
    importers + 7 hnet configs flipped same commit, no shim.
  • Hygiene: __pycache__/ gitignored; tests/ import real homes directly.
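
The mechanical part of the config mirror amounts to a prefix rewrite on dotted paths. The sketch below is an illustration only: remap_target and PREFIX_MAP are hypothetical names, and the table holds just the one-to-one rows of the mapping above (the role-routed and multi-target rows need manual handling).

```python
# One-to-one rows from the mapping above, with the package prefix spelled out.
PREFIX_MAP = {
    "egomimic.algo.hnet.": "egomimic.algo.packed_base.",
    "egomimic.algo.bc_rnn.": "egomimic.algo.bc.",
    "egomimic.algo.input_modules.": "egomimic.models.stems.input_modules.",
    "egomimic.models.hnet_nets.": "egomimic.models.hnet.",
}


def remap_target(dotted):
    """Rewrite an old dotted path to its new home; unknown paths pass through."""
    # Longest prefix first so a more specific rule is never shadowed.
    for old in sorted(PREFIX_MAP, key=len, reverse=True):
        if dotted.startswith(old):
            return PREFIX_MAP[old] + dotted[len(old):]
    return dotted
```

Keeping the trailing dot in each key prevents algo.hnet from matching algo.hnet_outer_stage, which is renamed separately under Amendment B.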

Verify (a40 alloc): config compose 37/37 PASS; tests 122 pass / 8 pre-existing
fail (identical to pre-flip baseline, ZERO new); state_dict parity LSTM+HNet+
chunk8-Q all torch.equal vs pre-flip; SMOKE=1 train_bc_rnn_hnet.sh TRAIN_EXIT=0.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 1: unify 3 pixel-policy DFoT outer stages into one pixel_mode-parameterized stage

Collapse the three near-duplicate pixel-policy outer-stage classes under
egomimic/algo/diffusion/outer_stages/ into ONE parameterized class
PixelObsActionDFoTOuterStage, selected by the pixel_mode config knob:

  pixel_mode="policy"    <- PixelObsActionPolicyDFoTOuterStage
      (Design A: action broadcast into RGB channels, jointly diffused)
  pixel_mode="regress"   <- PixelObsActionRegressPolicyDFoTOuterStage
      (Design B: RGB-only diffusion + conv action_head off pred x0)
  pixel_mode="decoupled" <- PixelObsActionDecoupledDFoTOuterStage
      (DEC: action as separate DiT3D token with independent noise level)

Each mode reproduces the corresponding old class EXACTLY and preserves the
duck-typed attribute surface the algo inference paths consume (_action_channels
for policy, action_head for regress, decouple_action_noise for decoupled,
plus the mode-correct action_slice). The 3 model configs are mirrored in this
same commit: _target_ -> PixelObsActionDFoTOuterStage + pixel_mode: <mode>.
Old class files moved to scratch/dedup_c1_old_stages/ (gitignored).

PROVEN behavioral equality (fixed seed, a40, srun on overcap alloc; harness at
scratch/proof_dedup_c1.py, all old vs new instantiated from the SAME resolved
sub-configs with identical RNG):

(a) Fixed-seed construction parity — state_dict keys identical AND every tensor
torch.equal:
policy: 82 keys, 7,325,460 params, keys_identical=True, all torch.equal
regress: 90 keys, 7,397,134 params, keys_identical=True, all torch.equal
decoupled: 87 keys, 7,322,894 params, keys_identical=True, all torch.equal
action_slice old==new for every mode (policy slice(3,5); regress/decoupled
slice(0,0)); mode attribute surface present on the unified class.

(b) Forward parity on a fixed-seed packed batch (2 episodes, T=9) — every
output tensor torch.equal (EXACT, no allclose fallback needed):
policy: forward_return, pred_v, qstate_x_t -> torch.equal
regress: forward_return, pred_v, pred_action, loss, x_t -> torch.equal
decoupled: forward_return, pred_v, qstate_x_t, loss -> torch.equal

(c) Hydra-compose of all 3 mirrored configs PASS; composed outer_stage._target_
resolves to egomimic.algo.diffusion.PixelObsActionDFoTOuterStage with the
correct pixel_mode each.
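
The mode-selected attribute surface can be sketched as follows. PixelStageSketch is a hypothetical stand-in, not the real PixelObsActionDFoTOuterStage; the action_slice values are those reported in proof (a), while the placeholder attribute values are assumptions.

```python
class PixelStageSketch:
    """Hypothetical stand-in for a stage parameterized by pixel_mode."""

    # Per-mode action_slice, as listed in the construction-parity proof.
    _ACTION_SLICE = {
        "policy": slice(3, 5),
        "regress": slice(0, 0),
        "decoupled": slice(0, 0),
    }

    def __init__(self, pixel_mode):
        if pixel_mode not in self._ACTION_SLICE:
            raise ValueError(f"unknown pixel_mode: {pixel_mode!r}")
        self.pixel_mode = pixel_mode
        self.action_slice = self._ACTION_SLICE[pixel_mode]
        # Each mode exposes only the duck-typed attribute its old class had.
        if pixel_mode == "policy":
            self._action_channels = 2  # assumed: the width of slice(3, 5)
        elif pixel_mode == "regress":
            self.action_head = object()  # placeholder for the conv head
        else:
            self.decouple_action_noise = True  # assumed to be a flag
```

Because inference paths probe for these attributes by name, a unified class has to avoid exposing, say, action_head in policy mode, or the wrong branch could be taken.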

Regression: pytest tests/ = 122 passed / 8 failed / 4 skipped — identical to the
step-13 baseline. The 8 failures are pre-existing and unrelated (7 TestAlgoWiring
old-HNet-signature, 1 packed_pipeline missing-zarr-data). All 25 config-compose
tests pass, including the 3 mirrored pixel configs. Zero new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 2: move SimpleConv + CondEncoderModule into models/stems/ (models/hnet now pure chunking machinery)

Image-encoder consolidation. Relocates the two input-side encoder modules out
of egomimic/models/hnet/ (which is meant to hold ONLY H-Net chunking
machinery) into their role home egomimic/models/stems/:

git mv egomimic/models/hnet/image_encoders.py egomimic/models/stems/image_encoders.py (SimpleConv)
git mv egomimic/models/hnet/cond_encoders.py egomimic/models/stems/cond_encoders.py (CondEncoderModule)

These are PURE MOVES — no logic edits. The only in-file content change is a
single docstring line in image_encoders.py whose _target_: example path was
updated hnet->stems. cond_encoders.py is byte-identical to its pre-move source.
All 33 references (13 python import sites + 33 yaml _target_ occurrences
across 20 model configs) flipped to the new stems path in this same commit;
hnet/__init__.py and stems/__init__.py re-exports updated; obs_encoder.py
docstring path corrected. After the move git grep shows models/hnet contains
no encoder/stem code (only a prose cross-reference comment in context.py).

PROVEN BEHAVIORAL EQUALITY (must function identically after the edit):
(a) Import-identity via temporary shim, checked with Python's "is" operator; the
shim is then removed in this commit:
OldSimpleConv is NewSimpleConv -> True
OldCondEncoderModule is NewCondEncoderModule -> True
hnet.__init__-resolved CondEncoderModule is new -> True
(b) Byte-identity modulo path lines: cond_encoders.py diff vs pre-move tag is
EMPTY; image_encoders.py diff is exactly ONE line (the _target_ docstring
example path).
(c) Construction state_dict torch.equal (old source extracted from tag
dedup-c2-pre vs new package source, fixed-seed init): all cases equal --
SimpleConv(4ch) 18 keys, SimpleConv(3ch) 14 keys, CondEnc+img 24 keys,
CondEnc+obs 10 keys, CondEnc(empty) 0 keys; plus forward torch.equal=True.
(d) Config mirror in this commit + compose-check: 20/20 affected configs
compose and instantiate the cond_encoder node (20 nodes) via new targets;
broader sweep 56/56 model configs compose, 0 failures.

Regression: pytest tests/ == 122 passed / 8 failed / 4 skipped, the SAME 8
pre-existing TestAlgoWiring + TestInferNormFromPacked failures present on tag
dedup-c2-pre (verified by running the suite on a worktree of the pre-move tag).
ZERO new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 3: factor shared zarr read/decode logic into rldb/zarr/_common.py

The two near-duplicate zarr loader paths (the padded/windowed reader
ZarrDataset.__getitem__ and the packed/span reader ZarrDataset._read_span,
consumed by ZarrEpisodePackedDataset) each inlined byte-for-byte copies of
the same JPEG window decode, single-frame JPEG decode, JSON-array decode,
float32 tensorization, and embodiment tagging. ZarrActionExpertDataset._load_obs_at
held a third copy of the single-frame JPEG decode.

This collapse extracts that shared logic into the new module
egomimic/rldb/zarr/_common.py as five pure helpers:

  decode_jpeg_single(buf)      -> CHW float image in [0,1]
  decode_jpeg_window(buffers)  -> stacked (T,C,H,W), per-frame decode
  decode_json_array(arr, fn)   -> [fn(v) for v in arr]
  tensorize_float32(data, *, skip_object_dtype)
      (the ONE predicate the two loaders genuinely differ on: _read_span skips
      object-dtype arrays, __getitem__ does not)
  tag_embodiment(data, emb)    -> stamps embodiment + metadata.robot_name

Each helper is a verbatim extraction of the pre-collapse loop body. The two
loaders now call the helpers and keep ONLY their genuine differences:
__getitem__ keeps its horizon-windowing + repeat-last padding + bounded
JPEG-fail resample loop; _read_span keeps its exact-span read + seq_len /
episode_idx metadata. Dead import simplejpeg removed from both loader files
(decode now lives in _common). No public API / signature changes: _read_span,
__getitem__, _load_obs_at keep identical signatures; the sole _read_span call
site (zarr_dataset_packed.py) and the _load_obs_at call sites are unchanged
(git grep verified: zero call-site edits needed).

PROVEN BEHAVIORAL EQUALITY (fixed fixture episodes from
/coc/flash7/paphiwetsa3/datasets/new_circle_3, a40 overcap alloc, srun):

(a) New permanent suite tests/test_loader_equality.py (6 tests) PASSES both
BEFORE the refactor (anchoring reference behavior captured from the
pre-collapse code at tag dedup-c3-pre) and AFTER:
- TestReferenceHashes: both loaders reproduce frozen sha256 reference
hashes of the decoded front_img_1 / state_agent_obj / actions tensors
captured from the pre-collapse code. Post-refactor hashes match
exactly:
front_img_1 68f20e3c2c5f72b0 | (290,3,96,96) f32
state_agent_obj 26a652f406d275d2 | (290,5) f32
actions 6f7a2b1ab531506b | (290,2) f32
- TestCrossLoaderEquality: padded full-window vs packed span reads are
torch.equal per-frame across 4 episodes (+ embodiment id identical).
- TestNormalizationPathEquality: MultiDataset.normalize applied to BOTH
loaders' outputs is torch.equal (proves the normalization path is
identical across loaders).
- TestPackMetadata: pack_collate emits the documented seq_lens /
cu_seqlens / max_seq_len / batch_size, and the concatenated per-frame
stream equals the per-span reads in order.
The suite auto-skips off-cluster (fixture-missing guard).

(b) Direct old-vs-new bit-identity proof (scratch/proof_old_vs_new.py,
gitignored): the PRE-collapse loader modules extracted from tag
dedup-c3-pre via git archive and the POST-collapse live modules are run
on the same 3 episodes through BOTH the padded __getitem__ and packed
_read_span paths. Result: 33 key comparisons across 3 episodes x 2 paths,
every output tensor torch.equal(old, new) == True (front_img_1,
state_agent_obj, actions all exact).
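
A frozen-reference-hash check of the kind used in TestReferenceHashes might look roughly like the sketch below. The byte layout, the shape header, and the 16-character truncation are all assumptions made for illustration (the commit only shows 16-hex-character digests next to a shape and dtype); reference_hash and matches_frozen are hypothetical names.

```python
import hashlib
import struct


def reference_hash(values, shape, digest_chars=16):
    """Hash a flat float sequence as little-endian float32, plus its shape.

    Assumed layout: "<shape>|<raw float32 bytes>", truncated sha256 hex digest.
    """
    payload = struct.pack(f"<{len(values)}f", *values)
    header = ",".join(str(dim) for dim in shape).encode("ascii")
    return hashlib.sha256(header + b"|" + payload).hexdigest()[:digest_chars]


def matches_frozen(values, shape, frozen):
    """Compare freshly decoded data against a digest captured earlier."""
    return reference_hash(values, shape) == frozen
```

Hashing raw bytes makes the check bit-exact, which is the point of the gate: a tolerance-based comparison would hide small decode differences.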

Regression: pytest tests/ == 128 passed / 8 failed / 4 skipped. The 8 failures
are the SAME pre-existing failures present on tag dedup-c3-pre (verified by
running the suite on a worktree of the pre tag: 122 passed / 8 failed / 4
skipped — 7 TestAlgoWiring old-HNet-signature + 1 TestInferNormFromPacked
missing-zarr-data, all in code this commit does not touch). 128 = the baseline
122 + the 6 new equality tests. ZERO new failures.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

docs: record dedup-campaign global acceptance gates (2026-06-06)

Append the dated dedup-campaign record to PORT_NOTES.md after the 3
behavior-preserving collapses (c1 f06330c pixel-DFoT outer-stage unify,
c2 32eb1fc hnet->stems encoder move, c3 c289657 zarr _common factor)
passed the global gates on alloc 3325596 (a40):

  • FULL compose sweep: 107/109 PASS (2 fails = PI viz configs, pre-existing
    MissingConfigException on parent default, untouched by collapses).
  • pytest tests/: 128 passed / 8 failed / 4 skipped (8 = same pre-existing
    TestAlgoWiring+InferNorm fails; +6 new c3 loader-equality tests). Zero new.
  • BC smoke (job 3325599) TRAIN_EXIT=0; DFoT 1-ep pixel smoke DFOT_TRAIN_EXIT=0.
  • NLL vs baseline: DFoT bit-identical (0.28878551721572876, delta=0.0);
    BC delta ~1e-5, within 1e-3 gate.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 4: dead-code purge (5 of 6 zero-ref symbols; HNetPolicy retained — it is NOT dead)

Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ +
hydra_configs/ + tests/ + scripts/ (excl __pycache__/external/scratch/logs)
IMMEDIATELY before deletion to reconfirm zero LIVE refs.

Deleted (proven zero live refs):

  • egomimic/models/diffusion_policy.py (whole file -> scratch; only self-ref)
  • egomimic/models/ddim_scheduler.py (whole file -> scratch; live DDIM is
    diffusion/sampling.ddim_sample)
  • egomimic/models/hnet/_smoke_stages.py (whole file -> scratch; only its own
    __main__ self-invoke)
  • algo/loss.py CompositeLoss + MSELoss (no _target_, no code ctor; live losses
    are HNetLoss()/DFoTLoss() built in code)
  • algo/packed_base.py HNet._ar_rollout_packed (no caller; live eval is
    forward_eval -> _teacher_forced_packed)
  • algo/packed_outer_stage.py HNetOuterStage.generate (dead; only ref was a
    docstring. step/init_step_state KEPT: they ARE the live closed-loop path,
    called at packed_base.py policy.init_step_state/step)
  • pl_utils/pl_data_utils.py RLDBModule, DualDataModuleWrapper, DataModuleWrapper
    (deprecated; live wrapper is MultiDataModuleWrapper, ref'd by 21 files)

Also dropped now-unused imports (typing.Optional in packed_outer_stage.py;
typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors
that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml).
Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed.

SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed
HNetPolicy was zero-ref, but that grep excluded scripts/. The required
grep over scripts/ found 3 LIVE importers+instantiators:
scripts/smoke_packed_training.py (documented live tooling in CLAUDE.md L538)
scripts/test_mamba_regression.py (old-vs-new equivalence regression)
scripts/test_hnet_outer_stage.py (HNetPolicy-vs-HNetOuterStage equivalence smoke)
The grep proof-gate fails for HNetPolicy, so it is retained (its .generate
method stays with the class). All other 5 targets pass cleanly.
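
The grep proof-gate, and why the search roots matter, can be approximated in a few lines. This is a sketch of the idea rather than the command actually run: live_refs is a hypothetical helper, and the excluded directory names are taken from the list above.

```python
import re
from pathlib import Path

EXCLUDED_DIRS = {"__pycache__", "external", "scratch", "logs"}


def live_refs(symbol, roots, suffixes=(".py", ".yaml")):
    """Return (path, line_no) pairs where symbol appears as a whole word."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits = []
    for root in roots:
        root = Path(root)
        for path in sorted(root.rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            # Only directories below the root count towards the exclusion.
            if EXCLUDED_DIRS & set(path.relative_to(root).parts):
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            for line_no, line in enumerate(text.splitlines(), start=1):
                if pattern.search(line):
                    hits.append((str(path), line_no))
    return hits
```

A symbol only counts as dead when the hit list is empty over every root; dropping scripts/ from roots is exactly the omission that made HNetPolicy look unreferenced.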

Proofs (run on own a40 alloc, repo's symlinked .venv):

  • import-smoke: import egomimic.algo, egomimic.models, egomimic.pl_utils -> IMPORT OK
  • retained symbols import: Loss/HNetLoss/DFoTLoss, MultiDataModuleWrapper,
    HNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False)
  • deleted symbols confirmed gone (CompositeLoss/MSELoss absent; dead model files absent)
  • py_compile all edited .py: OK
  • pytest tests/: 128 passed / 8 failed / 4 skipped — IDENTICAL to baseline.
    The 8 failures are the documented pre-existing set (7x TestAlgoWiring
    old-HNet-signature: "HNet.__init__() missing 1 required positional argument:
    'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr).
    Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 5: hoist shared embodiment-key resolution + _build_obs + log_info onto base Algo

Three blocks of code were byte-identical (verified via diff, rc=0) across
the policy algos and are collapsed into a single home on the base Algo:

  • the per-embodiment key-resolution loop (for emb in self.domains: ...
    resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via
    norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194
    == WindowedBC bc.py:699-724. Now Algo._resolve_embodiment_keys(norm_stats);
    each subclass calls it from __init__.
  • _build_obs -- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255
    (WindowedBC already inherited HNet's). Now defined once on Algo; the HNet
    and DFoT overrides are deleted (inherited).
  • log_info -- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425.
    The base Algo.log_info (formerly a NotImplementedError stub) now carries
    this exact body as the shared default; the HNet/DFoT overrides are deleted.

Net -116/+83 lines. Behaviour is preserved by construction: the moved text is
identical, so every subclass resolves the same function object from Algo
(no subclass re-introduces a private copy). bc.py drops its now-unused
get_embodiment_id import; packed_base/DFoT keep theirs (still used elsewhere).

Proofs (run on a40 alloc 3325792, fixed seeds):

  • DFoT 1-epoch pixel smoke Train/Loss = 0.28878551721572876 -- BIT-IDENTICAL
    to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter
    path per the dedup_baseline manifest).
  • BC SMOKE=1 train: c5 Train/Loss = [1.3453816, 0.1752842]. A clean pre-c5
    baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC
    smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/
    sim-eval RNG), and the c5 run lands CLOSER to the manifest values
    [1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest|
    =1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor is within the
    tree's own run-to-run replay band.
  • New permanent guard tests/test_embodiment_key_resolution_shared.py (3 tests):
    asserts HNet/WindowedBC/DFoT all resolve the SAME Algo function objects
    for _resolve_embodiment_keys / _build_obs / log_info (import-identity) and
    produce byte-equal key sets + obs selection on a fixed norm_stats fixture.
  • pytest tests/ = 131 passed / 8 failed / 4 skipped. The 8 failures are the
    documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature +
    1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW
    failures vs the 128/8/4 profile.
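
The import-identity guard relies on a plain Python fact: a method looked up on a subclass that does not override it is the same function object as on the base. A minimal sketch, with hypothetical stand-in classes and placeholder method bodies:

```python
class Algo:
    """Stand-in base; the method bodies are placeholders, not the real logic."""

    def _resolve_embodiment_keys(self, norm_stats):
        return sorted(norm_stats)

    def _build_obs(self, batch):
        return batch

    def log_info(self):
        return {}


class PackedAlgo(Algo):
    pass


class Windowed(PackedAlgo):
    pass


class PrivateCopy(Algo):
    def log_info(self):  # a re-introduced private copy the guard should catch
        return {}


SHARED = ("_resolve_embodiment_keys", "_build_obs", "log_info")


def shares_base_impl(cls, base=Algo, names=SHARED):
    """True if cls inherits every shared method from base without overriding."""
    return all(getattr(cls, name) is getattr(base, name) for name in names)
```

The check fails even when an override has an identical body, which is what makes it useful as a regression guard against copy-paste drift.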

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 6: hoist 6 identical eval uint8 helpers -> eval/core/img_utils.img_chw_to_uint8

Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] ->
(H,W,C) uint8-in-[0,255] converter under three private names:

  • _img_chw_to_uint8 : video_rollout, pixel_video_rollout,
    spatial_video_rollout, policy_action
  • _u8 : bundle_anchored
  • _to_uint8_hwc : eval/core/eval_vae_recon

Four were the 4-line clip->*255->transpose form; two (policy_action, _u8)
were the same logic as a 2-liner. All six now delegate to a single canonical
egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and
each call site renamed.

EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout
._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255).
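
Why the auto-scale variant cannot be folded in can be seen on a flat list of pixel values. This is a framework-free sketch: the transpose is omitted, the truncating integer cast is an assumption, and both function names are hypothetical.

```python
def to_uint8_canonical(pixels):
    """Clip to [0, 1], scale by 255, truncate to int."""
    return [int(min(max(p, 0.0), 1.0) * 255) for p in pixels]


def to_uint8_autoscale(pixels):
    """Heuristic variant: scale only when the data looks like [0, 1] floats."""
    scale = 255.0 if max(pixels) <= 1.5 else 1.0
    return [int(min(max(p * scale, 0.0), 255.0)) for p in pixels]
```

The two agree whenever the maximum is at most 1.5, but once a value exceeds that threshold the heuristic treats the input as already being in [0, 255] and the outputs diverge.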

PROOFS (a40 job 3325794, fixed seeds):

  • tests/test_eval_img_utils.py (NEW, permanent gate):
    test_canonical_matches_every_original_body PASSED
    np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed
    torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2).
    test_touched_eval_modules_import PASSED (import-smoke of all 7 modules)
  • full tests/: pre-c6 (stashed) = 8 failed / 131 passed / 4 skipped;
    post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures
    (7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked
    missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW
    failures. Behaviour-preserving: the 6 bodies were already identical.

Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

dedup collapse 7: delegate image-only frame sampler to action-aware superset + hoist loss-reducer skeleton onto Algo

Two structural near-duplicates removed:

  1. Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image
    only) duplicated the fixed_window/start_to_end/random_subsample branch
    logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+
    action). The action version is a strict superset: its image cropping and
    cu_seqlens are byte-identical whether or not actions are supplied (the
    action branch performs zero extra RNG draws). Hoisted the superset onto the
    parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced
    _sample_frames_packed to a thin delegate
    (_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from
    the PixelObsAction subclass (now inherited).

  2. Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment
    {emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as
    the default compute_losses and deleted DFoT's byte-identical override.
    VAE / HNet / HPT / PI keep their own overrides because they are genuine
    SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded.
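
The delegate pattern from item 1 can be sketched on plain lists. The sampler below is a hypothetical random-offset window sampler, not the real _sample_windows_packed; the only property borrowed from the commit is that the action branch reuses the image indices and consumes no extra RNG draws.

```python
import random


def sample_windows_packed(images, actions, cu_seqlens, n, rng):
    """Crop up to n consecutive frames per episode from a packed stream."""
    out_images, out_actions, new_cu = [], [], [0]
    for start, end in zip(cu_seqlens[:-1], cu_seqlens[1:]):
        length = end - start
        take = min(n, length)
        offset = rng.randint(0, length - take)  # exactly one draw per episode
        lo = start + offset
        out_images.extend(images[lo:lo + take])
        if actions is not None:  # same indices, no RNG involved
            out_actions.extend(actions[lo:lo + take])
        new_cu.append(new_cu[-1] + take)
    return out_images, (out_actions if actions is not None else None), new_cu


def sample_frames_packed(images, cu_seqlens, n, rng):
    """Image-only entry point, reduced to a thin delegate."""
    sampled, _, new_cu = sample_windows_packed(images, None, cu_seqlens, n, rng)
    return sampled, new_cu


# Two episodes of length 4 and 6 packed into one stream of 10 frames.
frames, cu = sample_frames_packed(list(range(10)), [0, 4, 10], 3, random.Random(0))
```

Since the RNG stream is identical with or without actions, the image crops and cu_seqlens match between the two entry points for any seed.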

PROOFS (a40 alloc 3325796, fixed seeds):

  • BEFORE-state probe: current image-only sampler vs current superset produce
    torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True,
    cu_equal=True for fixed_window / start_to_end / random_subsample).
  • Permanent test tests/test_c7_sampler_reducer_equality.py (3 tests, all pass):
    • image-only sampler == superset(actions=None) torch.equal across all 3
      modes and all episode-length regimes (<n, ==n, >n);
    • hoisted Algo.compute_losses torch.equal to legacy DFoT.compute_losses on
      fixed predictions;
    • guard: HNet ratio_loss reducer is NOT reproduced by the base default
      (catches a future wrong fold of the superset overrides).
  • DFoT 1-epoch pixel-policy smoke (exercises the rewritten packed sampler,
    frame_sampling=fixed_window, + the inherited reducer): Train/Loss =
    Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT-
    IDENTICAL (all 17 digits) to the pre-collapse baseline.
  • pytest tests/ = 136 passed / 8 failed / 4 skipped: the 8 failures are the
    documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x
    missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new
    permanent equality tests.

No hydra configs reference the touched methods (no config mirror needed).
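
For reference, the sum-per-embodiment skeleton that was promoted to the base class reduces to something like the sketch below. The key pattern follows the logged metric names (emb15_action_loss); the function name and the dict-of-floats input are assumptions.

```python
def compute_losses_default(predictions):
    """Sum every '<emb>_action_loss' entry into a single 'action_loss'."""
    losses = {
        key: value
        for key, value in predictions.items()
        if key.endswith("_action_loss")
    }
    losses["action_loss"] = sum(losses.values())
    return losses
```

With a single embodiment the total equals the per-embodiment term, which is consistent with the smoke run logging the same value for Train/action_loss and Train/emb15_action_loss.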

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

models/ hierarchy pass: relocate 7 loose files to role homes (cores/heads/stems/diffusion) + utils

End state: ls egomimic/models/*.py shows ONLY __init__.py. Every move is a
git mv (R-status); the 2 splits git-mv the file to its primary home first
(history-preserving) then extract the other-role classes into new files in this
same commit. All moved class bodies are byte-identical modulo import lines
(verified by class-body diff vs HEAD). Every importer + config target updated
in this commit; grep confirms zero remaining old-module dotted refs.

WHOLE-FILE MOVES (git mv):
fm_policy.py -> heads/fm_policy.py (policy output head)
denoising_policy.py -> heads/denoising_policy.py (diffusion policy head base)
denoising_nets.py -> diffusion/denoising_nets.py (legacy diffusion nets)
image_vae.py -> diffusion/image_vae.py (DFoT pixel<->latent codec)
preprocess_pi_obs.py-> utils/preprocess_pi_obs.py (data preprocessing, OUT of models/)

SPLITS (git mv to primary home + extract):
act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv)
            + cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder)
hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet)
            + cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/
              MultiheadAttention/SimpleTransformer)
            + heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/
              MultiBlockTransformerDecoder)

DEAD-CODE PRUNE (grep-proven 0 external refs):
hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper,
T5Encoder, L2Norm (also drops the heavy timm/transformers T5/ViT imports)
denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj

VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11):

  • import egomimic OK; every touched module imports OK (pi/rollout fail only on
    pre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines).
  • py_compile OK for all 10 new/moved files.
  • Hydra compose-check passes for every model config whose target moved.
  • OLD-ckpt path map appended to scratch/hierarchy_path_map.txt (gates phase folds
    it into PORT_NOTES).

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

algo/ hierarchy pass: rename packed_base.HNet -> PackedAlgoBase

In-place class rename (no file moved): the packed-sequence policy Algo base
was misleadingly named HNet while it is actually the shared base for the
packed-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree
is supplied via outer_stage and is a separate, correctly-named concept in
models/hnet). Renamed the class to PackedAlgoBase to reflect its role.

Changes (all in one commit):

  • packed_base.py: class HNet(Algo) -> class PackedAlgoBase(Algo);
    docstring updated; kept HNet = PackedAlgoBase compat alias at module
    bottom (commented) so OLD ckpts/configs whose resolved _target_ still names
    egomimic.algo.packed_base.HNet keep resolving.
  • bc.py: import + class WindowedBC(PackedAlgoBase) + base-referring
    docstring/comments updated.
  • algo/__init__.py: import-example comment updated.
  • 7 model configs (hnet_pushshapes*.yaml): _target_ ->
    egomimic.algo.packed_base.PackedAlgoBase.
  • scripts/smoke_packed_{training_e2e,validation}.py: import (as HNetAlgo) +
    Algo-method docstrings updated.
  • tests/{test_c7_sampler_reducer_equality, test_embodiment_key_resolution_shared,
    test_training_recipe}.py: import + class refs updated.

OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore /
HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy
(landmine: proven alive), algo/obs_transforms.py (landmine: designed extension
point).

Verified on A40 alloc: import egomimic OK; PackedAlgoBase + HNet-alias
identity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet
configs compose with new target; old dotted path resolves via compat alias.
Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose
all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring
failures (old constructor signature, missing outer_stage) and ZERO new
failures vs the 136/8/4 baseline.

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

utils/ hierarchy pass: junk-drawer split into pl_utils/vendored/models + dead-file purge

Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete
grep-proven dead files. One commit per the hierarchy-pass group rule; revertible
via tag pre-utils-hier.

RENAMES (git mv, R-status):
utils/timing_callback.py -> pl_utils/callbacks/timing_callback.py
utils/instantiators.py -> pl_utils/instantiators.py
utils/logging_utils.py -> pl_utils/logging_utils.py
utils/rich_utils.py -> pl_utils/rich_utils.py
utils/utils.py -> pl_utils/utils.py
utils/tensor_utils.py -> vendored/robomimic_tensor_utils.py (988-line verbatim
robomimic vendor; +vendored/README.md provenance note)

egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder:
constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*,
CameraTransforms, download_from_huggingface, STD_SCALE):
  • model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table,
    reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK
  • drawing fns merged into utils/viz_utils.py (dependency FLIPPED: viz_utils now
    OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/
    ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame,
    draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper).
All 11 moved bodies are byte-identical to originals (verified via AST diff
vs HEAD); only import lines differ.

model_utils.py placed in models/cores/ (not loose in models/) to keep the
models-group gate "models/ has only role dirs + __init__.py" intact.

DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/):
utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys,
0 refs), egomimic/__init__.pyc, egomimic/keypoints.jpeg.

Importers + config _target_ updated in this same commit (grep-exhaustive over
egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/):
callbacks/defaults.yaml wandb_profiler._target_ -> pl_utils.callbacks.timing_callback
trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils),
pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py
(model helpers), pushshapes.py / eval_act.py / robot/rollout.py /
data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped.

Added pl_utils/__init__.py (was implicit ns pkg) so find_packages discovers it.

VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched
modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed
(all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped —
ZERO new failures vs baseline; test_config_compose 25/25; parent config composes
and moved callbacks target resolves to the new class.

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com

rldb/ hierarchy pass: strays out + dead-file purge

Move (R-status rename, byte-identical modulo 2 fixed import lines):

  • egomimic/rldb/zarr/zarr_write_test.py -> egomimic/scripts/eva_process/zarr_write_test.py
    (HDF5->zarr conversion CLI, not a test; already targets eva_process.
    Fixed two stale imports as part of the move:
    egomimic.rldb.zarr.ZarrWriter -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty __init__)
    egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils (file renamed earlier))

Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/):

  • egomimic/rldb/compression_utils.py: av/jpeg video codec, no importers
  • egomimic/rldb/data_utils.py: slerp/ypr quat math, superseded by egomimic.utils.pose_utils
  • egomimic/rldb/zarr/benchmark_forward_pass.py: dead benchmark script
  • egomimic/rldb/zarr/test_zarr.py: broken, imports nonexistent egomimic.rldb.utils.S3RLDBDataset
  • egomimic/rldb/scripts/ (whole subpackage): nds_pq/str2bool/etc already live in egomimic.utils.egomimicUtils

Verified on a40 alloc: import egomimic OK, moved module imports OK,
deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed
(pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed.
No config _target_ referenced any touched path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

eval+pl_utils hierarchy pass: test_model_wrapper -> tests/, pushshapes sim-glue -> embodiment

Group: eval + pl_utils strays (hierarchy pass).

  1. git mv egomimic/pl_utils/test_model_wrapper.py -> tests/test_model_wrapper.py
    (R-status rename; 94% similar). tests/ has no __init__.py so pytest imp

ElmoPA and others added 30 commits May 20, 2026 02:21
Squash of:
- a31d8ab4 tshape sim environment
- ac272e8a Add Tsimulation viz/scripted/stats tools + physics tuning
…ding

Squash of:
- f261ff68 tsim training configs/embodiment
- b38264ab Add pushshapes_sim HPT training: keymap, viz, eval fixes
- 1e174131 Add episode-level packed dataloading
Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021
- test_hnet_nets.py (57): routing, chunk, dechunk, isotropic,
  stages (padded + packed), HNet assembly, ratio_loss, chunk_stats,
  STE, RMSNorm, AdaLN.
- test_packed_pipeline.py (9): normalize broadcast on padded vs
  packed; _iter_leaves descent; multi-frame JPEG decode; end-to-end
  packed stats collection.
- test_training_recipe.py (20): apply_optimization_params,
  init_weights height-scaled init, apply_lr_multiplier per-stage
  stamping, parameter_groups (default, with bias/norm WD=0,
  per-stage groups, AdamW-consumable). Plus algo wiring tests for
  the opt-in init_weights_range / lr_multipliers /
  use_parameter_groups / weight_decay kwargs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20.
Includes:
- algo: input_modules.py + obs_transforms.py (new modules)
- callbacks: chunker_residual_scheduler, ckpt_chunker(+dropout), random_attn_dropout
- data configs: tsimulation_400ep, tsimulation_allep + tweaks to existing tsim configs
- model configs: hnet_pushshapes_mamba_encdec, hnet_pushshapes_obs_ar + tweaks
- eval/* edits, models/hnet_nets/* edits, schedulers, uv.lock
- removes scripts/install_cuda_kernels.sh and egomimic/eval/eval_hnet_sim.py

Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).
- algo.py: add ar_inference_step_size (sub-steps per env tick at closed-loop AR); thread cfg_scale through _inference_step_ar/_inference_step_chunk; remove unused shape var
- backbone.py: force_uncond branch for CFG two-pass blending; wire attn/resid dropout into Isotropic trunk
- sampling.py: _CFGBackbone wrapper for cfg_scale > 1 sampling; schedule-matrix CFG plumbing
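
A minimal sketch of the two-pass CFG blending described above, assuming the common form `uncond + cfg_scale * (cond - uncond)`. The names `cfg_blend`, `cfg_forward`, and `toy_backbone` are hypothetical and plain lists stand in for tensors; this is not the repo's `_CFGBackbone`.

```python
from typing import Callable, List

Vector = List[float]


def cfg_blend(cond_out: Vector, uncond_out: Vector, cfg_scale: float) -> Vector:
    """Blend conditional and unconditional outputs (assumed standard CFG form)."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond_out, uncond_out)]


def cfg_forward(
    backbone: Callable[[Vector, bool], Vector], x: Vector, cfg_scale: float
) -> Vector:
    """Run the backbone once, or twice when guidance is active."""
    cond_out = backbone(x, False)
    if cfg_scale == 1.0:
        # Blending with scale 1 returns the conditional output, so skip pass two.
        return cond_out
    uncond_out = backbone(x, True)  # second pass with conditioning dropped
    return cfg_blend(cond_out, uncond_out, cfg_scale)


def toy_backbone(x: Vector, force_uncond: bool) -> Vector:
    """Hypothetical backbone: conditioning adds a constant bias of 1.0."""
    bias = 0.0 if force_uncond else 1.0
    return [v + bias for v in x]
```

With `cfg_scale=2.0` the toy conditioning bias is doubled, which is the extrapolation effect the `cfg_scale` knob is meant to control.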

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…val viz

- model/dfot_pushshapes.yaml: scale to 67M (d_model=512, T=12, num_heads=8, d_intermediate=2048); attn dropout=0.1, resid dropout=0.1, cond_dropout_prob=0.1; cfg_scale field; causal=true
- data/tsimulation_full.yaml: 750-episode circle_750 dataset, batch_size=16
- evaluator/eval_dfot_val.yaml + eval_dfot_full.yaml: cfg_scale + ar_chunk_size + ar_step_size knobs
- callbacks/ckpt_attn_dropout.yaml: composed callback (checkpoints + random_attn_dropout with values [0.1, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98])
- eval/eval_dfot_val.py: 96x96 -> 512x512 nearest-neighbor upscale, palette (gt=green, chunk=red, ar=yellow), world-coord pixel scaling, threaded cfg_scale

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sbatch_train_dfot_200ep_full750_pace.sh: minor cleanup
- sbatch_train_dfot_400ep_full750_pace.sh: 400ep H200 launch with scheduler.max_steps=18800 (fixes the 200ep cosine-not-decaying bug)
- sbatch_train_dfot_400ep_attndrop.sh: 400ep + random_attn_dropout (the 5.6x sim_coverage win)
- scripts/eval_cfg_latest.py: post-hoc DFoT eval CLI with --cfg-scale, --ar-chunk-size, --ar-step-size, --ar-inference-chunk-size, --ar-inference-step-size, --skip-val/--skip-sim
- scripts/eval_fsd_latest.py: convenience wrapper for inference_mode=chunk
- scripts/sbatch_fsd_eval.sh + sbatch_sim_sweep.sh: sbatch templates for closed-loop sim sweeps

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces the abstract bases for the upcoming refactor:

- egomimic/algo/outer_stage.py: OuterStage base. Owns an
  inner_stage field (the trunk). Subclasses implement encode
  (raw batch -> trunk-input tensor; can sample noise / record state
  on ctx) and decode (trunk-output -> per-modality prediction keys
  on batch).

- egomimic/algo/loss.py: Loss base + CompositeLoss (weighted
  sum of terms) + MSELoss (per-modality MSE between pred_key and
  target_key). Loss policy becomes data — a hydra config block —
  rather than inheritance.

No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and
the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits
on this branch.
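
The encode -> trunk -> decode contract and the "loss policy as data" idea above can be sketched as follows. This is an illustration under stated assumptions: plain Python lists stand in for tensors, the real classes are presumably torch modules, and `IdentityStage` plus every signature below are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Tuple

Batch = Dict[str, Any]
Ctx = Dict[str, Any]


class OuterStage(ABC):
    """Owns the trunk (inner_stage); subclasses supply encode and decode."""

    def __init__(self, inner_stage: Callable[[Any], Any]) -> None:
        self.inner_stage = inner_stage

    @abstractmethod
    def encode(self, batch: Batch, ctx: Ctx) -> Any:
        """Raw batch -> trunk input; may record state on ctx."""

    @abstractmethod
    def decode(self, trunk_out: Any, batch: Batch, ctx: Ctx) -> Batch:
        """Trunk output -> prediction keys written onto batch."""

    def __call__(self, batch: Batch, ctx: Ctx) -> Batch:
        return self.decode(self.inner_stage(self.encode(batch, ctx)), batch, ctx)


class Loss(ABC):
    @abstractmethod
    def __call__(self, batch: Batch, ctx: Ctx) -> float: ...


class MSELoss(Loss):
    """Mean squared error between batch[pred_key] and batch[target_key]."""

    def __init__(self, pred_key: str, target_key: str) -> None:
        self.pred_key, self.target_key = pred_key, target_key

    def __call__(self, batch: Batch, ctx: Ctx) -> float:
        pred, target = batch[self.pred_key], batch[self.target_key]
        return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class CompositeLoss(Loss):
    """Weighted sum of loss terms, configured as (weight, term) pairs."""

    def __init__(self, terms: List[Tuple[float, Loss]]) -> None:
        self.terms = terms

    def __call__(self, batch: Batch, ctx: Ctx) -> float:
        return sum(w * term(batch, ctx) for w, term in self.terms)


class IdentityStage(OuterStage):
    """Hypothetical subclass: feeds actions to the trunk, stores the output."""

    def encode(self, batch: Batch, ctx: Ctx) -> Any:
        ctx["encoded"] = True
        return batch["actions"]

    def decode(self, trunk_out: Any, batch: Batch, ctx: Ctx) -> Batch:
        batch["pred_action"] = trunk_out
        return batch
```

Because the loss is an object rather than a method on the algorithm, swapping or reweighting terms becomes a config change instead of a subclass.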
Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss
classes:

- q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise,
  alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond
  the backbone consumes. No backbone call inside.

- compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the
  dict from q_sample plus the backbone's v_pred and computes the
  SNR-weighted epsilon-MSE.

- forward(backbone, x, t, cond): kept as a back-compat wrapper that
  calls q_sample, runs the backbone, then compute_loss. Existing
  callers (DFoT.forward_training) are unaffected.

Bitwise verified equivalent to the prior single-method path via
the included /tmp/test_diffusion_split.py smoke (loss + x_pred match
exactly, same random seed).

discrete_diffusion.py is unsplit for now; current configs use
continuous.
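
The three-way split above can be sketched in plain Python. Everything specific here is an assumption made for illustration: the variance-preserving cosine schedule, the v-parameterisation `v = alpha * eps - sigma * x`, the constant default weight, and the argument order. The repo's `ContinuousDiffusion` may differ on each point.

```python
import math
import random
from typing import Any, Callable, Dict, List


def q_sample(
    x: List[float], t: float, rng: random.Random, precond_scale: float = 1.0
) -> Dict[str, Any]:
    """Forward-noising step for 0 < t < 1; no backbone call inside."""
    alpha_t = math.cos(0.5 * math.pi * t)  # assumed cosine schedule
    sigma_t = math.sin(0.5 * math.pi * t)
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    x_t = [alpha_t * xi + sigma_t * ni for xi, ni in zip(x, noise)]
    logsnr = 2.0 * math.log(alpha_t / sigma_t)
    return {
        "x_t": x_t, "noise": noise, "alpha_t": alpha_t, "sigma_t": sigma_t,
        "logsnr": logsnr, "time_cond": precond_scale * logsnr,
    }


def compute_loss(
    v_pred: List[float],
    q_state: Dict[str, Any],
    snr_weight: Callable[[float], float] = lambda logsnr: 1.0,
) -> List[float]:
    """Per-token weighted epsilon-MSE recovered from a v-prediction."""
    a, s = q_state["alpha_t"], q_state["sigma_t"]
    # With alpha^2 + sigma^2 = 1, eps = sigma * x_t + alpha * v.
    eps_pred = [s * xt + a * v for xt, v in zip(q_state["x_t"], v_pred)]
    w = snr_weight(q_state["logsnr"])  # placeholder for the real SNR weighting
    return [w * (e - n) ** 2 for e, n in zip(eps_pred, q_state["noise"])]


def forward(backbone, x, t, cond, rng: random.Random) -> List[float]:
    """Back-compat wrapper: q_sample, then backbone, then compute_loss."""
    q_state = q_sample(x, t, rng)
    v_pred = backbone(q_state["x_t"], q_state["time_cond"], cond)
    return compute_loss(v_pred, q_state)
```

Keeping `q_sample` free of any backbone call is what lets an outer stage own the noising step while a separate loss object consumes the stored `q_state`.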
…verified)

- egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass.
  encode: encode obs to per-token cond, sample noise levels, run
  diffusion.q_sample, store q_state + external_cond on ctx, return
  noisy x_t. decode: write batch[pred_v] for the loss to read.
  forward override threads cu_seqlens/max_seqlen (packed mode) and
  time_cond into the backbone call.

- egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and
  ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE),
  reduces to scalar.

Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss
through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to
0.0e+00 difference at fixed seed. Real CondEncoderModule +
DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math
path.

Algo class (DFoT.forward_training) is NOT yet wired to use these —
that comes in the next commit on this branch. Inference paths
(closed-loop AR sample_step, chunk plan-execute) also deferred.
Algo class:
- __init__ now takes outer_stage: DFoTOuterStage and optional
  loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None).
- Removes legacy cond_encoder, backbone, diffusion_type,
  diffusion_kwargs, cond_output_key args — they now live on the
  outer_stage subblock.
- Adds @property accessors for cond_encoder, backbone,
  diffusion, outer_stage, loss so existing inference paths
  (_inference_step_ar, _inference_step_chunk, _sample_chunk,
  forward_eval) keep working unchanged via property forwarding.
- forward_training shrinks ~40 LOC -> ~20 LOC: build ctx, call
  outer_stage(batch, ctx), call loss(batch, ctx). No more inline
  diffusion math; no more cu_seqlens threading at this level.
- Adds ar_inference_step_size knob.
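
The property-forwarding pattern in the list above can be illustrated like this. `OuterStageStub` and `DFoTSketch` are hypothetical stand-ins; the mapping of `backbone` onto `outer_stage.inner_stage` follows the state_dict key move noted in this commit message.

```python
class OuterStageStub:
    """Hypothetical stand-in for DFoTOuterStage: it owns the submodules."""

    def __init__(self, cond_encoder, inner_stage, diffusion):
        self.cond_encoder = cond_encoder
        self.inner_stage = inner_stage  # the trunk, formerly `backbone`
        self.diffusion = diffusion


class DFoTSketch:
    """Algo-level class that forwards legacy attribute names."""

    def __init__(self, outer_stage: OuterStageStub) -> None:
        self._outer_stage = outer_stage

    @property
    def outer_stage(self) -> OuterStageStub:
        return self._outer_stage

    @property
    def cond_encoder(self):
        return self._outer_stage.cond_encoder

    @property
    def backbone(self):
        # Legacy name; the module now lives at outer_stage.inner_stage.
        return self._outer_stage.inner_stage

    @property
    def diffusion(self):
        return self._outer_stage.diffusion
```

Read-only properties keep call sites such as `self.backbone(...)` working while leaving a single owner for each submodule.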

dfot_pushshapes.yaml:
- New outer_stage: block wraps cond_encoder + backbone + diffusion.
- Removes top-level diffusion_type / diffusion_kwargs; the
  diffusion module is now its own _target_ inside outer_stage.
- loss: omitted (uses default DFoTLoss(outer_stage.diffusion)).

End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the
config instantiates via hydra and forward_training emits a finite
scalar loss. Bitwise loss-equivalence was already shown in the prior
commit (test_dfot_outer_stage.py).

Old checkpoints WILL NOT load — state_dict keys moved from
nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.*
This is intentional per the agreed clean-break refactor.
Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT
from dfot_pushshapes.yaml, runs inference_step in both ar and chunk
modes, asserts action is (action_dim,) and finite. Verifies the
@property accessors (self.backbone, self.cond_encoder, self.diffusion)
forward correctly to outer_stage submodules so the closed-loop AR
and chunk-mode inference paths keep working after the refactor.

Passing on compute node 8997316:
  [ar] action @ t=0: [0.30 0.47]
  [ar] action @ t=1: [0.44 1.06]
  [chunk] action @ t=0: [-0.78 -1.89]
…ied)

- egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits
  from OuterStage with inner_stage = HNetCore (stage tree). Owns
  cond_encoder, input_modules (summed per-token contributions),
  action_out head. Three forward paths inherited from the old
  HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed),
  generate (offline AR), init_step_state + step (online single-tick).

- egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] +
  batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux.

- scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates
  the existing hnet_pushshapes.yaml subcomponents, wraps the SAME
  instances in HNetPolicy and HNetOuterStage, ties their action_out
  heads, runs identical padded forward. Verified bitwise (max diff
  0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests
  the step inference path (shape + finite).

The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet
removed. Algo-class refactor and yaml updates come in follow-up
commits on this branch.

Old checkpoints will NOT load — state_dict keys move from policy.*
to outer_stage.* (or wherever the algo class places it). Per clean-
break policy.
Algo class:
- HNet.__init__ now takes outer_stage: HNetOuterStage + loss: Optional[Loss]
  instead of cond_encoder + hnet + action_dim + action_horizon +
  d_model + action_head_type + input_modules. action_horizon read from
  outer_stage.
- Loss defaults to HNetLoss() if not provided.
- self.nets is now ModuleDict({outer_stage, loss}); old keys
  (self.nets[policy], self.nets[cond_encoder], ...) are exposed via
  @property forwarding to outer_stage submodules so legacy callsites
  in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step
  inference keep working.
- forward_training builds (batch, ctx), calls outer_stage(batch, ctx)
  + loss(batch, ctx), unpacks per-term breakdown (ctx.action_loss,
  ctx.ratio_loss) into the predictions dict for logging.

HNetLoss:
- Computes action MSE + ratio_loss_from_aux(ctx.aux) and stashes the
  per-term split on ctx for the algo to log separately.

HNetOuterStage:
- Adds back-compat bridge methods forward_padded(actions, obs) and
  forward_packed(actions, obs, cu, msl) returning (pred, aux). The
  forward_eval / _teacher_forced_packed paths use these; .generate /
  .step / .init_step_state already had matching signatures.

hnet_pushshapes.yaml:
- New outer_stage: block wraps cond_encoder + hnet stage tree +
  input_modules + action_head_type. Top-level keeps training-recipe
  knobs (init_weights_range, lr_multipliers, ...) and embodiment
  wiring.
- loss: block omitted (defaults to HNetLoss()).

scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke.
Verified passing on H200 alloc 8989249 — produces action_loss 1.51 +
ratio_loss 0.032 + chunker stats for a 2-episode packed batch
(T=12+20). Padded mode hits a pre-existing torch SDPA error
(Explicit attn_mask should not be set when is_causal=True) in
train mode — this is in the trunk code, not introduced by the
refactor (the production training uses packed mode and never hits
the padded-train path).

Old HNetPolicy class is still in algo/hnet.py for now (no longer
used by HNet algo); will be removed in a cleanup commit once all
stage-based + flat yamls are migrated.
…hema

Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied
to the variant configs. Each yaml moves cond_encoder + hnet stage tree
+ (optional) input_modules + action_head_type under outer_stage; keeps
training-recipe knobs + embodiment wiring at the top level.

Configs migrated:
- hnet_pushshapes_big.yaml             (d_model 256, 21M params)
- hnet_pushshapes_crossattn.yaml       (cond_mode: cross_attn)
- hnet_pushshapes_mamba_encdec.yaml    (M8 encoder/decoder)
- hnet_pushshapes_obs_ar.yaml          (ObsToken input module)
- hnet_pushshapes_obs_ar_large.yaml    (ObsToken + d_model 256 + T8)
- hnet_pushshapes_recipe.yaml          (H-Net paper recipe)

scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified
on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants)
instantiate from hydra config and produce sensible param counts
(5.5M baseline up to 42.8M obs_ar_large).
- egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class.
  Structurally a rename of FlatFusedPolicy with OuterStage inheritance
  plus an OuterStage forward(batch, ctx) dispatcher delegating to the
  existing forward_padded / forward_packed. Legacy generate / step /
  init_step_state preserved verbatim. encode / decode raise
  NotImplementedError since the interleaved 2T-token flow does not
  cleanly split along encode -> trunk -> decode.

- egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass
  of HNet, kept as a separate _target_ for the existing flat yamls.
  All flat-fused behavior moved into FlatFusedOuterStage; HNet.__init__
  already tolerates outer_stage.inner_stage=None.

- 3 flat yamls migrated to outer_stage schema:
  hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml,
  hnet_pushshapes_fused_pusher.yaml.

- scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net
  configs. Verified on H200 alloc 8989249: 10/10 instantiate
  successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param
  counts sensible.

Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of
unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.
ElmoPA and others added 28 commits June 7, 2026 04:39
…y retained — it is NOT dead)

Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ +
hydra_configs/ + tests/ + scripts/ (excl __pycache__/external/scratch/logs)
IMMEDIATELY before deletion to reconfirm zero LIVE refs.
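
A grep-style proof gate of this kind could be scripted roughly as below. This is a hedged sketch, not the command that was actually run: `find_refs`, the suffix list, and the whole-word matching rule are assumptions.

```python
import re
from pathlib import Path
from typing import Iterable, List

# Directory names skipped during the scan, mirroring the exclusions above.
EXCLUDED_DIRS = {"__pycache__", "external", "scratch", "logs"}


def find_refs(
    symbol: str, roots: Iterable[str], suffixes=(".py", ".yaml")
) -> List[Path]:
    """Return files under roots that mention symbol as a whole word."""
    pattern = re.compile(rf"\b{re.escape(symbol)}\b")
    hits: List[Path] = []
    for root in roots:
        root_path = Path(root)
        for path in sorted(root_path.rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            # Test only the path relative to the root, so a parent directory
            # that happens to be called "scratch" does not hide everything.
            if EXCLUDED_DIRS & set(path.relative_to(root_path).parts):
                continue
            if pattern.search(path.read_text(errors="ignore")):
                hits.append(path)
    return hits
```

A symbol would pass the gate only when `find_refs(symbol, ["egomimic", "hydra_configs", "tests", "scripts"])` returns nothing beyond its own definition site.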

Deleted (proven zero live refs):
  - egomimic/models/diffusion_policy.py        (whole file -> scratch; only self-ref)
  - egomimic/models/ddim_scheduler.py          (whole file -> scratch; live DDIM is
                                                diffusion/sampling.ddim_sample)
  - egomimic/models/hnet/_smoke_stages.py      (whole file -> scratch; only its own
                                                __main__ self-invoke)
  - algo/loss.py CompositeLoss + MSELoss       (no _target_, no code ctor; live losses
                                                are HNetLoss()/DFoTLoss() built in code)
  - algo/packed_base.py HNet._ar_rollout_packed (no caller; live eval is
                                                 forward_eval -> _teacher_forced_packed)
  - algo/packed_outer_stage.py HNetOuterStage.generate (dead; only ref was a docstring.
                                                 step/init_step_state KEPT — they ARE the
                                                 live closed-loop path, called at
                                                 packed_base.py policy.init_step_state/step)
  - pl_utils/pl_data_utils.py RLDBModule, DualDataModuleWrapper, DataModuleWrapper
                                                (deprecated; live wrapper is
                                                 MultiDataModuleWrapper, ref'd by 21 files)

Also dropped now-unused imports (typing.Optional in packed_outer_stage.py;
typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors
that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml).
Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed.

SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed
HNetPolicy was zero-ref, but that grep excluded scripts/. The required
grep over scripts/ found 3 LIVE importers+instantiators:
  scripts/smoke_packed_training.py   (documented live tooling in CLAUDE.md L538)
  scripts/test_mamba_regression.py   (old-vs-new equivalence regression)
  scripts/test_hnet_outer_stage.py   (HNetPolicy-vs-HNetOuterStage equivalence smoke)
The grep proof-gate fails for HNetPolicy, so it is retained (its .generate
method stays with the class). All other 5 targets pass cleanly.

Proofs (run on own a40 alloc, repo's symlinked .venv):
  - import-smoke: `import egomimic.algo, egomimic.models, egomimic.pl_utils` -> IMPORT OK
  - retained symbols import: Loss/HNetLoss/DFoTLoss, MultiDataModuleWrapper,
    HNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False)
  - deleted symbols confirmed gone (CompositeLoss/MSELoss absent; dead model files absent)
  - py_compile all edited .py: OK
  - pytest tests/: 128 passed / 8 failed / 4 skipped — IDENTICAL to baseline.
    The 8 failures are the documented pre-existing set (7x TestAlgoWiring
    old-HNet-signature: "HNet.__init__() missing 1 required positional argument:
    'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr).
    Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… + log_info onto base Algo

Three blocks of code were byte-identical (verified via `diff`, rc=0) across
the policy algos and are collapsed into a single home on the base `Algo`:

  * the per-embodiment key-resolution loop (`for emb in self.domains: ...`
    resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via
    norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194
    == WindowedBC bc.py:699-724. Now `Algo._resolve_embodiment_keys(norm_stats)`;
    each subclass calls it from __init__.
  * `_build_obs` -- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255
    (WindowedBC already inherited HNet's). Now defined once on `Algo`; the HNet
    and DFoT overrides are deleted (inherited).
  * `log_info` -- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425.
    The base `Algo.log_info` (formerly a NotImplementedError stub) now carries
    this exact body as the shared default; the HNet/DFoT overrides are deleted.

Net -116/+83 lines. Behaviour is preserved by construction: the moved text is
identical, so every subclass resolves the same function object from `Algo`
(no subclass re-introduces a private copy). bc.py drops its now-unused
`get_embodiment_id` import; packed_base/DFoT keep theirs (still used elsewhere).

Proofs (run on a40 alloc 3325792, fixed seeds):
  * DFoT 1-epoch pixel smoke Train/Loss = 0.28878551721572876 -- BIT-IDENTICAL
    to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter
    path per the dedup_baseline manifest).
  * BC SMOKE=1 train: c5 Train/Loss = [1.3453816, 0.1752842]. A clean pre-c5
    baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC
    smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/
    sim-eval RNG), and the c5 run lands CLOSER to the manifest values
    [1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest|
    =1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor is within the
    tree's own deterministic-replay band.
  * New permanent guard tests/test_embodiment_key_resolution_shared.py (3 tests):
    asserts HNet/WindowedBC/DFoT all resolve the SAME `Algo` function objects
    for _resolve_embodiment_keys / _build_obs / log_info (import-identity) and
    produce byte-equal key sets + obs selection on a fixed norm_stats fixture.
  * pytest tests/ = 131 passed / 8 failed / 4 skipped. The 8 failures are the
    documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature +
    1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW
    failures vs the 128/8/4 profile.
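
The import-identity idea behind the guard test can be sketched as follows. The class names and the `_build_obs` body are placeholders; only the identity check itself is the point.

```python
class Algo:
    """Stand-in base that carries the shared helper."""

    def _build_obs(self, batch: dict) -> dict:
        # Placeholder body: keep only observation keys.
        return {k: v for k, v in batch.items() if k.startswith("obs")}


class PackedSketch(Algo):
    pass  # inherits _build_obs


class DiffusionSketch(Algo):
    pass  # inherits _build_obs


class PrivateCopy(Algo):
    def _build_obs(self, batch: dict) -> dict:  # re-introduced private copy
        return dict(batch)


def resolves_to_base(cls: type, base: type, name: str) -> bool:
    """True when cls gets `name` from base rather than defining its own."""
    return getattr(cls, name) is getattr(base, name)
```

In Python 3 a method looked up on a class is the plain function object, so `is` distinguishes an inherited helper from a textually identical copy.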

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mg_utils.img_chw_to_uint8

Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] ->
(H,W,C) uint8-in-[0,255] converter under three private names:
  - _img_chw_to_uint8 : video_rollout, pixel_video_rollout,
                        spatial_video_rollout, policy_action
  - _u8               : bundle_anchored
  - _to_uint8_hwc     : eval/core/eval_vae_recon
Four were the 4-line clip->*255->transpose form; two (policy_action, _u8)
were the same logic as a 2-liner. All six now delegate to a single canonical
egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and
each call site renamed.
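
For reference, the clip -> *255 -> transpose form described above looks roughly like this in NumPy. This is a sketch only: the float32 cast, the truncating `astype`, and array-like input are assumptions, and the canonical helper in `egomimic.eval.core.img_utils` may accept tensors and differ in detail.

```python
import numpy as np


def img_chw_to_uint8(img) -> np.ndarray:
    """(C, H, W) float in [0, 1] -> (H, W, C) uint8 in [0, 255]."""
    arr = np.asarray(img, dtype=np.float32)
    arr = np.clip(arr, 0.0, 1.0)  # out-of-range values saturate
    arr = arr * 255.0
    # Move channels last, then truncate to uint8 (assumed; not rounded).
    return np.transpose(arr, (1, 2, 0)).astype(np.uint8)
```

Clipping before scaling matters: without it, values slightly above 1.0 would wrap around when cast to uint8.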

EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout
._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255).

PROOFS (a40 job 3325794, fixed seeds):
  - tests/test_eval_img_utils.py (NEW, permanent gate):
      test_canonical_matches_every_original_body PASSED
        np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed
        torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2).
      test_touched_eval_modules_import PASSED  (import-smoke of all 7 modules)
  - full tests/: pre-c6 (stashed) = 8 failed / 131 passed / 4 skipped;
    post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures
    (7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked
    missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW
    failures. Behaviour-preserving: the 6 bodies were already identical.

Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…uperset + hoist loss-reducer skeleton onto Algo

Two structural near-duplicates removed:

1. Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image
   only) duplicated the fixed_window/start_to_end/random_subsample branch
   logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+
   action). The action version is a strict superset: its image cropping and
   cu_seqlens are byte-identical whether or not actions are supplied (the
   action branch performs zero extra RNG draws). Hoisted the superset onto the
   parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced
   _sample_frames_packed to a thin delegate
   (_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from
   the PixelObsAction subclass (now inherited).

2. Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment
   {emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as
   the default compute_losses and deleted DFoT's byte-identical override.
   VAE / HNet / HPT / PI keep their own overrides because they are genuine
   SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded.
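
The sum-per-embodiment skeleton can be sketched as below. It assumes a flat dict of scalar losses keyed `{emb}_action_loss`; the real method presumably operates on tensors and may return additional keys.

```python
from typing import Dict


def compute_losses(
    predictions: Dict[str, float], suffix: str = "_action_loss"
) -> Dict[str, float]:
    """Sum every per-embodiment action loss into a single `action_loss`."""
    per_emb = {k: v for k, v in predictions.items() if k.endswith(suffix)}
    losses = dict(per_emb)
    # Plain sum with no per-domain division; the superset overrides mentioned
    # above (for example domain-count division) are deliberately left out.
    losses["action_loss"] = sum(per_emb.values())
    return losses
```

A subclass that needs extra terms, such as a ratio loss, would override this method rather than extend the base default.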

PROOFS (a40 alloc 3325796, fixed seeds):
- BEFORE-state probe: current image-only sampler vs current superset produce
  torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True,
  cu_equal=True for fixed_window / start_to_end / random_subsample).
- Permanent test tests/test_c7_sampler_reducer_equality.py (3 tests, all pass):
  * image-only sampler == superset(actions=None) torch.equal across all 3
    modes and all episode-length regimes (<n, ==n, >n);
  * hoisted Algo.compute_losses torch.equal to legacy DFoT.compute_losses on
    fixed predictions;
  * guard: HNet ratio_loss reducer is NOT reproduced by the base default
    (catches a future wrong fold of the superset overrides).
- DFoT 1-epoch pixel-policy smoke (exercises the rewritten packed sampler,
  frame_sampling=fixed_window, + the inherited reducer): Train/Loss =
  Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT-
  IDENTICAL (all 17 digits) to the pre-collapse baseline.
- pytest tests/ = 136 passed / 8 failed / 4 skipped: the 8 failures are the
  documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x
  missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new
  permanent equality tests.

No hydra configs reference the touched methods (no config mirror needed).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eads/stems/diffusion) + utils

End state: `ls egomimic/models/*.py` shows ONLY __init__.py. Every move is a
git mv (R-status); the 2 splits git-mv the file to its primary home first
(history-preserving) then extract the other-role classes into new files in this
same commit. All moved class bodies are byte-identical modulo import lines
(verified by class-body diff vs HEAD). Every importer + config _target_ updated
in this commit; grep confirms zero remaining old-module dotted refs.

WHOLE-FILE MOVES (git mv):
  fm_policy.py        -> heads/fm_policy.py           (policy output head)
  denoising_policy.py -> heads/denoising_policy.py    (diffusion policy head base)
  denoising_nets.py   -> diffusion/denoising_nets.py  (legacy diffusion nets)
  image_vae.py        -> diffusion/image_vae.py       (DFoT pixel<->latent codec)
  preprocess_pi_obs.py-> utils/preprocess_pi_obs.py   (data preprocessing, OUT of models/)

SPLITS (git mv to primary home + extract):
  act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv)
              + cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder)
  hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet)
              + cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/
                                          MultiheadAttention/SimpleTransformer)
              + heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/
                                    MultiBlockTransformerDecoder)

DEAD-CODE PRUNE (grep-proven 0 external refs):
  hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper,
            T5Encoder, L2Norm  (also drops the heavy timm/transformers T5/ViT imports)
  denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj

VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11):
  - `import egomimic` OK; every touched module imports OK (pi/rollout fail only on
    pre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines).
  - py_compile OK for all 10 new/moved files.
  - Hydra compose-check passes for every model config whose _target_ moved.
  - OLD-ckpt path map appended to scratch/hierarchy_path_map.txt (gates phase folds
    it into PORT_NOTES).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In-place class rename (no file moved): the packed-sequence policy Algo base
was misleadingly named ``HNet`` while it is actually the shared base for the
packed-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree
is supplied via ``outer_stage`` and is a separate, correctly-named concept in
models/hnet). Renamed the class to ``PackedAlgoBase`` to reflect its role.

Changes (all in one commit):
- packed_base.py: ``class HNet(Algo)`` -> ``class PackedAlgoBase(Algo)``;
  docstring updated; kept ``HNet = PackedAlgoBase`` compat alias at module
  bottom (commented) so OLD ckpts/configs whose resolved _target_ still names
  ``egomimic.algo.packed_base.HNet`` keep resolving.
- bc.py: import + ``class WindowedBC(PackedAlgoBase)`` + base-referring
  docstring/comments updated.
- algo/__init__.py: import-example comment updated.
- 7 model configs (hnet_pushshapes*.yaml): _target_ ->
  egomimic.algo.packed_base.PackedAlgoBase.
- scripts/smoke_packed_{training_e2e,validation}.py: import (as HNetAlgo) +
  Algo-method docstrings updated.
- tests/{test_c7_sampler_reducer_equality, test_embodiment_key_resolution_shared,
  test_training_recipe}.py: import + class refs updated.
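
The compat alias works because a module-level name bound to the same class object resolves identically under either dotted path. A minimal sketch, with a plain dict (`TARGETS`, hypothetical) standing in for hydra's `_target_` resolution:

```python
class Algo:
    """Stand-in for the base Algo class."""


class PackedAlgoBase(Algo):
    """Shared base for the packed-sequence algos (formerly named HNet)."""


# Compat alias at module bottom: old configs and checkpoints that still name
# `HNet` resolve to the renamed class.
HNet = PackedAlgoBase


class WindowedBC(PackedAlgoBase):
    """Subclass updated to the new base name."""


# Hypothetical registry mapping dotted paths to the objects they resolve to.
TARGETS = {
    "egomimic.algo.packed_base.PackedAlgoBase": PackedAlgoBase,
    "egomimic.algo.packed_base.HNet": HNet,
}
```

Since the alias is a second name for one class, `isinstance` and `issubclass` checks written against either name keep passing.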

OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore /
HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy
(landmine: proven alive), algo/obs_transforms.py (landmine: designed extension
point).

Verified on A40 alloc: `import egomimic` OK; PackedAlgoBase + HNet-alias
identity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet
configs compose with new _target_; old dotted path resolves via compat alias.
Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose
all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring
failures (old constructor signature, missing ``outer_stage``) and ZERO new
failures vs the 136/8/4 baseline.

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s + dead-file purge

Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete
grep-proven dead files. One commit per the hierarchy-pass group rule; revertible
via tag pre-utils-hier.

RENAMES (git mv, R-status):
  utils/timing_callback.py   -> pl_utils/callbacks/timing_callback.py
  utils/instantiators.py     -> pl_utils/instantiators.py
  utils/logging_utils.py     -> pl_utils/logging_utils.py
  utils/rich_utils.py        -> pl_utils/rich_utils.py
  utils/utils.py             -> pl_utils/utils.py
  utils/tensor_utils.py      -> vendored/robomimic_tensor_utils.py  (988-line verbatim
                                robomimic vendor; +vendored/README.md provenance note)

egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder —
constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*,
CameraTransforms, download_from_huggingface, STD_SCALE):
  model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table,
    reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK
  drawing fns merged into utils/viz_utils.py (dependency FLIPPED — viz_utils now
    OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/
    ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame,
    draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper).
  All 11 moved bodies are byte-identical to originals (verified via AST diff
  vs HEAD); only import lines differ.

model_utils.py placed in models/cores/ (not loose in models/) to keep the
models-group gate "models/ has only role dirs + __init__.py" intact.

DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/):
  utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys,
  0 refs), egomimic/__init__.pyc, egomimic/keypoints.jpeg.

Importers + config _target_ updated in this same commit (grep-exhaustive over
egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/):
  callbacks/defaults.yaml wandb_profiler._target_ -> pl_utils.callbacks.timing_callback
  trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils),
  pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py
  (model helpers), pushshapes.py / eval_act.py / robot/rollout.py /
  data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped.

Added pl_utils/__init__.py (was implicit ns pkg) so find_packages discovers it.

VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched
modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed
(all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped —
ZERO new failures vs baseline; test_config_compose 25/25; parent config composes
and moved callbacks _target_ resolves to the new class.

Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move (R-status rename, byte-identical modulo 2 fixed import lines):
- egomimic/rldb/zarr/zarr_write_test.py -> egomimic/scripts/eva_process/zarr_write_test.py
  (HDF5->zarr conversion CLI, not a test; already targets eva_process.
   Fixed two stale imports as part of the move:
     egomimic.rldb.zarr.ZarrWriter           -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty __init__)
     egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils    (file renamed earlier))

Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/):
- egomimic/rldb/compression_utils.py          av/jpeg video codec, no importers
- egomimic/rldb/data_utils.py                 slerp/ypr quat math, superseded by egomimic.utils.pose_utils
- egomimic/rldb/zarr/benchmark_forward_pass.py dead benchmark script
- egomimic/rldb/zarr/test_zarr.py             broken: imports nonexistent egomimic.rldb.utils.S3RLDBDataset
- egomimic/rldb/scripts/ (whole subpackage)   nds_pq/str2bool/etc already live in egomimic.utils.egomimicUtils

Verified on a40 alloc: `import egomimic` OK, moved module imports OK,
deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed
(pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed.
No config _target_ referenced any touched path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s sim-glue -> embodiment

Group: eval + pl_utils strays (hierarchy pass).

1) git mv egomimic/pl_utils/test_model_wrapper.py -> tests/test_model_wrapper.py
   (R-status rename; 94% similar). tests/ has no __init__.py so pytest imports
   it as top-level module test_model_wrapper; updated the in-test _target_
   (egomimic.pl_utils.test_model_wrapper.DummyAlgo -> test_model_wrapper.DummyAlgo)
   and the two __module__ assertions to match. Behaviour identical: 2 passed +
   1 pre-existing fail (lr_scheduler dict-wrap) both before and after the move.

2) SPLIT: extracted the PushShapes sim-eval glue (_env_to_zarr_pushshapes,
   _ENV_TO_ZARR, _state_to_init) out of egomimic/eval/core/eval_sim.py (the
   algo-agnostic evaluator, which stays as the primary home of the eval classes)
   into a NEW embodiment helper module egomimic/rldb/embodiment/pushshapes_sim.py.
   The 3 symbols are byte-identical (verified via AST extraction). eval_sim.py
   re-imports them so the legacy names (incl. the facade path
   egomimic.eval.eval_sim._env_to_zarr_pushshapes / ._state_to_init used by
   scripts/verify_*) keep resolving to the SAME objects.
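
The "resolve to the SAME objects" contract of the split can be sketched in isolation. The two modules below are in-memory stand-ins built with `types.ModuleType` (hypothetical names `demo_pushshapes_sim` / `demo_eval_sim`), and the function body is a placeholder; only the symbol name `_state_to_init` comes from this commit.

```python
import sys
import types

# Stand-in for the new canonical home (hypothetical in-memory module).
canonical = types.ModuleType("demo_pushshapes_sim")


def _state_to_init(state):
    """Placeholder body; the real helper lives in pushshapes_sim.py."""
    return {"init_state": list(state)}


canonical._state_to_init = _state_to_init
sys.modules[canonical.__name__] = canonical

# Stand-in for the legacy module: it re-imports the symbol instead of
# redefining it, so both dotted paths point at one function object.
legacy = types.ModuleType("demo_eval_sim")
exec("from demo_pushshapes_sim import _state_to_init", legacy.__dict__)
sys.modules[legacy.__name__] = legacy

same_object = legacy._state_to_init is canonical._state_to_init
```

An identity check (`is`) is stricter than comparing outputs: it also guarantees that monkeypatches or caches attached to the function are shared between the old and new import paths.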

Importers repointed to the canonical home in this same commit:
  - egomimic/eval/dfot/eval_dfot_self_rollout.py  (_state_to_init)
  - scripts/verify_normalization.py / verify_normalization2.py / verify_image.py
    (_env_to_zarr_pushshapes)

Verification (a40 alloc): import egomimic OK; pushshapes_sim / eval_sim re-export /
eval_dfot_self_rollout / legacy facade eval_sim all import and resolve to identical
objects; all 4 eval_sim class _target_s resolve via hydra get_class; verify_* scripts
py_compile clean; pytest tests/test_config_compose.py 25/25; pytest tests/ = 138 passed /
9 failed / 4 skipped (baseline 136/8/4 + the now-collected test_model_wrapper 2-pass
+ 1-pre-existing-fail; zero NEW failures, the 8 baseline failures, 7 TestAlgoWiring +
1 missing-zarr, unchanged).

Per-folder .py counts: pl_utils 8->7, tests 12->13, rldb/embodiment 4->5,
eval/core 8->8 (SPLIT: eval_sim.py stays; glue fragment extracted to embodiment).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lib + ops homes, dead launchers

Folder-group of the pact-2 hierarchy pass. Every move is a git-mv R-rename
(history preserved); moved code is byte-identical modulo import lines.

Moves (all git mv, R-status):
- 6 regression smokes scripts/test_*.py -> tests/regression/ . Each got a
  module-level pytest.skip(allow_module_level=True) guard (the +9 lines) so
  pytest collection stays clean -- they hardcode an EgoVerse-clone-3 path and
  load configs removed from this repo + need GPU/checkpoints; run manually.
- 7 packed/composite/teacher smokes scripts/smoke_*.py -> tests/regression/
  (byte-identical, no test_ prefix so pytest never collects them).
- scripts/smoke_sim_eval.py -> egomimic/eval/core/ckpt_loading.py: it is a
  LIBRARY (load_algo_from_ckpt + _MockTrainer, the CLI main() rides along).
  Its sibling import rebased to egomimic.eval.core.eval_sim (off the legacy
  facade). All 5 importers updated in this commit:
    scripts/{tf_dec_overlay,tf_decoupled_eval,tf_chunk_eval,
             eval_cfg_latest,eval_fsd_latest}.py
- root setup_nvm.sh, run_eva_docker.sh, pull_models.sh -> scripts/ops/ .

Deletes (with proof):
- scripts/sbatch_train_hnet_fused_{50,80}ep_cosine.sh -- both pass
  model=hnet_pushshapes_fused, a config removed from hydra_configs/ (survives
  only quarantined under scratch/flat_fused_quarantine/). Dead launchers.
- scripts/__pycache__/ .

Docs touched for accuracy (not load-bearing): egomimic/robot/eva/eva.md
run_eva_docker.sh path; CLAUDE.md smoke-script paths + section heading.

Verify (on a40 alloc 3325804): import egomimic OK; new ckpt_loading path
exposes load_algo_from_ckpt + _MockTrainer; all 5 importers parse; pytest
tests/ --collect-only stays 150 collected, 0 errors (regression dir =
0 collected / 6 skipped); full tests/ = 138 passed / 9 failed / 10 skipped,
identical pass+fail to the pre-change HEAD (the 1 test_model_wrapper failure
+ 7 TestAlgoWiring + 1 missing-zarr all pre-exist), +6 = the guarded smokes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The hierarchy pass (10b2398) moved test_model_wrapper.py from
egomimic/pl_utils/ into tests/, newly subjecting it to `pytest tests/`
collection. One assertion was stale: it asserted optimizers["lr_scheduler"]
*is* a StepLR, but ModelWrapper.configure_optimizers() returns the Lightning
scheduler-config dict {"scheduler": <StepLR>, "interval", "frequency"}. The
production contract is correct; the test was written against an older bare-
scheduler return. Target the nested ["scheduler"] key. No production behavior
changed (debug-the-assertion only). Restores ZERO-new-failures gate.
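
The stale vs. fixed assertion can be sketched without torch. `StepLR` below is a stand-in class, not the real `torch.optim.lr_scheduler.StepLR`, and the `"epoch"` / `1` values for interval and frequency are assumed for illustration; the nested-dict shape is the one described above.

```python
class StepLR:
    """Stand-in for torch.optim.lr_scheduler.StepLR (no torch needed here)."""

    def __init__(self, step_size):
        self.step_size = step_size


def configure_optimizers():
    # Shape of the return value described in the commit message: the
    # scheduler object sits under a nested "scheduler" key.
    return {
        "optimizer": object(),  # placeholder optimizer
        "lr_scheduler": {
            "scheduler": StepLR(step_size=10),
            "interval": "epoch",   # assumed value
            "frequency": 1,        # assumed value
        },
    }


optimizers = configure_optimizers()

# Stale check: the value is a config dict, not a scheduler, so this is False.
stale_ok = isinstance(optimizers["lr_scheduler"], StepLR)

# Fixed check: target the nested key.
fixed_ok = isinstance(optimizers["lr_scheduler"]["scheduler"], StepLR)
```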

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ap, deep-clean log

DESIGN.md §9: hierarchy contract — role-dirs (cores/heads/stems) vs
subsystem-dirs (hnet/diffusion) asymmetry is intentional; root scripts/
(launchers, run-as-script) vs egomimic/scripts/ (importable data CLIs) split;
final egomimic/models/ tree (zero loose .py).

PORT_NOTES.md: hierarchy-pass record (6 group commits + tags + per-group R-rename
counts), the gate-fix for the newly-collected stale test_model_wrapper assertion,
final-gate results, and the FULL old->new OLD-ckpt _target_ path map. Also added
the deep-clean dead-code-purge record (collapses c4-c7) that the gate-check found
missing from the dedup-gate section (which only covered c1-c3).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…oc-gap fix)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-outer-stage)

COMBINE A. Unify the three DFoT video-rollout evaluators into ONE
family-agnostic DFoTVideoRolloutEval. Each outer stage now owns a
``rollout_video_episode`` hook (the family-specific sampler + decode);
the eval owns the family-INVARIANT skeleton.

Decode-on-outer-stage. The family-specific code is MOVED byte-for-byte
from the evaluators INTO the outer stages:
  * ObsActionImageDFoTOuterStage: unconditional chunk/AR bundle sampler,
    slice the flat VAE-latent portion, frozen-VAE decode. Single-panel
    (t=0 GT prepended). Metric prefix "video".
  * ImageSpatialDFoTOuterStage: per-step (state,action) cond, conditional
    chunk/AR spatial-latent sampler (optional GT-context anchor), frozen-
    VAE decode. Side-by-side [GT|pred]. Prefix "spatial".
  * PixelSpatialDFoTOuterStage: sliding-window pixel rollout anchored on
    the first GT frame(s), no VAE, + PSNR/SSIM/LPIPS. Side-by-side. Prefix
    "pixel".

The unified eval dispatches on three class attrs the outer stage
advertises (video_metric_prefix / video_panel / video_has_extra_metrics)
and owns: packed/padded episode indexing, per-step recon-MSE accumulation,
panel assembly, perceptual-metric averaging, mp4 emission.
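
A minimal sketch of that dispatch, assuming plain Python lists in place of frame tensors. The three class-attribute names and the `rollout_video_episode` hook come from this commit; the stage classes, the attribute values `"single"` / `"side_by_side"`, the metric key names, and the max-abs-error stand-in for PSNR/SSIM/LPIPS are hypothetical.

```python
class SingleStage:
    """Hypothetical stand-in for an outer stage; the real ones sample and decode."""
    video_metric_prefix = "video"
    video_panel = "single"            # assumed value set: "single" / "side_by_side"
    video_has_extra_metrics = False

    def rollout_video_episode(self, gt_frames):
        # Placeholder prediction: shift every ground-truth value by 1.
        return [f + 1.0 for f in gt_frames]


class PixelStage(SingleStage):
    video_metric_prefix = "pixel"
    video_panel = "side_by_side"
    video_has_extra_metrics = True


def evaluate_episode(stage, gt_frames):
    """Family-invariant skeleton: metrics and panel layout keyed off class attrs."""
    pred = stage.rollout_video_episode(gt_frames)          # family-specific hook
    prefix = stage.video_metric_prefix
    mse = sum((p - g) ** 2 for p, g in zip(pred, gt_frames)) / len(gt_frames)
    metrics = {f"{prefix}/recon_mse": mse}
    if stage.video_has_extra_metrics:
        # Stand-in for the perceptual metrics: max absolute error.
        metrics[f"{prefix}/max_abs_err"] = max(
            abs(p - g) for p, g in zip(pred, gt_frames)
        )
    if stage.video_panel == "side_by_side":
        panel = list(zip(gt_frames, pred))                 # [GT | pred] per frame
    else:
        panel = [gt_frames[0]] + pred                      # t=0 GT prepended
    return metrics, panel
```

Because the evaluator reads only class attributes and one hook, adding a new family means adding a stage class, with no new branch in the evaluator.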

eval_dfot_spatial_video_rollout.py + eval_dfot_pixel_video_rollout.py
deleted (scratch backup kept). DFoTSpatialVideoRolloutEval /
DFoTPixelVideoRolloutEval kept as compat aliases of the unified class —
the __init__ is a true superset and every old config now passes its knobs
explicitly, so the aliases are pure _target_ redirects.

_target_ mirrored SAME commit in the live configs that named the old
classes (eval_dfot_image_spatial: +explicit n_context_frames=0;
eval_dfot_pixel: +explicit n_context_frames=1, rollout_window=9 — all were
the old class defaults). eval_dfot_obs_action_image already named the kept
class. The eval/__init__ _MODULE_HOMES legacy-import names + dfot/__init__
re-exports now point at the unified module.

PROOF (per the universal bar, all 3 families). Fixed-seed eval OUTPUT
equality old-vs-new in separate processes: the metrics dict (torch.equal,
incl. PSNR/SSIM/LPIPS) AND the rendered frame tensors (torch.equal) on a
random-weight algo + one real packed batch. PASS for image_spatial,
obs_action_image (BOTH chunk + AR sub-evals), and pixel — bit-identical
(A40 deterministic; verified by an independent rerun). tests/ back to the
8-failed baseline (the 9th, test_touched_eval_modules_import, updated for
the collapse), all 20 evaluator yamls compose+resolve.

Net -222 LOC (940 -> 269 eval LOC; family code now lives on the stages).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…b base

Three structural dedups on the DFoT evaluators (egomimic/eval/dfot/), all
behaviour-preserving (proven by fixed-seed torch.equal old-vs-new + textual
identity of the moved method bodies):

1. POLICY PAIR MERGE. git mv eval_dfot_policy_action.py -> eval_dfot_policy.py
   and fold DFoTPolicyRecedingHorizonEval (which subclasses DFoTPolicyActionEval
   and reuses _rollout/_ddim_from_v verbatim) into the same module; delete
   eval_dfot_policy_receding_horizon.py. _ddim_from_v + _rollout bodies are
   byte-identical to the pre-combineB sources; the RH compute_metrics_and_viz
   body is verbatim. Mirrors _target_ in the two policy evaluator configs and
   the eval/__init__ legacy-import facade + dfot/__init__ exports.

2. SHARED ANCHORED-DDIM HELPER. New eval/dfot/_sampling.py::anchored_ddim_rollout
   factors the single-tensor sched[:,:n]=clean anchored loop shared by
   DFoTBundleAnchoredEval._rollout (1D bundle) and
   ImageSpatialDFoTOuterStage._rollout_latent (5D spatial latent). Both call
   sites adopt it with shape adapters; proven torch.equal to the verbatim inline
   loop for both the 1D and 5D shapes. The 2D-policy dual-stream _rollout
   (co-denoises x_lat + x_act through a dual-output backbone with a hand-rolled
   v-pred step) is genuinely different and is left duplicated (documented in the
   helper docstring).

3. KNOB BASE. New eval/dfot/_base.py::DFoTVideoEvalMixin hoists the common knob
   storage (embodiment_name, image_key, video_subdir/_video_subdir,
   recon_loss_n_frames, upscale_to, n_chunk_steps) + the video_dir() override
   shared by bundle_anchored / policy / video_rollout. Each __init__ keeps its
   own per-class defaults and passes them through store_dfot_knobs explicitly;
   resolved attributes verified identical to the per-class expected defaults.
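
One reading of the anchored loop in item 2, sketched on nested lists instead of tensors. This is an interpretation, not the body of `anchored_ddim_rollout`: it assumes `sched` holds a per-frame noise level, that the first `n_anchor` frames are pinned to the clean level (0) at every step, and that a frame at the clean level is left untouched. `denoise_frame` is a hypothetical per-frame update standing in for the real DDIM step.

```python
def anchored_rollout(x_noisy, clean, n_anchor, denoise_frame, n_steps):
    """x_noisy, clean: [batch][time] nested lists of floats."""
    x = [list(row) for row in x_noisy]
    for row, ref in zip(x, clean):
        row[:n_anchor] = ref[:n_anchor]        # context frames start clean

    for level in range(n_steps, 0, -1):
        # Per-frame noise levels for this step; anchor frames pinned to 0.
        sched = [
            [0 if t < n_anchor else level for t in range(len(row))]
            for row in x
        ]
        # Only frames with a non-clean level are updated.
        x = [
            [v if lvl == 0 else denoise_frame(v, lvl) for v, lvl in zip(row, levels)]
            for row, levels in zip(x, sched)
        ]
    return x


def toy_denoise(value, level):
    # Hypothetical update: halve the value regardless of level.
    return 0.5 * value


out = anchored_rollout(
    [[8.0, 8.0, 8.0]], [[1.0, 2.0, 3.0]],
    n_anchor=1, denoise_frame=toy_denoise, n_steps=3,
)
```

The shape adapters mentioned in item 2 would sit outside such a loop: flatten the 5D latent to the layout the helper expects, run it, then restore the original shape.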

Proof: scratch/proof_combineB.py (removed post-proof) — 14/14 PASS, all
torch.equal with maxdiff 0.00e+00 on A40. Full tests/ = 139 passed / 8 failed
(pre-existing, unrelated: test_packed_pipeline + test_training_recipe wiring) /
10 skipped — zero new failures. All three touched evaluator configs compose.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Records the dedup-campaign DFoT-evaluator combine (combines A c73a554 + B
7fde862): rollout trio -> 1 family-agnostic DFoTVideoRolloutEval
(decode-on-outer-stage), policy pair merge (eval_dfot_policy, RH subclasses
Action, shared _rollout), shared anchored-DDIM helper (_sampling), knob/path
mixin (_base). self_rollout untouched (genuinely-different uint8 variant).

eval/dfot/ 7-eval-file set: 1784 -> 1278 lines (-506, -28%); 4 per-family
modules deleted, 2 reusable helpers added; class count consolidated. Per-file
before/after table + the full _target_ map (live + dead-on-disk configs, all
resolve via __init__ compat aliases) recorded in PORT_NOTES.

Final gates (alloc 3326107, a40 megabot, pact-2 symlinked .venv):
  - pytest tests/: 139 passed / 8 failed / 10 skipped (8 = pre-existing; ZERO new)
  - compose sweep: TOTAL_PASS=107 / TOTAL_FAIL=2 (2 = pre-broken viz/pi_cartesian_lang*,
    NOT DFoT); all 11 DFoT evaluator yamls compose
  - real eval forward: evaluator=eval_dfot_image_spatial + eval_dfot_pixel each
    built a REAL DFoT algo (random weights, fixed seed) and ran one
    compute_metrics_and_viz end-to-end through the unified eval + outer-stage
    decode hook -> finite metrics (11 / 13), mp4 written (599954 / 288794 B)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
HPTEvalVideo and PIEvalVideo each carried a byte-for-byte equivalent
"apply the revert transform once, then reuse it for both the cam-frame
paired/final MSE and the viz video" block (HPT named the prediction key
main_pred_key, PI named it pred_key -- same f"{embodiment}_{ac_key}"
value). Hoist it into one helper, eval/core/_viz_shared.py
cam_frame_mse_and_viz_batches(...), and have both evaluators delegate.
The per-evaluator-specific metrics (HPT's Frechet / Reverse-KL / aux+shared
heads, PI's native-frame MSE) stay in place; only the genuinely-common cam
block moves. Net -38 lines across the two zoo files.

Proven output-identical: for each evaluator, built from its composed
evaluator config (real viz_func, hydra-composed) with a stub algo + fixed-
seed batch and a deterministic injected transform, compute_metrics_and_viz
run with the pre-combineC class vs the refactored class produces
torch.equal cam_paired_mse_avg / cam_final_mse_avg (both 0.5509150624) and
torch.equal preds_for_viz / gt_batch_viz tensors, with identical metric-key
sets. tests/test_pi.py + tests/test_config_compose.py: 25 passed, 1 skipped
(zero new failures).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… peers

The zoo/ grouping was wrong: HPT and PI are actively-developed algorithms
benchmarked against the H-Net line, not a pen of frozen third-party baselines.
They are first-class peers of bc. Dissolve algo/zoo/ and the mirrored eval/zoo/
into per-algo folders; algo/ and eval/ mirror each other.

Moves (git mv, R-status renames):
  egomimic/algo/zoo/hpt.py      -> egomimic/algo/hpt/hpt.py
  egomimic/algo/zoo/pi.py       -> egomimic/algo/pi/pi.py
  egomimic/algo/zoo/act.py      -> egomimic/algo/act/act.py
  egomimic/eval/zoo/eval_hpt.py -> egomimic/eval/hpt/eval_hpt.py
  egomimic/eval/zoo/eval_pi.py  -> egomimic/eval/pi/eval_pi.py
  egomimic/eval/zoo/eval_act.py -> egomimic/eval/act/eval_act.py

Each new folder gets an __init__.py re-exporting its public class
(egomimic.algo.hpt.HPT/HPTModel, egomimic.algo.pi.PI, egomimic.algo.act.ACT/
ACTModel; egomimic.eval.{hpt,pi,act}.<...>EvalVideo). The two zoo/__init__.py
are git rm'd; the algo/zoo PEP-562 lazy-PI shim is gone but PI laziness is
preserved (top-level egomimic.algo never imports egomimic.algo.pi eagerly).
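
For reference, a PEP 562 lazy attribute looks roughly like the sketch below. Everything here is a stand-in: `demo_algo` / `demo_algo_pi` are in-memory modules, and the `_MODULE_HOMES` dict only borrows the name used by the eval facade; whether the real facade resolves names exactly this way is an assumption.

```python
import importlib
import sys
import types

# Hypothetical map: public attribute name -> module that actually defines it.
_MODULE_HOMES = {"PI": "demo_algo_pi"}
resolved = []  # records which names were resolved lazily

# Stand-in for the heavy submodule.
heavy = types.ModuleType("demo_algo_pi")
heavy.PI = type("PI", (), {})
sys.modules["demo_algo_pi"] = heavy

# Stand-in for the package whose top level must stay import-light.
package = types.ModuleType("demo_algo")


def _lazy_getattr(name):
    # PEP 562: consulted only when normal attribute lookup on the module fails.
    if name in _MODULE_HOMES:
        module = importlib.import_module(_MODULE_HOMES[name])
        resolved.append(name)
        return getattr(module, name)
    raise AttributeError(f"module 'demo_algo' has no attribute {name!r}")


# A module-level __getattr__ lives in the module's namespace.
package.__getattr__ = _lazy_getattr
```

Nothing is resolved until `package.PI` is first touched, which is the property the commit preserves by never importing `egomimic.algo.pi` from the top-level package.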

Mirrored in the SAME commit: 16 yaml _target_s (12 hpt model configs, act.yaml,
pi0.5_base.yaml, eval_hpt.yaml, eval_pi.yaml) + every importer (eval/__init__.py
_MODULE_HOMES facade, algo/__init__.py doc comments, tests/test_pi.py).

Principle (DESIGN.md §9.4): all algorithms are first-class peers under algo/,
each its own folder when it may grow; bc stays a single flat module-home (left
untouched here); eval mirrors algo; the shared spine (algo.py/PackedAlgoBase,
packed_base, loss, outer_stage, obs_transforms, packed_outer_stage) stays flat
at algo/ top — not zoo, not bc-specific. Shared eval helper
eval/core/_viz_shared.cam_frame_mse_and_viz_batches stays in eval/core/
(import path unchanged). Old-ckpt _target_ resolution: scratch/
hierarchy_path_map.txt (ZOO DISSOLVE block) + PORT_NOTES.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… -> */algo.py, drop loss/packed_base/outer_stage shims) — safety snapshot before cotrain+eval port
…ndEncoder + inner_working_dim polymorphism

- models/hnet/per_embodiment_stage.py: per-emb outer stages, shared inner trunk; dispatch by ctx.embodiment_id (ported from elmo/hnet-cotrain-circle-stick, import remapped to models.hnet)
- models/hnet/stages.py: inner_working_dim property (base=input_dim, ChunkerStage=output_dim) for polymorphic dim-handoff
- models/hnet/hnet.py: replace isinstance(ChunkerStage) dim-check with stages[i].inner_working_dim (-9 lines)
- models/hnet/context.py: add embodiment_id field
- models/stems/cond_encoders.py: MultiEmbodimentCondEncoder + ignored embodiment_id kwarg on CondEncoderModule.encode
- algo/hnet/algo.py: thread embodiment_id + domain_by_id reverse map through HNetOuterStage/PackedAlgoBase forward paths
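
The `inner_working_dim` handoff can be sketched as below. The property name and the base/ChunkerStage rule (input_dim vs output_dim) come from this commit; the constructor signature and the `check_chain` helper are hypothetical and do not mirror the real `hnet.py` assembly.

```python
class Stage:
    """Hypothetical base stage: works at its input width."""

    def __init__(self, input_dim, output_dim=None):
        self.input_dim = input_dim
        self.output_dim = input_dim if output_dim is None else output_dim

    @property
    def inner_working_dim(self):
        return self.input_dim


class ChunkerStage(Stage):
    """Chunker hands its output width to whatever is nested inside it."""

    @property
    def inner_working_dim(self):
        return self.output_dim


def check_chain(stages):
    """Verify each stage's inner width matches the next stage's input width."""
    for outer, inner in zip(stages, stages[1:]):
        if outer.inner_working_dim != inner.input_dim:
            raise ValueError(
                f"dim mismatch: {outer.inner_working_dim} -> {inner.input_dim}"
            )
    return [s.inner_working_dim for s in stages]
```

The caller asks each stage for its width instead of testing `isinstance(stage, ChunkerStage)`, so a new stage type only has to override the property.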

Gate: hnet regression 57 passed/2 skipped; PerEmbodimentStage construct+dispatch+guard smoke green.
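
The construct + dispatch + guard behaviour exercised by that smoke can be sketched as follows. The class is a hypothetical reduction of `PerEmbodimentStage` (callables in place of modules); the ids 15 and 16 are the circle and stick ids named elsewhere in this PR.

```python
from dataclasses import dataclass


@dataclass
class Context:
    embodiment_id: int


class PerEmbodimentStageSketch:
    """Per-embodiment outer stages in front of one shared inner trunk."""

    def __init__(self, outer_by_id, shared_inner):
        if not outer_by_id:
            raise ValueError("need at least one embodiment")
        self.outer_by_id = dict(outer_by_id)
        self.shared_inner = shared_inner

    def forward(self, x, ctx):
        # Guard: fail loudly on an embodiment with no registered outer stage.
        if ctx.embodiment_id not in self.outer_by_id:
            raise KeyError(f"no outer stage for embodiment {ctx.embodiment_id}")
        outer = self.outer_by_id[ctx.embodiment_id]
        return self.shared_inner(outer(x))


stage = PerEmbodimentStageSketch(
    {15: lambda x: x + 1, 16: lambda x: x * 10},   # circle, stick
    shared_inner=lambda h: h * 2,
)
```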
…override resolver

- embodiment.py: PUSHSHAPES_SIM_STICK = 16 (resolves by name; no per-emb handler needed)
- zarr_dataset_multi.py: LocalEpisodeResolverWithEmbodimentOverride — re-tags shared-metadata zarrs to a config-supplied embodiment so circle/stick dispatch apart
- schedulers.py: piecewise_linear already present in pact-2 (skipped)

Gate: stick id=16, circle id=15, resolver imports, viz_gt_preds present.
… skynet paths)

- model/hnet_pushshapes_cotrain.yaml: PackedAlgoBase + HNetOuterStage with MultiEmbodimentCondEncoder (per-emb circle/stick) and 2-level PerEmbodimentStage chain (per-emb EncDec+Chunker r=8 outer, shared EncDec+Chunker r=4+Compute inner). _target_ paths remapped to models.hnet/models.stems; algo.hnet.HNet alias preserved. No recipe knobs (single LR, source-faithful); comment points to warmup_cosine for fresh launches.
- data/tsimulation_cotrain.yaml: circle_750 + stick_312 on skynet; stick via LocalEpisodeResolverWithEmbodimentOverride. chunking=sequential (stick has a 1068-frame episode > max_seq_len=1024).
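
The arithmetic behind the chunking=sequential choice, as a sketch. It assumes sequential chunking means non-overlapping windows of at most max_seq_len frames; the real loader may window differently (overlap, padding), and `sequential_chunks` is a hypothetical helper.

```python
def sequential_chunks(n_frames, max_seq_len):
    """Return (start, end) frame ranges covering the episode without overlap."""
    return [
        (start, min(start + max_seq_len, n_frames))
        for start in range(0, n_frames, max_seq_len)
    ]


# The 1068-frame stick episode does not fit in one 1024-frame sequence.
stick_chunks = sequential_chunks(1068, 1024)
```

Under that assumption the long stick episode becomes one full window plus a 44-frame tail, rather than being dropped or truncated.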

Gate: full trainHydra smoke on L40S — compose + load both datasets + norm_stats + construct 15.6M-param model + 2-step forward_training, clean exit.
… cross-emb composite

- probes/eval_boundary_strip.py: horizontal multi-stage render (Stage k labels, time on x-axis, per-emb keys, frame-level upsample); threads embodiment_id via domain_by_id
- probes/eval_pca_tokens.py: embodiment threading + n_extra_train_episodes train mix-in
- core/eval_composite.py: EvalVideoList below_indices (vstack boundary strip below traj+PCA row)
- core/eval_combined_rows.py (new): CombinedRowsEval — vstack per-emb composites into one 2-row video
- evaluator/eval_hnet_pairs.yaml + _combined.yaml: wire the cross-emb composite (eval_hnet/pca/boundary), paths remapped to core/probes
- eval_hnet.py unchanged (already had per-emb viz_func)

Gate: mode=eval on cotrain model rendered per-emb composites (traj + PCA + horizontal Stage0/Stage1 boundary strip below), verified visually.
…ndowedBC mimic, full-episode TF)

Mimics WindowedBC on the HNet algo with FULL-EPISODE teacher forcing:
- algo/hnet/algo.py: HNetOuterStage action_head_type=gmm (pre-instantiated GMMActionHead); decode() stashes head on ctx.extras. New GMMLoss (peer of HNetLoss): builds per-obs-step 8-action chunked targets (repeat-pad at episode ends, packed+padded) and computes GMM-NLL via head.nll + ratio loss.
- embodiment.py: PUSHSHAPES_SIM_SMALL_CIRCLE = 17.
- model/hnet_cotrain_gmm_obs.yaml: ObsToken (ResNet VisualCore + proprio) obs-as-input, no-AdaLN (cond_key=null), per-emb outer ChunkerStage -> shared inner ChunkerStage -> ComputeStage, GMM head chunk_len=8, GMMLoss, cotrain circle+small_circle.
- data/tsimulation_cotrain_small_circle.yaml: full-episode TF (chunking=none); small_circle PLACEHOLDER at circle path via embodiment_override.
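
The chunked-target construction described for GMMLoss can be sketched on plain lists. This assumes one target chunk per frame and that "repeat-pad" means repeating the episode's last action; the helper names are hypothetical, and the packed variant assumes episodes are concatenated with known lengths.

```python
def build_chunk_targets(actions, chunk_len=8):
    """For each step t, the next chunk_len actions, repeat-padded at the end."""
    last = len(actions) - 1
    return [
        [actions[min(t + k, last)] for k in range(chunk_len)]
        for t in range(len(actions))
    ]


def build_packed_chunk_targets(actions, lengths, chunk_len=8):
    """Packed layout: build targets per episode so chunks never cross episodes."""
    targets, start = [], 0
    for n in lengths:
        targets.extend(build_chunk_targets(actions[start:start + n], chunk_len))
        start += n
    return targets
```

Building targets per episode matters in the packed layout: a naive sliding window over the concatenated stream would leak the first actions of the next episode into the last chunks of the previous one.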

Gate: trainHydra smoke on L40S — construct 15.4M-param model + 2-step forward_training with GMMLoss, clean exit.
…it by filter)

Found the small-circle data: /coc/flash7/paphiwetsa3/datasets/circle_co_big_small holds 953 big-circle + 955 small-circle episodes, BOTH tagged embodiment=pushshapes_sim, distinguished only by task_description.env_args.pusher_shape (circle vs circle_small). Split by episode-name DatasetFilter:
- pushshapes_sim:              folder=circle_co_big_small, filter "circle_small not in episode_hash" (905 big)
- pushshapes_sim_small_circle: same folder, filter "circle_small in episode_hash" + embodiment_override (955 small)
No more placeholder. max_seq_len 1024->2560 (longest small episode = 2255 frames; full-episode TF), batch_size 1 (long episodes + ResNet), model action_horizon 1024->2560.
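
The two filters partition one folder by substring. A sketch with made-up episode hashes (the real names are not shown here) and plain Python predicates in place of the DatasetFilter expression strings; the override tag mirrors what embodiment_override is described as doing.

```python
# Hypothetical episode hashes; only the "circle_small" substring matters.
episodes = [
    "circle_0001",
    "circle_small_0001",
    "circle_0002",
    "circle_small_0002",
]

# Same predicates as the two filter expressions above.
big = [e for e in episodes if "circle_small" not in e]
small = [e for e in episodes if "circle_small" in e]

# Both splits share on-disk metadata (embodiment=pushshapes_sim); the small
# split is re-tagged at load time so the two dispatch apart.
tagged = {e: "pushshapes_sim" for e in big}
tagged.update({e: "pushshapes_sim_small_circle" for e in small})
```

Since one predicate is the negation of the other, every episode lands in exactly one split.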

Gate: trainHydra smoke on L40S — both filtered splits load, norm_stats, 15.4M-param construct, 2-step forward_training with GMMLoss, clean exit.
Make val-overlay + closed-loop sim work with the chunked GMM head:
- HNetOuterStage._decode_pred_actions: eval bridges (forward_padded/packed) decode GMM params -> sample (low-noise) -> chunk pos-0 point action per frame for the teacher-forced overlay.
- inference_step: open-loop action-chunk buffer (mirrors WindowedBCPolicy.step queue) -- on an obs-step decode the chunk_len actions, dispense one/frame with NO model call between, re-observe every chunk_len frames (obs_stride==chunk_len). Non-GMM unchanged (one action/frame).
- HNetOuterStage.step: thread embodiment_id onto the per-step ctx so PerEmbodimentStage dispatches in the AR cache.
- PerEmbodimentStage._allocate: pact-2 AR-cache scheme (per-emb sub-stage cache; only active emb advanced).
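
The open-loop chunk buffer in inference_step can be sketched with a queue. This is a hypothetical reduction: `decode_chunk` stands in for the model call that returns chunk_len actions, and the class is not the real policy or `WindowedBCPolicy`.

```python
from collections import deque


class ChunkedPolicySketch:
    """Decode a chunk on an obs-step, then dispense one action per frame."""

    def __init__(self, decode_chunk, chunk_len=8):
        self.decode_chunk = decode_chunk
        self.chunk_len = chunk_len
        self.queue = deque()
        self.model_calls = 0

    def step(self, obs):
        if not self.queue:
            # Obs-step: the only place the model is called.
            chunk = self.decode_chunk(obs)
            if len(chunk) != self.chunk_len:
                raise ValueError("decoder must return chunk_len actions")
            self.model_calls += 1
            self.queue.extend(chunk)
        # Between obs-steps the buffered actions are replayed, no model call.
        return self.queue.popleft()


# Toy decoder: tag each action with the frame it was decoded at.
policy = ChunkedPolicySketch(lambda obs: [(obs, k) for k in range(8)])
actions = [policy.step(frame) for frame in range(20)]
```

With chunk_len=8, twenty frames trigger three model calls (frames 0, 8, 16), which is the obs_stride==chunk_len behaviour described above.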

Gate (untrained, L40S): val overlay renders both embs + actions_paired_mse computed; sim eval rolls out both embs (coverage 0.0 untrained) -- both paths clean.
Obs-input H-Net GMM cotrain model for the PushShapes circle/small-circle campaign:
- first_token causal chunker with grab_prev_end (concat[first-token, prev-chunk-end]
  -> MLP) + duplicate upsampler + residual mixer (stages.py, residual_mixer.py)
- per-embodiment GMM heads (gmm_heads dict) + per-emb obs encoder (embodiment_id
  threaded through ObsToken; MultiEmbodimentCondEncoder d_cond) so only the
  high-level trunk is shared (algo.py, stems/)
- obs_stride arch (strided token stream + tiled chunk targets) + fp32-matched sim
  rollout dtype (algo.py)
- ratio-loss scheduler callback + GMM / first-end model+data configs
- scan_interface + circle_small sim-env support

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV
Eval and visualization tooling for the GMM cotrain models:
- boundary-strip / PCA-token probes, keyframe + boundary viz, overlay loader,
  sim-eval + ckpt-loading updates (eval/core, eval/probes)
- chunkviz explorer: export.py (per-frame chunk ids + cross-emb shared PCA with
  embodiment-removal variants raw / mean-diff / LDA), build_html.py (self-contained
  HTML), viewer_template.html (multi-model dropdown + single/compare + PCA-removal
  toggle), serve / nocache_server
- GMM evaluator configs (circle / cotrain / smallcircle / firstend)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV