tsim: sim env + viz/scripts + physics tuning#512
Draft
ElmoPA wants to merge 78 commits into
Draft
Conversation
Squash of: - a31d8ab4 tshape sim environment - ac272e8a Add Tsimulation viz/scripted/stats tools + physics tuning
…ding Squash of: - f261ff68 tsim training configs/embodiment - b38264ab Add pushshapes_sim HPT training: keymap, viz, eval fixes - 1e174131 Add episode-level packed dataloading
Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021
- test_hnet_nets.py (57): routing, chunk, dechunk, isotropic, stages (padded + packed), HNet assembly, ratio_loss, chunk_stats, STE, RMSNorm, AdaLN. - test_packed_pipeline.py (9): normalize broadcast on padded vs packed; _iter_leaves descent; multi-frame JPEG decode; end-to-end packed stats collection. - test_training_recipe.py (20): apply_optimization_params, init_weights height-scaled init, apply_lr_multiplier per-stage stamping, parameter_groups (default, with bias/norm WD=0, per-stage groups, AdamW-consumable). Plus algo wiring tests for the opt-in init_weights_range / lr_multipliers / use_parameter_groups / weight_decay kwargs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20. Includes: - algo: input_modules.py + obs_transforms.py (new modules) - callbacks: chunker_residual_scheduler, ckpt_chunker(+dropout), random_attn_dropout - data configs: tsimulation_400ep, tsimulation_allep + tweaks to existing tsim configs - model configs: hnet_pushshapes_mamba_encdec, hnet_pushshapes_obs_ar + tweaks - eval/* edits, models/hnet_nets/* edits, schedulers, uv.lock - removes scripts/install_cuda_kernels.sh and egomimic/eval/eval_hnet_sim.py Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).
…screte diffusion + DDPM/DDIM samplers
… per-frame obs cond)
…l); drop 400ep config
…ens defensive fix
…ckedSimEval rename, composite eval
…r); supports growing-T AR
- algo.py: add ar_inference_step_size (sub-steps per env tick at closed-loop AR); thread cfg_scale through _inference_step_ar/_inference_step_chunk; remove unused shape var - backbone.py: force_uncond branch for CFG two-pass blending; wire attn/resid dropout into Isotropic trunk - sampling.py: _CFGBackbone wrapper for cfg_scale > 1 sampling; schedule-matrix CFG plumbing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…val viz - model/dfot_pushshapes.yaml: scale to 67M (d_model=512, T=12, num_heads=8, d_intermediate=2048); attn dropout=0.1, resid dropout=0.1, cond_dropout_prob=0.1; cfg_scale field; causal=true - data/tsimulation_full.yaml: 750-episode circle_750 dataset, batch_size=16 - evaluator/eval_dfot_val.yaml + eval_dfot_full.yaml: cfg_scale + ar_chunk_size + ar_step_size knobs - callbacks/ckpt_attn_dropout.yaml: composed callback (checkpoints + random_attn_dropout with values [0.1, 0.5, 0.8, 0.9, 0.95, 0.97, 0.98]) - eval/eval_dfot_val.py: 96x96 -> 512x512 nearest-neighbor upscale, palette (gt=green, chunk=red, ar=yellow), world-coord pixel scaling, threaded cfg_scale Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- sbatch_train_dfot_200ep_full750_pace.sh: minor cleanup - sbatch_train_dfot_400ep_full750_pace.sh: 400ep H200 launch with scheduler.max_steps=18800 (fixes the 200ep cosine-not-decaying bug) - sbatch_train_dfot_400ep_attndrop.sh: 400ep + random_attn_dropout (the 5.6x sim_coverage win) - scripts/eval_cfg_latest.py: post-hoc DFoT eval CLI with --cfg-scale, --ar-chunk-size, --ar-step-size, --ar-inference-chunk-size, --ar-inference-step-size, --skip-val/--skip-sim - scripts/eval_fsd_latest.py: convenience wrapper for inference_mode=chunk - scripts/sbatch_fsd_eval.sh + sbatch_sim_sweep.sh: sbatch templates for closed-loop sim sweeps Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces the abstract bases for the upcoming refactor: - egomimic/algo/outer_stage.py: OuterStage base. Owns an inner_stage field (the trunk). Subclasses implement encode (raw batch -> trunk-input tensor; can sample noise / record state on ctx) and decode (trunk-output -> per-modality prediction keys on batch). - egomimic/algo/loss.py: Loss base + CompositeLoss (weighted sum of terms) + MSELoss (per-modality MSE between pred_key and target_key). Loss policy becomes data — a hydra config block — rather than inheritance. No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits on this branch.
Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss classes: - q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise, alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond the backbone consumes. No backbone call inside. - compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the dict from q_sample plus the backbones v_pred and computes the SNR-weighted epsilon-MSE. - forward(backbone, x, t, cond): kept as a back-compat wrapper that calls q_sample, runs the backbone, then compute_loss. Existing callers (DFoT.forward_training) are unaffected. Bitwise verified equivalent to the prior single-method path via the included /tmp/test_diffusion_split.py smoke (loss + x_pred match exactly, same random seed). discrete_diffusion.py is unsplit for now; current configs use continuous.
…verified) - egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass. encode: encode obs to per-token cond, sample noise levels, run diffusion.q_sample, store q_state + external_cond on ctx, return noisy x_t. decode: write batch[pred_v] for the loss to read. forward override threads cu_seqlens/max_seqlen (packed mode) and time_cond into the backbone call. - egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE), reduces to scalar. Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to 0.0e+00 difference at fixed seed. Real CondEncoderModule + DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math path. Algo class (DFoT.forward_training) is NOT yet wired to use these — that comes in the next commit on this branch. Inference paths (closed-loop AR sample_step, chunk plan-execute) also deferred.
Algo class: - __init__ now takes outer_stage: DFoTOuterStage and optional loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None). - Removes legacy cond_encoder, backbone, diffusion_type, diffusion_kwargs, cond_output_key args — they now live on the outer_stage subblock. - Adds @Property accessors for cond_encoder, backbone, diffusion, outer_stage, loss so existing inference paths (_inference_step_ar, _inference_step_chunk, _sample_chunk, forward_eval) keep working unchanged via property forwarding. - forward_training shrinks ~40 LOC -> ~20 LOC: build ctx, call outer_stage(batch, ctx), call loss(batch, ctx). No more inline diffusion math; no more cu_seqlens threading at this level. - Adds ar_inference_step_size knob. dfot_pushshapes.yaml: - New outer_stage: block wraps cond_encoder + backbone + diffusion. - Removes top-level diffusion_type / diffusion_kwargs; the diffusion module is now its own _target_ inside outer_stage. - loss: omitted (uses default DFoTLoss(outer_stage.diffusion)). End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the config instantiates via hydra and forward_training emits a finite scalar loss. Bitwise loss-equivalence was already shown in the prior commit (test_dfot_outer_stage.py). Old checkpoints WILL NOT load — state_dict keys moved from nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.* This is intentional per the agreed clean-break refactor.
Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT from dfot_pushshapes.yaml, runs inference_step in both ar and chunk modes, asserts action is (action_dim,) and finite. Verifies the @Property accessors (self.backbone, self.cond_encoder, self.diffusion) forward correctly to outer_stage submodules so the closed-loop AR and chunk-mode inference paths keep working after the refactor. Passing on compute node 8997316: [ar] action @ t=0: [0.30 0.47] [ar] action @ t=1: [0.44 1.06] [chunk] action @ t=0: [-0.78 -1.89]
…ied) - egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits from OuterStage with inner_stage = HNetCore (stage tree). Owns cond_encoder, input_modules (summed per-token contributions), action_out head. Three forward paths inherited from the old HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed), generate (offline AR), init_step_state + step (online single-tick). - egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] + batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux. - scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates the existing hnet_pushshapes.yaml subcomponents, wraps the SAME instances in HNetPolicy and HNetOuterStage, ties their action_out heads, runs identical padded forward. Verified bitwise (max diff 0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests the step inference path (shape + finite). The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet removed. Algo-class refactor and yaml updates come in follow-up commits on this branch. Old checkpoints will NOT load — state_dict keys move from policy.* to outer_stage.* (or wherever the algo class places it). Per clean- break policy.
Algo class:
- HNet.__init__ now takes outer_stage: HNetOuterStage + loss: Optional[Loss]
instead of cond_encoder + hnet + action_dim + action_horizon +
d_model + action_head_type + input_modules. action_horizon read from
outer_stage.
- Loss defaults to HNetLoss() if not provided.
- self.nets is now ModuleDict({outer_stage, loss}); old keys
(self.nets[policy], self.nets[cond_encoder], ...) are exposed via
@Property forwarding to outer_stage submodules so legacy callsites
in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step
inference keep working.
- forward_training builds (batch, ctx), calls outer_stage(batch, ctx)
+ loss(batch, ctx), unpacks per-term breakdown (ctx.action_loss,
ctx.ratio_loss) into the predictions dict for logging.
HNetLoss:
- Computes action MSE + ratio_loss_from_aux(ctx.aux) and stashes the
per-term split on ctx for the algo to log separately.
HNetOuterStage:
- Adds back-compat bridge methods forward_padded(actions, obs) and
forward_packed(actions, obs, cu, msl) returning (pred, aux). The
forward_eval / _teacher_forced_packed paths use these; .generate /
.step / .init_step_state already had matching signatures.
hnet_pushshapes.yaml:
- New outer_stage: block wraps cond_encoder + hnet stage tree +
input_modules + action_head_type. Top-level keeps training-recipe
knobs (init_weights_range, lr_multipliers, ...) and embodiment
wiring.
- loss: block omitted (defaults to HNetLoss()).
scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke.
Verified passing on H200 alloc 8989249 — produces action_loss 1.51 +
ratio_loss 0.032 + chunker stats for a 2-episode packed batch
(T=12+20). Padded mode hits a pre-existing torch SDPA error
(Explicit attn_mask should not be set when is_causal=True) in
train mode — this is in the trunk code, not introduced by the
refactor (the production training uses packed mode and never hits
the padded-train path).
Old HNetPolicy class is still in algo/hnet.py for now (no longer
used by HNet algo); will be removed in a cleanup commit once all
stage-based + flat yamls are migrated.
…hema Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied to the variant configs. Each yaml moves cond_encoder + hnet stage tree + (optional) input_modules + action_head_type under outer_stage; keeps training-recipe knobs + embodiment wiring at the top level. Configs migrated: - hnet_pushshapes_big.yaml (d_model 256, 21M params) - hnet_pushshapes_crossattn.yaml (cond_mode: cross_attn) - hnet_pushshapes_mamba_encdec.yaml (M8 encoder/decoder) - hnet_pushshapes_obs_ar.yaml (ObsToken input module) - hnet_pushshapes_obs_ar_large.yaml (ObsToken + d_model 256 + T8) - hnet_pushshapes_recipe.yaml (H-Net paper recipe) scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants) instantiate from hydra config and produce sensible param counts (5.5M baseline up to 42.8M obs_ar_large).
- egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class. Structurally a rename of FlatFusedPolicy with OuterStage inheritance plus an OuterStage forward(batch, ctx) dispatcher delegating to the existing forward_padded / forward_packed. Legacy generate / step / init_step_state preserved verbatim. encode / decode raise NotImplementedError since the interleaved 2T-token flow does not cleanly split along encode -> trunk -> decode. - egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass of HNet, kept as a separate _target_ for the existing flat yamls. All flat-fused behavior moved into FlatFusedOuterStage; HNet.__init__ already tolerates outer_stage.inner_stage=None. - 3 flat yamls migrated to outer_stage schema: hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml, hnet_pushshapes_fused_pusher.yaml. - scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net configs. Verified on H200 alloc 8989249: 10/10 instantiate successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param counts sensible. Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.
…y retained — it is NOT dead)
Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ +
hydra_configs/ + tests/ + scripts/ (excl __pycache__/external/scratch/logs)
IMMEDIATELY before deletion to reconfirm zero LIVE refs.
Deleted (proven zero live refs):
- egomimic/models/diffusion_policy.py (whole file -> scratch; only self-ref)
- egomimic/models/ddim_scheduler.py (whole file -> scratch; live DDIM is
diffusion/sampling.ddim_sample)
- egomimic/models/hnet/_smoke_stages.py (whole file -> scratch; only its own
__main__ self-invoke)
- algo/loss.py CompositeLoss + MSELoss (no _target_, no code ctor; live losses
are HNetLoss()/DFoTLoss() built in code)
- algo/packed_base.py HNet._ar_rollout_packed (no caller; live eval is
forward_eval -> _teacher_forced_packed)
- algo/packed_outer_stage.py HNetOuterStage.generate (dead; only ref was a docstring.
step/init_step_state KEPT — they ARE the
live closed-loop path, called at
packed_base.py policy.init_step_state/step)
- pl_utils/pl_data_utils.py RLDBModule, DualDataModuleWrapper, DataModuleWrapper
(deprecated; live wrapper is
MultiDataModuleWrapper, ref'd by 21 files)
Also dropped now-unused imports (typing.Optional in packed_outer_stage.py;
typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors
that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml).
Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed.
SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed
HNetPolicy was zero-ref, but that grep excluded scripts/. The required
grep over scripts/ found 3 LIVE importers+instantiators:
scripts/smoke_packed_training.py (documented live tooling in CLAUDE.md L538)
scripts/test_mamba_regression.py (old-vs-new equivalence regression)
scripts/test_hnet_outer_stage.py (HNetPolicy-vs-HNetOuterStage equivalence smoke)
The grep proof-gate fails for HNetPolicy, so it is retained (its .generate
method stays with the class). All other 5 targets pass cleanly.
Proofs (run on own a40 alloc, repo's symlinked .venv):
- import-smoke: `import egomimic.algo, egomimic.models, egomimic.pl_utils` -> IMPORT OK
- retained symbols import: Loss/HNetLoss/DFoTLoss, MultiDataModuleWrapper,
HNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False)
- deleted symbols confirmed gone (CompositeLoss/MSELoss absent; dead model files absent)
- py_compile all edited .py: OK
- pytest tests/: 128 passed / 8 failed / 4 skipped — IDENTICAL to baseline.
The 8 failures are the documented pre-existing set (7x TestAlgoWiring
old-HNet-signature: "HNet.__init__() missing 1 required positional argument:
'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr).
Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… + log_info onto base Algo
Three blocks of code were byte-identical (verified via `diff`, rc=0) across
the policy algos and are collapsed into a single home on the base `Algo`:
* the per-embodiment key-resolution loop (`for emb in self.domains: ...`
resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via
norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194
== WindowedBC bc.py:699-724. Now `Algo._resolve_embodiment_keys(norm_stats)`;
each subclass calls it from __init__.
* `_build_obs` -- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255
(WindowedBC already inherited HNet's). Now defined once on `Algo`; the HNet
and DFoT overrides are deleted (inherited).
* `log_info` -- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425.
The base `Algo.log_info` (formerly a NotImplementedError stub) now carries
this exact body as the shared default; the HNet/DFoT overrides are deleted.
Net -116/+83 lines. Behaviour is preserved by construction: the moved text is
identical, so every subclass resolves the same function object from `Algo`
(no subclass re-introduces a private copy). bc.py drops its now-unused
`get_embodiment_id` import; packed_base/DFoT keep theirs (still used elsewhere).
Proofs (run on a40 alloc 3325792, fixed seeds):
* DFoT 1-epoch pixel smoke Train/Loss = 0.28878551721572876 -- BIT-IDENTICAL
to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter
path per the dedup_baseline manifest).
* BC SMOKE=1 train: c5 Train/Loss = [1.3453816, 0.1752842]. A clean pre-c5
baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC
smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/
sim-eval RNG), and the c5 run lands CLOSER to the manifest values
[1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest|
=1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor is within the
tree's own deterministic-replay band.
* New permanent guard tests/test_embodiment_key_resolution_shared.py (3 tests):
asserts HNet/WindowedBC/DFoT all resolve the SAME `Algo` function objects
for _resolve_embodiment_keys / _build_obs / log_info (import-identity) and
produce byte-equal key sets + obs selection on a fixed norm_stats fixture.
* pytest tests/ = 131 passed / 8 failed / 4 skipped. The 8 failures are the
documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature +
1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW
failures vs the 128/8/4 profile.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mg_utils.img_chw_to_uint8
Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] ->
(H,W,C) uint8-in-[0,255] converter under three private names:
- _img_chw_to_uint8 : video_rollout, pixel_video_rollout,
spatial_video_rollout, policy_action
- _u8 : bundle_anchored
- _to_uint8_hwc : eval/core/eval_vae_recon
Four were the 4-line clip->*255->transpose form; two (policy_action, _u8)
were the same logic as a 2-liner. All six now delegate to a single canonical
egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and
each call site renamed.
EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout
._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255).
PROOFS (a40 job 3325794, fixed seeds):
- tests/test_eval_img_utils.py (NEW, permanent gate):
test_canonical_matches_every_original_body PASSED
np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed
torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2).
test_touched_eval_modules_import PASSED (import-smoke of all 7 modules)
- full tests/: pre-c6 (stashed) = 8 failed / 131 passed / 4 skipped;
post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures
(7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked
missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW
failures. Behaviour-preserving: the 6 bodies were already identical.
Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…uperset + hoist loss-reducer skeleton onto Algo
Two structural near-duplicates removed:
1. Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image
only) duplicated the fixed_window/start_to_end/random_subsample branch
logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+
action). The action version is a strict superset: its image cropping and
cu_seqlens are byte-identical whether or not actions are supplied (the
action branch performs zero extra RNG draws). Hoisted the superset onto the
parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced
_sample_frames_packed to a thin delegate
(_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from
the PixelObsAction subclass (now inherited).
2. Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment
{emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as
the default compute_losses and deleted DFoT's byte-identical override.
VAE / HNet / HPT / PI keep their own overrides because they are genuine
SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded.
PROOFS (a40 alloc 3325796, fixed seeds):
- BEFORE-state probe: current image-only sampler vs current superset produce
torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True,
cu_equal=True for fixed_window / start_to_end / random_subsample).
- Permanent test tests/test_c7_sampler_reducer_equality.py (3 tests, all pass):
* image-only sampler == superset(actions=None) torch.equal across all 3
modes and all episode-length regimes (<n, ==n, >n);
* hoisted Algo.compute_losses torch.equal to legacy DFoT.compute_losses on
fixed predictions;
* guard: HNet ratio_loss reducer is NOT reproduced by the base default
(catches a future wrong fold of the superset overrides).
- DFoT 1-epoch pixel-policy smoke (exercises the rewritten packed sampler,
frame_sampling=fixed_window, + the inherited reducer): Train/Loss =
Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT-
IDENTICAL (all 17 digits) to the pre-collapse baseline.
- pytest tests/ = 136 passed / 8 failed / 4 skipped: the 8 failures are the
documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x
missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new
permanent equality tests.
No hydra configs reference the touched methods (no config mirror needed).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…eads/stems/diffusion) + utils
End state: `ls egomimic/models/*.py` shows ONLY __init__.py. Every move is a
git mv (R-status); the 2 splits git-mv the file to its primary home first
(history-preserving) then extract the other-role classes into new files in this
same commit. All moved class bodies are byte-identical modulo import lines
(verified by class-body diff vs HEAD). Every importer + config _target_ updated
in this commit; grep confirms zero remaining old-module dotted refs.
WHOLE-FILE MOVES (git mv):
fm_policy.py -> heads/fm_policy.py (policy output head)
denoising_policy.py -> heads/denoising_policy.py (diffusion policy head base)
denoising_nets.py -> diffusion/denoising_nets.py (legacy diffusion nets)
image_vae.py -> diffusion/image_vae.py (DFoT pixel<->latent codec)
preprocess_pi_obs.py-> utils/preprocess_pi_obs.py (data preprocessing, OUT of models/)
SPLITS (git mv to primary home + extract):
act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv)
+ cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder)
hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet)
+ cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/
MultiheadAttention/SimpleTransformer)
+ heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/
MultiBlockTransformerDecoder)
DEAD-CODE PRUNE (grep-proven 0 external refs):
hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper,
T5Encoder, L2Norm (also drops the heavy timm/transformers T5/ViT imports)
denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj
VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11):
- `import egomimic` OK; every touched module imports OK (pi/rollout fail only on
pre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines).
- py_compile OK for all 10 new/moved files.
- Hydra compose-check passes for every model config whose _target_ moved.
- OLD-ckpt path map appended to scratch/hierarchy_path_map.txt (gates phase folds
it into PORT_NOTES).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
In-place class rename (no file moved): the packed-sequence policy Algo base
was misleadingly named ``HNet`` while it is actually the shared base for the
packed-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree
is supplied via ``outer_stage`` and is a separate, correctly-named concept in
models/hnet). Renamed the class to ``PackedAlgoBase`` to reflect its role.
Changes (all in one commit):
- packed_base.py: ``class HNet(Algo)`` -> ``class PackedAlgoBase(Algo)``;
docstring updated; kept ``HNet = PackedAlgoBase`` compat alias at module
bottom (commented) so OLD ckpts/configs whose resolved _target_ still names
``egomimic.algo.packed_base.HNet`` keep resolving.
- bc.py: import + ``class WindowedBC(PackedAlgoBase)`` + base-referring
docstring/comments updated.
- algo/__init__.py: import-example comment updated.
- 7 model configs (hnet_pushshapes*.yaml): _target_ ->
egomimic.algo.packed_base.PackedAlgoBase.
- scripts/smoke_packed_{training_e2e,validation}.py: import (as HNetAlgo) +
Algo-method docstrings updated.
- tests/{test_c7_sampler_reducer_equality, test_embodiment_key_resolution_shared,
test_training_recipe}.py: import + class refs updated.
OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore /
HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy
(landmine: proven alive), algo/obs_transforms.py (landmine: designed extension
point).
Verified on A40 alloc: `import egomimic` OK; PackedAlgoBase + HNet-alias
identity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet
configs compose with new _target_; old dotted path resolves via compat alias.
Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose
all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring
failures (old constructor signature, missing ``outer_stage``) and ZERO new
failures vs the 136/8/4 baseline.
Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s + dead-file purge
Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete
grep-proven dead files. One commit per the hierarchy-pass group rule; revertible
via tag pre-utils-hier.
RENAMES (git mv, R-status):
utils/timing_callback.py -> pl_utils/callbacks/timing_callback.py
utils/instantiators.py -> pl_utils/instantiators.py
utils/logging_utils.py -> pl_utils/logging_utils.py
utils/rich_utils.py -> pl_utils/rich_utils.py
utils/utils.py -> pl_utils/utils.py
utils/tensor_utils.py -> vendored/robomimic_tensor_utils.py (988-line verbatim
robomimic vendor; +vendored/README.md provenance note)
egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder —
constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*,
CameraTransforms, download_from_huggingface, STD_SCALE):
model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table,
reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK
drawing fns merged into utils/viz_utils.py (dependency FLIPPED — viz_utils now
OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/
ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame,
draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper).
All 11 moved bodies are byte-identical to originals (verified via AST diff
vs HEAD); only import lines differ.
model_utils.py placed in models/cores/ (not loose in models/) to keep the
models-group gate "models/ has only role dirs + __init__.py" intact.
DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/):
utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys,
0 refs), egomimic/__init__.pyc, egomimic/keypoints.jpeg.
Importers + config _target_ updated in this same commit (grep-exhaustive over
egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/):
callbacks/defaults.yaml wandb_profiler._target_ -> pl_utils.callbacks.timing_callback
trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils),
pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py
(model helpers), pushshapes.py / eval_act.py / robot/rollout.py /
data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped.
Added pl_utils/__init__.py (was implicit ns pkg) so find_packages discovers it.
VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched
modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed
(all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped —
ZERO new failures vs baseline; test_config_compose 25/25; parent config composes
and moved callbacks _target_ resolves to the new class.
Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move (R-status rename, byte-identical modulo 2 fixed import lines):
- egomimic/rldb/zarr/zarr_write_test.py -> egomimic/scripts/eva_process/zarr_write_test.py
(HDF5->zarr conversion CLI, not a test; already targets eva_process.
Fixed two stale imports as part of the move:
egomimic.rldb.zarr.ZarrWriter -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty __init__)
egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils (file renamed earlier))
Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/):
- egomimic/rldb/compression_utils.py av/jpeg video codec, no importers
- egomimic/rldb/data_utils.py slerp/ypr quat math, superseded by egomimic.utils.pose_utils
- egomimic/rldb/zarr/benchmark_forward_pass.py dead benchmark script
- egomimic/rldb/zarr/test_zarr.py broken: imports nonexistent egomimic.rldb.utils.S3RLDBDataset
- egomimic/rldb/scripts/ (whole subpackage) nds_pq/str2bool/etc already live in egomimic.utils.egomimicUtils
Verified on a40 alloc: `import egomimic` OK, moved module imports OK,
deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed
(pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed.
No config _target_ referenced any touched path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s sim-glue -> embodiment
Group: eval + pl_utils strays (hierarchy pass).
1) git mv egomimic/pl_utils/test_model_wrapper.py -> tests/test_model_wrapper.py
(R-status rename; 94% similar). tests/ has no __init__.py so pytest imports
it as top-level module test_model_wrapper; updated the in-test _target_
(egomimic.pl_utils.test_model_wrapper.DummyAlgo -> test_model_wrapper.DummyAlgo)
and the two __module__ assertions to match. Behaviour identical: 2 passed +
1 pre-existing fail (lr_scheduler dict-wrap) both before and after the move.
2) SPLIT: extracted the PushShapes sim-eval glue (_env_to_zarr_pushshapes,
_ENV_TO_ZARR, _state_to_init) out of egomimic/eval/core/eval_sim.py (the
algo-agnostic evaluator, which stays as the primary home of the eval classes)
into a NEW embodiment helper module egomimic/rldb/embodiment/pushshapes_sim.py.
The 3 symbols are byte-identical (verified via AST extraction). eval_sim.py
re-imports them so the legacy names (incl. the facade path
egomimic.eval.eval_sim._env_to_zarr_pushshapes / ._state_to_init used by
scripts/verify_*) keep resolving to the SAME objects.
Importers repointed to the canonical home in this same commit:
- egomimic/eval/dfot/eval_dfot_self_rollout.py (_state_to_init)
- scripts/verify_normalization.py / verify_normalization2.py / verify_image.py
(_env_to_zarr_pushshapes)
Verification (a40 alloc): import egomimic OK; pushshapes_sim / eval_sim re-export /
eval_dfot_self_rollout / legacy facade eval_sim all import and resolve to identical
objects; all 4 eval_sim class _target_s resolve via hydra get_class; verify_* scripts
py_compile clean; pytest tests/test_config_compose.py 25/25; pytest tests/ = 138 passed /
9 failed / 4 skipped (baseline 136/8/4 + the now-collected test_model_wrapper 2-pass
+1-pre-existing-fail; zero NEW failures, 8 pre-existing TestAlgoWiring + 1 missing-zarr
unchanged).
Per-folder .py counts: pl_utils 8->7, tests 12->13, rldb/embodiment 4->5,
eval/core 8->8 (SPLIT: eval_sim.py stays; glue fragment extracted to embodiment).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lib + ops homes, dead launchers
Folder-group of the pact-2 hierarchy pass. Every move is a git-mv R-rename
(history preserved); moved code is byte-identical modulo import lines.
Moves (all git mv, R-status):
- 6 regression smokes scripts/test_*.py -> tests/regression/ . Each got a
module-level pytest.skip(allow_module_level=True) guard (the +9 lines) so
pytest collection stays clean -- they hardcode an EgoVerse-clone-3 path and
load configs removed from this repo + need GPU/checkpoints; run manually.
- 7 packed/composite/teacher smokes scripts/smoke_*.py -> tests/regression/
(byte-identical, no test_ prefix so pytest never collects them).
- scripts/smoke_sim_eval.py -> egomimic/eval/core/ckpt_loading.py: it is a
LIBRARY (load_algo_from_ckpt + _MockTrainer, the CLI main() rides along).
Its sibling import rebased to egomimic.eval.core.eval_sim (off the legacy
facade). All 5 importers updated in this commit:
scripts/{tf_dec_overlay,tf_decoupled_eval,tf_chunk_eval,
eval_cfg_latest,eval_fsd_latest}.py
- root setup_nvm.sh, run_eva_docker.sh, pull_models.sh -> scripts/ops/ .
Deletes (with proof):
- scripts/sbatch_train_hnet_fused_{50,80}ep_cosine.sh -- both pass
model=hnet_pushshapes_fused, a config removed from hydra_configs/ (survives
only quarantined under scratch/flat_fused_quarantine/). Dead launchers.
- scripts/__pycache__/ .
Docs touched for accuracy (not load-bearing): egomimic/robot/eva/eva.md
run_eva_docker.sh path; CLAUDE.md smoke-script paths + section heading.
Verify (on a40 alloc 3325804): import egomimic OK; new ckpt_loading path
exposes load_algo_from_ckpt + _MockTrainer; all 5 importers parse; pytest
tests/ --collect-only stays 150 collected, 0 errors (regression dir =
0 collected / 6 skipped); full tests/ = 138 passed / 9 failed / 10 skipped,
identical pass+fail to the pre-change HEAD (the 1 test_model_wrapper failure
+ 7 TestAlgoWiring + 1 missing-zarr all pre-exist), +6 = the guarded smokes.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The hierarchy pass (10b2398) moved test_model_wrapper.py from egomimic/pl_utils/ into tests/, newly subjecting it to `pytest tests/` collection. One assertion was stale: it asserted optimizers["lr_scheduler"] *is* a StepLR, but ModelWrapper.configure_optimizers() returns the Lightning scheduler-config dict {"scheduler": <StepLR>, "interval", "frequency"}. The production contract is correct; the test was written against an older bare- scheduler return. Target the nested ["scheduler"] key. No production behavior changed (debug-the-assertion only). Restores ZERO-new-failures gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ap, deep-clean log DESIGN.md §9: hierarchy contract — role-dirs (cores/heads/stems) vs subsystem-dirs (hnet/diffusion) asymmetry is intentional; root scripts/ (launchers, run-as-script) vs egomimic/scripts/ (importable data CLIs) split; final egomimic/models/ tree (zero loose .py). PORT_NOTES.md: hierarchy-pass record (6 group commits + tags + per-group R-rename counts), the gate-fix for the newly-collected stale test_model_wrapper assertion, final-gate results, and the FULL old->new OLD-ckpt _target_ path map. Also added the deep-clean dead-code-purge record (collapses c4-c7) that the gate-check found missing from the dedup-gate section (which only covered c1-c3). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…oc-gap fix) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…-outer-stage)
COMBINE A. Unify the three DFoT video-rollout evaluators into ONE
family-agnostic DFoTVideoRolloutEval. Each outer stage now owns a
``rollout_video_episode`` hook (the family-specific sampler + decode);
the eval owns the family-INVARIANT skeleton.
Decode-on-outer-stage. The family-specific code is MOVED byte-for-byte
from the evaluators INTO the outer stages:
* ObsActionImageDFoTOuterStage: unconditional chunk/AR bundle sampler,
slice the flat VAE-latent portion, frozen-VAE decode. Single-panel
(t=0 GT prepended). Metric prefix "video".
* ImageSpatialDFoTOuterStage: per-step (state,action) cond, conditional
chunk/AR spatial-latent sampler (optional GT-context anchor), frozen-
VAE decode. Side-by-side [GT|pred]. Prefix "spatial".
* PixelSpatialDFoTOuterStage: sliding-window pixel rollout anchored on
the first GT frame(s), no VAE, + PSNR/SSIM/LPIPS. Side-by-side. Prefix
"pixel".
The unified eval dispatches on three class attrs the outer stage
advertises (video_metric_prefix / video_panel / video_has_extra_metrics)
and owns: packed/padded episode indexing, per-step recon-MSE accumulation,
panel assembly, perceptual-metric averaging, mp4 emission.
eval_dfot_spatial_video_rollout.py + eval_dfot_pixel_video_rollout.py
deleted (scratch backup kept). DFoTSpatialVideoRolloutEval /
DFoTPixelVideoRolloutEval kept as compat aliases of the unified class —
the __init__ is a true superset and every old config now passes its knobs
explicitly, so the aliases are pure _target_ redirects.
_target_ mirrored SAME commit in the live configs that named the old
classes (eval_dfot_image_spatial: +explicit n_context_frames=0;
eval_dfot_pixel: +explicit n_context_frames=1, rollout_window=9 — all were
the old class defaults). eval_dfot_obs_action_image already named the kept
class. The eval/__init__ _MODULE_HOMES legacy-import names + dfot/__init__
re-exports now point at the unified module.
PROOF (per the universal bar, all 3 families). Fixed-seed eval OUTPUT
equality old-vs-new in separate processes: the metrics dict (torch.equal,
incl. PSNR/SSIM/LPIPS) AND the rendered frame tensors (torch.equal) on a
random-weight algo + one real packed batch. PASS for image_spatial,
obs_action_image (BOTH chunk + AR sub-evals), and pixel — bit-identical
(A40 deterministic; verified by an independent rerun). tests/ back to the
8-failed baseline (the 9th, test_touched_eval_modules_import, updated for
the collapse), all 20 evaluator yamls compose+resolve.
Net -222 LOC (940 -> 269 eval LOC; family code now lives on the stages).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…b base Three structural dedups on the DFoT evaluators (egomimic/eval/dfot/), all behaviour-preserving (proven by fixed-seed torch.equal old-vs-new + textual identity of the moved method bodies): 1. POLICY PAIR MERGE. git mv eval_dfot_policy_action.py -> eval_dfot_policy.py and fold DFoTPolicyRecedingHorizonEval (which subclasses DFoTPolicyActionEval and reuses _rollout/_ddim_from_v verbatim) into the same module; delete eval_dfot_policy_receding_horizon.py. _ddim_from_v + _rollout bodies are byte-identical to the pre-combineB sources; the RH compute_metrics_and_viz body is verbatim. Mirrors _target_ in the two policy evaluator configs and the eval/__init__ legacy-import facade + dfot/__init__ exports. 2. SHARED ANCHORED-DDIM HELPER. New eval/dfot/_sampling.py::anchored_ddim_rollout factors the single-tensor sched[:,:n]=clean anchored loop shared by DFoTBundleAnchoredEval._rollout (1D bundle) and ImageSpatialDFoTOuterStage._rollout_latent (5D spatial latent). Both call sites adopt it with shape adapters; proven torch.equal to the verbatim inline loop for both the 1D and 5D shapes. The 2D-policy dual-stream _rollout (co-denoises x_lat + x_act through a dual-output backbone with a hand-rolled v-pred step) is genuinely different and is left duplicated (documented in the helper docstring). 3. KNOB BASE. New eval/dfot/_base.py::DFoTVideoEvalMixin hoists the common knob storage (embodiment_name, image_key, video_subdir/_video_subdir, recon_loss_n_frames, upscale_to, n_chunk_steps) + the video_dir() override shared by bundle_anchored / policy / video_rollout. Each __init__ keeps its own per-class defaults and passes them through store_dfot_knobs explicitly; resolved attributes verified identical to the per-class expected defaults. Proof: scratch/proof_combineB.py (removed post-proof) — 14/14 PASS, all torch.equal with maxdiff 0.00e+00 on A40. Full tests/ = 139 passed / 8 failed (pre-existing, unrelated: test_packed_pipeline + test_training_recipe wiring) / 10 skipped — zero new failures. All three touched evaluator configs compose. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Records the dedup-campaign DFoT-evaluator combine (combines A c73a554 + B 7fde862): rollout trio -> 1 family-agnostic DFoTVideoRolloutEval (decode-on-outer-stage), policy pair merge (eval_dfot_policy, RH subclasses Action, shared _rollout), shared anchored-DDIM helper (_sampling), knob/path mixin (_base). self_rollout untouched (genuinely-different uint8 variant). eval/dfot/ 7-eval-file set: 1784 -> 1278 lines (-506, -28%); 4 per-family modules deleted, 2 reusable helpers added; class count consolidated. Per-file before/after table + the full _target_ map (live + dead-on-disk configs, all resolve via __init__ compat aliases) recorded in PORT_NOTES. Final gates (alloc 3326107, a40 megabot, pact-2 symlinked .venv): - pytest tests/: 139 passed / 8 failed / 10 skipped (8 = pre-existing; ZERO new) - compose sweep: TOTAL_PASS=107 / TOTAL_FAIL=2 (2 = pre-broken viz/pi_cartesian_lang*, NOT DFoT); all 11 DFoT evaluator yamls compose - real eval forward: evaluator=eval_dfot_image_spatial + eval_dfot_pixel each built a REAL DFoT algo (random weights, fixed seed) and ran one compute_metrics_and_viz end-to-end through the unified eval + outer-stage decode hook -> finite metrics (11 / 13), mp4 written (599954 / 288794 B) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
HPTEvalVideo and PIEvalVideo each carried a byte-for-byte equivalent
"apply the revert transform once, then reuse it for both the cam-frame
paired/final MSE and the viz video" block (HPT named the prediction key
main_pred_key, PI named it pred_key -- same f"{embodiment}_{ac_key}"
value). Hoist it into one helper, eval/core/_viz_shared.py
cam_frame_mse_and_viz_batches(...), and have both evaluators delegate.
The per-evaluator-specific metrics (HPT's Frechet / Reverse-KL / aux+shared
heads, PI's native-frame MSE) stay in place; only the genuinely-common cam
block moves. Net -38 lines across the two zoo files.
Proven output-identical: for each evaluator, built from its composed
evaluator config (real viz_func, hydra-composed) with a stub algo + fixed-
seed batch and a deterministic injected transform, compute_metrics_and_viz
run with the pre-combineC class vs the refactored class produces
torch.equal cam_paired_mse_avg / cam_final_mse_avg (both 0.5509150624) and
torch.equal preds_for_viz / gt_batch_viz tensors, with identical metric-key
sets. tests/test_pi.py + tests/test_config_compose.py: 25 passed, 1 skipped
(zero new failures).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… peers
The zoo/ grouping was wrong: HPT and PI are actively-developed algorithms
benchmarked against the H-Net line, not a pen of frozen third-party baselines.
They are first-class peers of bc. Dissolve algo/zoo/ and the mirrored eval/zoo/
into per-algo folders; algo/ and eval/ mirror each other.
Moves (git mv, R-status renames):
egomimic/algo/zoo/hpt.py -> egomimic/algo/hpt/hpt.py
egomimic/algo/zoo/pi.py -> egomimic/algo/pi/pi.py
egomimic/algo/zoo/act.py -> egomimic/algo/act/act.py
egomimic/eval/zoo/eval_hpt.py -> egomimic/eval/hpt/eval_hpt.py
egomimic/eval/zoo/eval_pi.py -> egomimic/eval/pi/eval_pi.py
egomimic/eval/zoo/eval_act.py -> egomimic/eval/act/eval_act.py
Each new folder gets an __init__.py re-exporting its public class
(egomimic.algo.hpt.HPT/HPTModel, egomimic.algo.pi.PI, egomimic.algo.act.ACT/
ACTModel; egomimic.eval.{hpt,pi,act}.<...>EvalVideo). The two zoo/__init__.py
are git rm'd; the algo/zoo PEP-562 lazy-PI shim is gone but PI laziness is
preserved (top-level egomimic.algo never imports egomimic.algo.pi eagerly).
Mirrored in the SAME commit: 16 yaml _target_s (12 hpt model configs, act.yaml,
pi0.5_base.yaml, eval_hpt.yaml, eval_pi.yaml) + every importer (eval/__init__.py
_MODULE_HOMES facade, algo/__init__.py doc comments, tests/test_pi.py).
Principle (DESIGN.md §9.4): all algorithms are first-class peers under algo/,
each its own folder when it may grow; bc stays a single flat module-home (left
untouched here); eval mirrors algo; the shared spine (algo.py/PackedAlgoBase,
packed_base, loss, outer_stage, obs_transforms, packed_outer_stage) stays flat
at algo/ top — not zoo, not bc-specific. Shared eval helper
eval/core/_viz_shared.cam_frame_mse_and_viz_batches stays in eval/core/
(import path unchanged). Old-ckpt _target_ resolution: scratch/
hierarchy_path_map.txt (ZOO DISSOLVE block) + PORT_NOTES.md.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… -> */algo.py, drop loss/packed_base/outer_stage shims) — safety snapshot before cotrain+eval port
…ndEncoder + inner_working_dim polymorphism - models/hnet/per_embodiment_stage.py: per-emb outer stages, shared inner trunk; dispatch by ctx.embodiment_id (ported from elmo/hnet-cotrain-circle-stick, import remapped to models.hnet) - models/hnet/stages.py: inner_working_dim property (base=input_dim, ChunkerStage=output_dim) for polymorphic dim-handoff - models/hnet/hnet.py: replace isinstance(ChunkerStage) dim-check with stages[i].inner_working_dim (-9 lines) - models/hnet/context.py: add embodiment_id field - models/stems/cond_encoders.py: MultiEmbodimentCondEncoder + ignored embodiment_id kwarg on CondEncoderModule.encode - algo/hnet/algo.py: thread embodiment_id + domain_by_id reverse map through HNetOuterStage/PackedAlgoBase forward paths Gate: hnet regression 57 passed/2 skipped; PerEmbodimentStage construct+dispatch+guard smoke green.
…override resolver - embodiment.py: PUSHSHAPES_SIM_STICK = 16 (resolves by name; no per-emb handler needed) - zarr_dataset_multi.py: LocalEpisodeResolverWithEmbodimentOverride — re-tags shared-metadata zarrs to a config-supplied embodiment so circle/stick dispatch apart - schedulers.py.piecewise_linear already present in pact-2 (skipped) Gate: stick id=16, circle id=15, resolver imports, viz_gt_preds present.
… skynet paths) - model/hnet_pushshapes_cotrain.yaml: PackedAlgoBase + HNetOuterStage with MultiEmbodimentCondEncoder (per-emb circle/stick) and 2-level PerEmbodimentStage chain (per-emb EncDec+Chunker r=8 outer, shared EncDec+Chunker r=4+Compute inner). _target_ paths remapped to models.hnet/models.stems; algo.hnet.HNet alias preserved. No recipe knobs (single LR, source-faithful); comment points to warmup_cosine for fresh launches. - data/tsimulation_cotrain.yaml: circle_750 + stick_312 on skynet; stick via LocalEpisodeResolverWithEmbodimentOverride. chunking=sequential (stick has a 1068-frame episode > max_seq_len=1024). Gate: full trainHydra smoke on L40S — compose + load both datasets + norm_stats + construct 15.6M-param model + 2-step forward_training, clean exit.
… cross-emb composite - probes/eval_boundary_strip.py: horizontal multi-stage render (Stage k labels, time on x-axis, per-emb keys, frame-level upsample); threads embodiment_id via domain_by_id - probes/eval_pca_tokens.py: embodiment threading + n_extra_train_episodes train mix-in - core/eval_composite.py: EvalVideoList below_indices (vstack boundary strip below traj+PCA row) - core/eval_combined_rows.py (new): CombinedRowsEval — vstack per-emb composites into one 2-row video - evaluator/eval_hnet_pairs.yaml + _combined.yaml: wire the cross-emb composite (eval_hnet/pca/boundary), paths remapped to core/probes - eval_hnet.py unchanged (already had per-emb viz_func) Gate: mode=eval on cotrain model rendered per-emb composites (traj + PCA + horizontal Stage0/Stage1 boundary strip below), verified visually.
…ndowedBC mimic, full-episode TF) Mimics WindowedBC on the HNet algo with FULL-EPISODE teacher forcing: - algo/hnet/algo.py: HNetOuterStage action_head_type=gmm (pre-instantiated GMMActionHead); decode() stashes head on ctx.extras. New GMMLoss (peer of HNetLoss): builds per-obs-step 8-action chunked targets (repeat-pad at episode ends, packed+padded) and computes GMM-NLL via head.nll + ratio loss. - embodiment.py: PUSHSHAPES_SIM_SMALL_CIRCLE = 17. - model/hnet_cotrain_gmm_obs.yaml: ObsToken (ResNet VisualCore + proprio) obs-as-input, no-AdaLN (cond_key=null), per-emb outer ChunkerStage -> shared inner ChunkerStage -> ComputeStage, GMM head chunk_len=8, GMMLoss, cotrain circle+small_circle. - data/tsimulation_cotrain_small_circle.yaml: full-episode TF (chunking=none); small_circle PLACEHOLDER at circle path via embodiment_override. Gate: trainHydra smoke on L40S — construct 15.4M-param model + 2-step forward_training with GMMLoss, clean exit.
…it by filter) Found the small-circle data: /coc/flash7/paphiwetsa3/datasets/circle_co_big_small holds 953 big-circle + 955 small-circle episodes, BOTH tagged embodiment=pushshapes_sim, distinguished only by task_description.env_args.pusher_shape (circle vs circle_small). Split by episode-name DatasetFilter: - pushshapes_sim: folder=circle_co_big_small, filter "circle_small not in episode_hash" (905 big) - pushshapes_sim_small_circle: same folder, filter "circle_small in episode_hash" + embodiment_override (955 small) No more placeholder. max_seq_len 1024->2560 (longest small episode = 2255 frames; full-episode TF), batch_size 1 (long episodes + ResNet), model action_horizon 1024->2560. Gate: trainHydra smoke on L40S — both filtered splits load, norm_stats, 15.4M-param construct, 2-step forward_training with GMMLoss, clean exit.
Make val-overlay + closed-loop sim work with the chunked GMM head: - HNetOuterStage._decode_pred_actions: eval bridges (forward_padded/packed) decode GMM params -> sample (low-noise) -> chunk pos-0 point action per frame for the teacher-forced overlay. - inference_step: open-loop action-chunk buffer (mirrors WindowedBCPolicy.step queue) -- on an obs-step decode the chunk_len actions, dispense one/frame with NO model call between, re-observe every chunk_len frames (obs_stride==chunk_len). Non-GMM unchanged (one action/frame). - HNetOuterStage.step: thread embodiment_id onto the per-step ctx so PerEmbodimentStage dispatches in the AR cache. - PerEmbodimentStage._allocate: pact-2 AR-cache scheme (per-emb sub-stage cache; only active emb advanced). Gate (untrained, L40S): val overlay renders both embs + actions_paired_mse computed; sim eval rolls out both embs (coverage 0.0 untrained) -- both paths clean.
Obs-input H-Net GMM cotrain model for the PushShapes circle/small-circle campaign: - first_token causal chunker with grab_prev_end (concat[first-token, prev-chunk-end] -> MLP) + duplicate upsampler + residual mixer (stages.py, residual_mixer.py) - per-embodiment GMM heads (gmm_heads dict) + per-emb obs encoder (embodiment_id threaded through ObsToken; MultiEmbodimentCondEncoder d_cond) so only the high-level trunk is shared (algo.py, stems/) - obs_stride arch (strided token stream + tiled chunk targets) + fp32-matched sim rollout dtype (algo.py) - ratio-loss scheduler callback + GMM / first-end model+data configs - scan_interface + circle_small sim-env support Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV
Eval and visualization tooling for the GMM cotrain models: - boundary-strip / PCA-token probes, keyframe + boundary viz, overlay loader, sim-eval + ckpt-loading updates (eval/core, eval/probes) - chunkviz explorer: export.py (per-frame chunk ids + cross-emb shared PCA with embodiment-removal variants raw / mean-diff / LDA), build_html.py (self-contained HTML), viewer_template.html (multi-model dropdown + single/compare + PCA-removal toggle), serve / nocache_server - GMM evaluator configs (circle / cotrain / smallcircle / firstend) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01TYmG3nhxt7LYiaaSJPsKEV
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

tsim: sim env + viz/scripts + physics tuning
Squash of:
dataset: tsim training configs + HPT keymap/viz/eval + packed dataloading
Squash of:
hnet: flexible stage refactor + packed training + viz/infra
Squash of 10 commits from temp-arch-flexible: 7faf2012, 49ed0d34, 8387986e, 11a266ce, 3fe9a353, 0ea9d013, c83b6e69, 63a2e852, 7b71650a, a063c021
HNet packed training: test suite (86 tests)
stages (padded + packed), HNet assembly, ratio_loss, chunk_stats,
STE, RMSNorm, AdaLN.
packed; _iter_leaves descent; multi-frame JPEG decode; end-to-end
packed stats collection.
init_weights height-scaled init, apply_lr_multiplier per-stage
stamping, parameter_groups (default, with bias/norm WD=0,
per-stage groups, AdamW-consumable). Plus algo wiring tests for
the opt-in init_weights_range / lr_multipliers /
use_parameter_groups / weight_decay kwargs.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
wip: uncommitted edits from EgoVerse7 (active work-in-progress)
Imports from EgoVerse7@temp-arch-flexible working tree as of 2026-05-20.
Includes:
Excluded: egomimic/algo/hnet.py.bak.preinput (manual backup) and drift_eval_out_* (eval artifacts).
dfot: v1 algo + Isotropic backbone w/ per-token AdaLN + continuous/discrete diffusion + DDPM/DDIM samplers
dfot: fix inference_step obs shaping (don't unsqueeze; drop dead ac_keys fallback)
dfot: packed-mode training + eval (cu_seqlens-aware backbone forward, per-frame obs cond)
dfot: causal-AR staircase sampler + schedule-matrix sampler + online rollout helper
dfot: PACE sbatch for 80ep packed training on PushShapes circle/basic
dfot: 20x training budget (12800 steps over 80 ep), --time 8h
dfot: full 750-ep training (200 ep, 32000 steps, tsimulation_full.yaml); drop 400ep config
dfot eval: val-data evaluator with full-chunk + staircase-AR overlay viz
dfot: relaunch sbatch with eval_dfot_val (teacher-forced viz) + seq_lens defensive fix
dfot refactor: unify sampling, fix AR _ar_pred, AR inference_step, PackedSimEval rename, composite eval
dfot eval: fix HNetSimEval->PackedSimEval comment reference
dfot sampling: split sample_step (primitive) from sample (loop wrapper); supports growing-T AR
dfot: inline AR rollout state into DFoT; delete CausalARRollout class
dfot algo: ar_inference_step_size knob, CFG plumbing, AdaLN dropouts
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
dfot configs+viz: model scaling, CFG, attn_dropout callback, 512x512 val viz
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
dfot scripts: 200/400ep sbatch variants + eval CLI tools
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
outer_stage + loss: abstract base classes for OuterStage/Loss refactor
Introduces the abstract bases for the upcoming refactor:
egomimic/algo/outer_stage.py: OuterStage base. Owns an
inner_stage field (the trunk). Subclasses implement encode
(raw batch -> trunk-input tensor; can sample noise / record state
on ctx) and decode (trunk-output -> per-modality prediction keys
on batch).
egomimic/algo/loss.py: Loss base + CompositeLoss (weighted
sum of terms) + MSELoss (per-modality MSE between pred_key and
target_key). Loss policy becomes data — a hydra config block —
rather than inheritance.
No algorithm code uses these yet; HNetOuterStage / DFoTOuterStage and
the algo.hnet.HNet / algo.dfot.DFoT refactors come in follow-up commits
on this branch.
continuous_diffusion: split forward into q_sample + compute_loss
Foundational refactor for the upcoming DFoTOuterStage + DFoTLoss
classes:
q_sample(x, t) -> dict: forward-noising step. Returns x_t, noise,
alpha_t, sigma_t, logsnr, and the precond_scale * logsnr time_cond
the backbone consumes. No backbone call inside.
compute_loss(v_pred, q_state) -> per-token weighted MSE: takes the
dict from q_sample plus the backbones v_pred and computes the
SNR-weighted epsilon-MSE.
forward(backbone, x, t, cond): kept as a back-compat wrapper that
calls q_sample, runs the backbone, then compute_loss. Existing
callers (DFoT.forward_training) are unaffected.
Bitwise verified equivalent to the prior single-method path via
the included /tmp/test_diffusion_split.py smoke (loss + x_pred match
exactly, same random seed).
discrete_diffusion.py is unsplit for now; current configs use
continuous.
dfot outer_stage + DFoTLoss: training-path classes (loss equivalence verified)
egomimic/algo/dfot/outer_stage.py: DFoTOuterStage subclass.
encode: encode obs to per-token cond, sample noise levels, run
diffusion.q_sample, store q_state + external_cond on ctx, return
noisy x_t. decode: write batch[pred_v] for the loss to read.
forward override threads cu_seqlens/max_seqlen (packed mode) and
time_cond into the backbone call.
egomimic/algo/loss.py: DFoTLoss class. Reads batch[pred_v] and
ctx.q_state, calls diffusion.compute_loss (SNR-weighted eps-MSE),
reduces to scalar.
Bitwise verified via /tmp/test_dfot_outer_stage.py: padded-mode loss
through DFoTOuterStage + DFoTLoss matches DFoT.forward_training to
0.0e+00 difference at fixed seed. Real CondEncoderModule +
DFoTBackbone + ContinuousDiffusion submodules; no mocks of the math
path.
Algo class (DFoT.forward_training) is NOT yet wired to use these —
that comes in the next commit on this branch. Inference paths
(closed-loop AR sample_step, chunk plan-execute) also deferred.
dfot algo + yaml: refactor to outer_stage + loss
Algo class:
loss: Loss (auto-built as DFoTLoss(outer_stage.diffusion) if None).
diffusion_kwargs, cond_output_key args — they now live on the
outer_stage subblock.
diffusion, outer_stage, loss so existing inference paths
(_inference_step_ar, _inference_step_chunk, _sample_chunk,
forward_eval) keep working unchanged via property forwarding.
outer_stage(batch, ctx), call loss(batch, ctx). No more inline
diffusion math; no more cu_seqlens threading at this level.
dfot_pushshapes.yaml:
diffusion module is now its own target inside outer_stage.
End-to-end smoke (scripts/test_dfot_refactor_e2e.py) verifies the
config instantiates via hydra and forward_training emits a finite
scalar loss. Bitwise loss-equivalence was already shown in the prior
commit (test_dfot_outer_stage.py).
Old checkpoints WILL NOT load — state_dict keys moved from
nets.{cond_encoder,backbone}.* to nets.outer_stage.{cond_encoder,inner_stage}.*
This is intentional per the agreed clean-break refactor.
dfot inference smoke: verify AR + chunk paths after outer_stage refactor
Adds scripts/test_dfot_inference.py: instantiates the refactored DFoT
from dfot_pushshapes.yaml, runs inference_step in both ar and chunk
modes, asserts action is (action_dim,) and finite. Verifies the
@Property accessors (self.backbone, self.cond_encoder, self.diffusion)
forward correctly to outer_stage submodules so the closed-loop AR
and chunk-mode inference paths keep working after the refactor.
Passing on compute node 8997316:
[ar] action @ t=0: [0.30 0.47]
[ar] action @ t=1: [0.44 1.06]
[chunk] action @ t=0: [-0.78 -1.89]
hnet_outer_stage + HNetLoss: H-Net OuterStage subclass (bitwise verified)
egomimic/algo/hnet_outer_stage.py: HNetOuterStage class. Inherits
from OuterStage with inner_stage = HNetCore (stage tree). Owns
cond_encoder, input_modules (summed per-token contributions),
action_out head. Three forward paths inherited from the old
HNetPolicy pattern: forward(batch, ctx) dispatcher (padded/packed),
generate (offline AR), init_step_state + step (online single-tick).
egomimic/algo/loss.py: HNetLoss class. Reads batch[pred_action] +
batch[actions], adds per-chunker ratio_loss_from_aux from ctx.aux.
scripts/test_hnet_outer_stage.py: equivalence smoke. Instantiates
the existing hnet_pushshapes.yaml subcomponents, wraps the SAME
instances in HNetPolicy and HNetOuterStage, ties their action_out
heads, runs identical padded forward. Verified bitwise (max diff
0.00e+00 at fixed seed) on H200 alloc 8989249. Also smoke-tests
the step inference path (shape + finite).
The old HNetPolicy class is still in egomimic/algo/hnet.py and not yet
removed. Algo-class refactor and yaml updates come in follow-up
commits on this branch.
Old checkpoints will NOT load — state_dict keys move from policy.*
to outer_stage.* (or wherever the algo class places it). Per clean-
break policy.
hnet algo + yaml: refactor to outer_stage + loss (base config)
Algo class:
instead of cond_encoder + hnet + action_dim + action_horizon +
d_model + action_head_type + input_modules. action_horizon read from
outer_stage.
(self.nets[policy], self.nets[cond_encoder], ...) are exposed via
@Property forwarding to outer_stage submodules so legacy callsites
in forward_eval / _teacher_forced_packed / _ar_rollout_packed / step
inference keep working.
ctx.ratio_loss) into the predictions dict for logging.
HNetLoss:
per-term split on ctx for the algo to log separately.
HNetOuterStage:
forward_packed(actions, obs, cu, msl) returning (pred, aux). The
forward_eval / _teacher_forced_packed paths use these; .generate /
.step / .init_step_state already had matching signatures.
hnet_pushshapes.yaml:
input_modules + action_head_type. Top-level keeps training-recipe
knobs (init_weights_range, lr_multipliers, ...) and embodiment
wiring.
scripts/test_hnet_refactor_e2e.py: packed-mode forward_training smoke.
Verified passing on H200 alloc 8989249 — produces action_loss 1.51 +
ratio_loss 0.032 + chunker stats for a 2-episode packed batch
(T=12+20). Padded mode hits a pre-existing torch SDPA error
(Explicit attn_mask should not be set when is_causal=True) in
train mode — this is in the trunk code, not introduced by the
refactor (the production training uses packed mode and never hits
the padded-train path).
Old HNetPolicy class is still in algo/hnet.py for now (no longer
used by HNet algo); will be removed in a cleanup commit once all
stage-based + flat yamls are migrated.
hnet yamls: migrate remaining 6 stage-based configs to outer_stage schema
Same outer_stage block pattern as the base hnet_pushshapes.yaml, applied
to the variant configs. Each yaml moves cond_encoder + hnet stage tree
training-recipe knobs + embodiment wiring at the top level.
Configs migrated:
scripts/test_hnet_yamls_load.py: batch instantiate smoke. Verified
on H200 alloc 8989249 — all 7 stage-based yamls (base + 6 variants)
instantiate from hydra config and produce sensible param counts
(5.5M baseline up to 42.8M obs_ar_large).
flat_fused_outer_stage + HNetFused thin alias + 3 flat yamls migrated
egomimic/algo/flat_fused_outer_stage.py: FlatFusedOuterStage class.
Structurally a rename of FlatFusedPolicy with OuterStage inheritance
plus an OuterStage forward(batch, ctx) dispatcher delegating to the
existing forward_padded / forward_packed. Legacy generate / step /
init_step_state preserved verbatim. encode / decode raise
NotImplementedError since the interleaved 2T-token flow does not
cleanly split along encode -> trunk -> decode.
egomimic/algo/hnet.py: HNetFused is now a thin pass-through subclass
of HNet, kept as a separate target for the existing flat yamls.
All flat-fused behavior moved into FlatFusedOuterStage; HNet.init
already tolerates outer_stage.inner_stage=None.
3 flat yamls migrated to outer_stage schema:
hnet_pushshapes_fused.yaml, hnet_pushshapes_fused_lowlr.yaml,
hnet_pushshapes_fused_pusher.yaml.
scripts/test_hnet_yamls_load.py extended to cover all 10 H-Net
configs. Verified on H200 alloc 8989249: 10/10 instantiate
successfully (7 HNetOuterStage + 3 FlatFusedOuterStage). Param
counts sensible.
Old FlatFusedPolicy class still in algo/hnet.py for now; cleanup of
unused legacy classes (HNetPolicy, FlatFusedPolicy) is a follow-up.
forward_training smoke for all 10 H-Net yamls + mamba regression check
eval: migrate algo.nets[] accesses to property accessors (refactor follow-up)
PACT: VAE + obs+action+image DFoT + bundle-aware evals + bcrnn
Joint state+vae_latent+action diffusion forcing.
DiT3D video diffusion: all fixes + VAE v6 config
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
WIP checkpoint: obs-action DFoT 2D policy + spatial_rh sim controller + new evals (pre-hnet-variants merge)
snapshot_pre-restack_2026-05-31_decouple_spatial_tf_evals
pre-restructure snapshot (DESIGN.md step 0)
restructure: sweep root debris -> scratch/ (DESIGN.md step 1)
Move WIP-session / sibling-repo dead weight out of repo root into a
gitignored scratch/ archive (MOVES not deletes; tracked files via git mv
so history is preserved at old paths). Adds scratch/MANIFEST.md and
ignores /scratch/.
Swept (67 files):
Deviation from DESIGN literal counts (40 .sh / 9 png/mp4): 5 of those are
original-repo files, not sibling debris, so KEPT in root:
pull_models.sh, run_eva_docker.sh, setup_nvm.sh (infra, git-added 2025),
convention.png, mano_keypoints.png (embedded in CONTRIBUTING_DATA.md).
Design's stated intent ("sibling-repo dead weight") preserved exactly.
See scratch/MANIFEST.md.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: collapse H-Net to models/hnet + flip hnet_core import (DESIGN steps 3-4)
Step 3 (collapse H-Net):
tree: cross-attn + residual_scale + causal_conv1d + adaln_per_token).
dup; its config/context/routing were byte-identical to the superset, so git
attributes those 3 as renames into models/hnet).
egomimic.models.hnet_nets.X -> egomimic.models.hnet.X.
that aliases each models.hnet. into sys.modules under the legacy
hnet_nets. key (so both top-level and submodule-path imports -- incl.
private symbols the tests import -- resolve to the SAME live module object)
and re-exports the top-level names. Keeps all legacy import paths alive
until the step-13 flip.
Step 4 (flip hnet_core import):
bc_rnn_nets._hnet_vendored.{context,hnet,stages} -> egomimic.models.hnet.*.
Superset extra flags (cross-attn / AdaLN / window) all default OFF, so the
obs-only HNetCore never touches the diverged paths.
Verification (A40, fixed-seed HNetCore forward, baseline captured in a
worktree at the pre-step-3 commit using _hnet_vendored vs post-flip using
models.hnet):
772f6f..f6fe005 match exactly pre/post; 12,163,840 params / 121 tensors.
bc_rnn, act/hpt, both callbacks) via the shim.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: resolve BC-RNN algos -> flat algo/bc.py + WindowedBC (DESIGN step 5, amended)
Amended step 5 (user override of DESIGN's algo/bc/ package): the active BC
algo stays ONE FLAT FILE -- no package, no algo/bc/ directory. Backbone
(lstm/transformer/hnet) switching stays purely config-side via the existing
core_net knob.
Changes:
preserved).
names kept as module-level aliases (BCRNN = WindowedBC, BCRNNPolicy =
WindowedBCPolicy) -> same class objects, so isinstance/pickle/Hydra resolve
unchanged. Internal instantiation + name-bearing error/doc strings updated.
from egomimic.algo.bc (BCRNN/BCRNNPolicy/WindowedBC/WindowedBCPolicy +
_cut_windows/_cut_windows_strided/_pack_to_padded), so the legacy import path
and any target: egomimic.algo.bc_rnn.BCRNN config keep resolving.
(old egomimic.algo.bc_rnn.BCRNN still works via shim + alias).
history preserved): egomimic/algo/bcrnn/{init,algo,outer_stage}.py +
its only config egomimic/hydra_configs/model/bcrnn_pushshapes.yaml ->
scratch/algo_bcrnn/. That dup is a separate robomimic-BC_RNN reimpl on the
Algo/OuterStage spine, not wired into the kept pipeline, with a stale config
(references pre-collapse egomimic.models.hnet_nets.* paths). Logged in
scratch/MANIFEST.md with a REQUEST-DELETE entry (user's call; not auto-deleted).
Verification (a40 compute node, sibling .venv):
WindowedBC is an HNet subclass; aliases are object-identical; shim re-exports
the same objects.
egomimic.algo.bc.WindowedBC.
AND NEW target (egomimic.algo.bc.WindowedBC) under a fixed seed:
state_dict torch.equal across all 137 tensors (23,511,752 params).
non-ignored files.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: relocate bc_rnn_nets to role homes + tests suite (DESIGN steps 6, 6.5)
DESIGN.md step 6: git mv the bc_rnn_nets members to their role homes and keep a
bc_rnn_nets/init FACADE re-exporting everything from the new homes (the old
import paths + yaml target submodule paths stay alive until step 13).
models/stems/ obs_encoder.py, visual_core.py
models/cores/ lstm_core.py, transformer_core.py, hnet_core.py
models/heads/ gmm_head.py, query_decoder.py
All 7 moves are R100 (pure git mv, content byte-identical). The facade aliases
each legacy submodule into sys.modules under egomimic.models.bc_rnn_nets.
(same mechanism as the step-3 hnet_nets shim) so package-name imports, submodule
imports, and yaml target paths all resolve to the SAME role-home module
objects. Removed the leftover empty _hnet_vendored/ dir from the step-3 collapse.
Flipped the 7 BC-RNN configs' _target_s to the role paths (ObsEncoder ->
stems.obs_encoder, visual_core.VisualCore -> stems.visual_core, LSTMCore ->
cores.lstm_core, TransformerCore -> cores.transformer_core, HNetCore ->
cores.hnet_core, GMMActionHead -> heads.gmm_head, QueryActionDecoder ->
heads.query_decoder).
DESIGN.md amendment 6.5: distill the session proof patterns into a pytest suite
(GPU-alloc runnable; all forces CPU for determinism):
tests/test_core_defaults_byte_identical.py -- lstm/tx/hnet construct +
torch.equal across two fixed-seed builds + match committed ref fingerprints.
tests/test_causality.py -- TX + HNet prefix-consistency; TX future-perturb
no-leak; query-decoder future-perturbation EXACT-ZERO (torch.equal).
tests/test_train_rollout_parity.py -- forward vs sequential step() for all 3
cores + the chunk8 query-decoder queue replay.
tests/test_config_compose.py -- all 7 BC-RNN (+ legacy-path assertion) + 13
dfot + 5 vae configs compose through train_zarr_cartesian.
Verification (a40 alloc 3325503): 40/40 new tests GREEN; 57/57 test_hnet_nets
GREEN; all 7 BC-RNN configs --cfg job compose; import egomimic + algo.bc +
models.hnet + hnet_nets shim + algo.dfot clean. The 8 reds in the wider suite
(test_training_recipe::TestAlgoWiring algo.hnet outer_stage signature drift;
test_packed_pipeline missing on-disk zarr) are pre-existing -- reproduced
identically at the step-5 commit ccff845, untouched by this move.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: relocate DFoT pieces -> models/diffusion + algo/diffusion (DESIGN step 7)
Split egomimic/algo/dfot into its model + algo halves via git mv:
backbones/{backbone,dit3d_backbone,spatial_backbone}
diffusion/{continuous_diffusion,discrete_diffusion,noise_schedule}
embeddings.py, sampling.py
algo.py, outer_stages/{outer_stage + 9 *_outer_stage}
vae_algo.py (was egomimic/algo/vae/algo.py)
Intra-tree imports rewritten to the new role homes (backbones import
models.diffusion.embeddings + models.hnet.isotropic_builder; algo imports
models.diffusion.{backbones,diffusion,sampling} + models.hnet.cond_encoders).
algo/dfot/init and algo/vae/init kept ALIVE as thin facades: every
legacy egomimic.algo.dfot. / egomimic.algo.vae.algo dotted path is
registered in sys.modules pointing at the real relocated module, so the yaml
_target_s (algo.dfot.DFoT, algo.dfot.outer_stage.DFoTOuterStage,
algo.dfot.{continuous,discrete}_diffusion.*, algo.vae.VAE) all still resolve.
Shim identity verified (algo.dfot.DFoT is algo.diffusion.DFoT).
Verify: 25/25 test_config_compose (13 DFoT + 5 VAE + 7 BC-RNN); import
egomimic + DFoT/VAE relocation imports clean.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: home zoo + curate eval into role buckets (DESIGN step 8)
Zoo (git mv, no behaviour change):
yaml _target_s (egomimic.algo.act.ACT / .hpt.HPT / .pi.PI) still resolve.
out of the tests/ suite, as before: it requires openpi).
Eval curated into egomimic/eval/{core,tf,dfot,probes,zoo}/ (git mv):
Inter-eval imports rewritten to the bucketed paths; the dfot evals' imports of
the DFoT model pieces flipped to canonical egomimic.models.diffusion.* (off
the algo.dfot shim).
EDITED the ~20 evaluator-yaml _target_s DIRECTLY to the bucketed eval paths
(DESIGN warns target resolution is weaker through init shims), e.g.
egomimic.eval.eval_sim.PackedSimEval -> egomimic.eval.core.eval_sim.PackedSimEval.
eval/init kept ALIVE as a facade: every legacy egomimic.eval.eval_
PYTHON import path is sys.modules-aliased to its bucketed module, so the code
consumers (trainHydra: egomimic.eval.eval.Eval; scripts/ smoke+verify helpers)
keep working until the final flip (step 13).
Verify: 25/25 test_config_compose; 7 BC-RNN compose via hydra --cfg job;
20/20 evaluator yamls compose + every egomimic.eval.* target resolves;
import egomimic + DFoT/zoo/eval-bucket imports clean; full suite unchanged
from baseline (8 pre-existing fails: 7 TestAlgoWiring sig drift + 1 data-missing
packed_pipeline; no new failures).
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
restructure: close BC-RNN sim-eval gap (DESIGN steps 9-10)
Step 9 (get_keymap_eval):
plus a goal_pose passthrough keyed key_type="goal_keys". goal_keys is NOT in
MultiDataset.NORMALIZE_KEY_TYPES, so goal_pose is read into the packed batch
raw/un-normalized and passed straight through to PackedSimEval, which reads
batch["goal_pose"] in batch_to_env_init to set the env goal. Serves both
circle proxies. The 7 BC-RNN launchers already point KM at this symbol.
pact-2's eval uses init_mode="replay" and never reads init_action (verified: no
reference in egomimic/eval or egomimic/algo), so it would be dead weight. The
~17-line design target counted EV2's init_action; documented inline.
Step 10 (close sim-eval gap):
inference_step(obs_zarr, t, emb_id, T_max=self.max_steps) (eval_sim.py:251) no
longer TypeErrors. T_max is the sim rollout horizon, a different quantity from
the policy's action-queue length, so the internal init_step_state buffer is
still sized from policy.action_horizon; T_max is accepted/tolerated to match
the eval contract (matches DFoT.inference_step, which already takes T_max).
launchers. eval_hnet_sim.yaml (HNetSimEval/PackedSimEval) has no rollout_mode
key and drives AR natively via the per-token inference_step, so the override
raised ConfigAttributeError. No other eval-only override (delta_action/
temporal_ensemble/chunk_k/goal_in_obs) remains in the launchers.
headline). WindowedBC.init calls Algo.init (not HNet.init), so
the inherited HNet.process_batch_for_training (hnet.py:889) hit
AttributeError: 'WindowedBC' object has no attribute 'train_obs_transforms'
on BOTH the train and the validation paths, before any rollout. Initialize
self.train_obs_transforms = [] in WindowedBC.init: the empty list makes
the
if self.train_obs_transforms and self.outer_stage.trainingguardshort-circuit, also sidestepping the (nonexistent) outer_stage. WindowedBC
has no train-only obs augmentation, so [] is the correct value. Pre-existing
latent bug from the H-Net restructure (steps 3-8); surfaced only now because
this is the first time BC-RNN sim eval actually runs.
Verification:
get_keymap_eval KM and no rollout_mode=ar.
WindowedBC/BCRNN alias intact.
goal_pose].
bc_rnn_pushshapes_paperexact, max_steps=8, 1 val batch, random-init weights)
ran end-to-end on an A40 and LOGGED A COVERAGE NUMBER for the first time:
Valid/emb15_sim_coverage = 0.0
Valid/emb15_sim_success_rate = 0.0
EVAL_EXIT=0. Coverage 0.0 is expected for an untrained model; the point is
the eval stack now runs the full loop through inference_step(...,T_max=) and
the goal_pose passthrough. (mode=eval used to bypass the unrelated training
loop; the WindowedBC fix above was required to get past validation_step.)
test_packed_pipeline full-pipeline-stats) are PRE-EXISTING at HEAD (proven by
re-running with these changes stashed) - HNet.init now requires an
outer_stage arg the older fixtures don't pass. Tracked separately.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
structural fixes: input_modules->stems, packed_base rename, flat_fused quarantine
PHASE 1 structural fixes (post-DESIGN step 10), all pure moves + compat shims:
git mv egomimic/algo/input_modules.py -> egomimic/models/stems/input_modules.py
(DESIGN stems role home). Fixed its internal import to the canonical
post-collapse home models.hnet.cond_encoders (was models.hnet_nets.*).
Compat shim left at algo/input_modules.py re-exporting all 3 classes; updated
direct importers (algo/packed_base.py, algo/hnet_outer_stage.py) and the two
obs_ar config target paths to the new home.
git mv egomimic/algo/zoo/test_pi.py -> tests/test_pi.py and guarded it with
pytest.importorskip(openpi) so it SKIPS cleanly (was a collection ERROR; the
PI algo needs the optional openpi pkg, absent in the default venv).
Quarantined dormant B-family flat-fused legacy -> scratch/flat_fused_quarantine/
(flat_fused_outer_stage.py + 3 hnet_pushshapes_fused*.yaml), unreferenced by
pact-2 mission. HNetFused stays as dormant dead code in packed_base.py;
MANIFEST.md + REQUEST-DELETE entry added.
git mv egomimic/algo/hnet.py -> egomimic/algo/packed_base.py (role-clarifying:
per-emb-norm + packed-path base, NOT the models/hnet/ stage tree). Class names
unchanged (HNet stays HNet). Compat shim at algo/hnet.py re-exports the full
surface; updated direct importers (bc.py, test_training_recipe.py). Configs
keep using egomimic.algo.hnet.* via the shim.
bc_rnn.py shim untouched (step 13). Verified: import smoke (shim identity ==
canonical), 7 hnet configs compose, tests/ at baseline (no new failures;
test_pi now skips cleanly).
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
step 13: final flip — shims deleted, configs mirrored, dormant purge, packed_outer_stage rename
DESIGN.md step 13 + amendments A & B. algo/ END-STATE: no shims, no dormant
code, honest names everywhere.
across all hydra configs + non-config importers (66 mechanical + 9 manual).
algo.hnet.* -> algo.packed_base.*
algo.bc_rnn.* -> algo.bc.*
algo.{act,hpt,pi}.* -> algo.zoo.*
algo.input_modules.* -> models.stems.input_modules.*
algo.dfot.* -> algo.diffusion.* / models.diffusion.*
algo.vae.* -> algo.diffusion.{VAE,vae_algo}
models.hnet_nets.* -> models.hnet.*
models.bc_rnn_nets..* -> models.{stems,cores,heads}.* (role-routed)
input_modules}.py, algo/dfot/, algo/vae/, models/hnet_nets/init.py,
models/bc_rnn_nets/init.py.
(1239 -> 935 lines, -304) into scratch/flat_fused_quarantine/.
importers + 7 hnet configs flipped same commit, no shim.
Verify (a40 alloc): config compose 37/37 PASS; tests 122 pass / 8 pre-existing
fail (identical to pre-flip baseline, ZERO new); state_dict parity LSTM+HNet+
chunk8-Q all torch.equal vs pre-flip; SMOKE=1 train_bc_rnn_hnet.sh TRAIN_EXIT=0.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 1: unify 3 pixel-policy DFoT outer stages into one pixel_mode-parameterized stage
Collapse the three near-duplicate pixel-policy outer-stage classes under
egomimic/algo/diffusion/outer_stages/ into ONE parameterized class
PixelObsActionDFoTOuterStage, selected by the
pixel_modeconfig knob:pixel_mode="policy" <- PixelObsActionPolicyDFoTOuterStage (Design A:
action broadcast into RGB channels, jointly diffused)
pixel_mode="regress" <- PixelObsActionRegressPolicyDFoTOuterStage (Design B:
RGB-only diffusion + conv action_head off pred x0)
pixel_mode="decoupled" <- PixelObsActionDecoupledDFoTOuterStage (DEC: action as
separate DiT3D token with independent noise level)
Each mode reproduces the corresponding old class EXACTLY and preserves the
duck-typed attribute surface the algo inference paths consume (
_action_channelsfor policy,
action_headfor regress,decouple_action_noisefor decoupled,plus the mode-correct
action_slice). The 3 model configs are mirrored in thissame commit:
_target_-> PixelObsActionDFoTOuterStage +pixel_mode: <mode>.Old class files moved to scratch/dedup_c1_old_stages/ (gitignored).
PROVEN behavioral equality (fixed seed, a40, srun on overcap alloc; harness at
scratch/proof_dedup_c1.py, all old vs new instantiated from the SAME resolved
sub-configs with identical RNG):
(a) Fixed-seed construction parity — state_dict keys identical AND every tensor
torch.equal:
policy: 82 keys, 7,325,460 params, keys_identical=True, all torch.equal
regress: 90 keys, 7,397,134 params, keys_identical=True, all torch.equal
decoupled: 87 keys, 7,322,894 params, keys_identical=True, all torch.equal
action_slice old==new for every mode (policy slice(3,5); regress/decoupled
slice(0,0)); mode attribute surface present on the unified class.
(b) Forward parity on a fixed-seed packed batch (2 episodes, T=9) — every
output tensor torch.equal (EXACT, no allclose fallback needed):
policy: forward_return, pred_v, qstate_x_t -> torch.equal
regress: forward_return, pred_v, pred_action, loss, x_t -> torch.equal
decoupled: forward_return, pred_v, qstate_x_t, loss -> torch.equal
(c) Hydra-compose of all 3 mirrored configs PASS; composed outer_stage.target
resolves to egomimic.algo.diffusion.PixelObsActionDFoTOuterStage with the
correct pixel_mode each.
Regression: pytest tests/ = 122 passed / 8 failed / 4 skipped — identical to the
step-13 baseline. The 8 failures are pre-existing and unrelated (7 TestAlgoWiring
old-HNet-signature, 1 packed_pipeline missing-zarr-data). All 25 config-compose
tests pass, including the 3 mirrored pixel configs. Zero new failures.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 2: move SimpleConv + CondEncoderModule into models/stems/ (models/hnet now pure chunking machinery)
Image-encoder consolidation. Relocates the two input-side encoder modules out
of egomimic/models/hnet/ (which is meant to hold ONLY H-Net chunking
machinery) into their role home egomimic/models/stems/:
git mv egomimic/models/hnet/image_encoders.py egomimic/models/stems/image_encoders.py (SimpleConv)
git mv egomimic/models/hnet/cond_encoders.py egomimic/models/stems/cond_encoders.py (CondEncoderModule)
These are PURE MOVES — no logic edits. The only in-file content change is a
single docstring line in image_encoders.py whose
_target_:example path wasupdated hnet->stems. cond_encoders.py is byte-identical to its pre-move source.
All 33 references (13 python import sites + 33 yaml
_target_occurrencesacross 20 model configs) flipped to the new stems path in this same commit;
hnet/init.py and stems/init.py re-exports updated; obs_encoder.py
docstring path corrected. After the move
git grepshows models/hnet containsno encoder/stem code (only a prose cross-reference comment in context.py).
PROVEN BEHAVIORAL EQUALITY (must function identically after the edit):
(a) Import-identity via temporary shim, checked with Python
isthen shimremoved in this commit:
OldSimpleConv is NewSimpleConv -> True
OldCondEncoderModule is NewCondEncoderModule -> True
hnet.init-resolved CondEncoderModule is new-> True
(b) Byte-identity modulo path lines: cond_encoders.py diff vs pre-move tag is
EMPTY; image_encoders.py diff is exactly ONE line (the
_target_docstringexample path).
(c) Construction state_dict torch.equal (old source extracted from tag
dedup-c2-pre vs new package source, fixed-seed init): all cases equal --
SimpleConv(4ch) 18 keys, SimpleConv(3ch) 14 keys, CondEnc+img 24 keys,
CondEnc+obs 10 keys, CondEnc(empty) 0 keys; plus forward torch.equal=True.
(d) Config mirror in this commit + compose-check: 20/20 affected configs
compose and instantiate the cond_encoder node (20 nodes) via new targets;
broader sweep 56/56 model configs compose, 0 failures.
Regression: pytest tests/ == 122 passed / 8 failed / 4 skipped, the SAME 8
pre-existing TestAlgoWiring + TestInferNormFromPacked failures present on tag
dedup-c2-pre (verified by running the suite on a worktree of the pre-move tag).
ZERO new failures.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 3: factor shared zarr read/decode logic into rldb/zarr/_common.py
The two near-duplicate zarr loader paths — the padded/windowed reader
ZarrDataset.getitem and the packed/span reader ZarrDataset._read_span
(consumed by ZarrEpisodePackedDataset) — each inlined byte-for-byte copies of
the same JPEG window decode, single-frame JPEG decode, JSON-array decode,
float32 tensorization, and embodiment tagging. ZarrActionExpertDataset._load_obs_at
held a third copy of the single-frame JPEG decode.
This collapse extracts that shared logic into the new module
egomimic/rldb/zarr/_common.py as five pure helpers:
decode_jpeg_single(buf) -> CHW float image in [0,1]
decode_jpeg_window(buffers) -> stacked (T,C,H,W), per-frame decode
decode_json_array(arr, fn) -> [fn(v) for v in arr]
tensorize_float32(data, *, skip_object_dtype) (the ONE predicate the two
loaders genuinely differ on: _read_span skips
object-dtype arrays, getitem does not)
tag_embodiment(data, emb) -> stamps embodiment + metadata.robot_name
Each helper is a verbatim extraction of the pre-collapse loop body. The two
loaders now call the helpers and keep ONLY their genuine differences:
getitem keeps its horizon-windowing + repeat-last padding + bounded
JPEG-fail resample loop; _read_span keeps its exact-span read + seq_len /
episode_idx metadata. Dead
import simplejpegremoved from both loader files(decode now lives in _common). No public API / signature changes: _read_span,
getitem, _load_obs_at keep identical signatures; the sole _read_span call
site (zarr_dataset_packed.py) and the _load_obs_at call sites are unchanged
(git grep verified — zero call-site edits needed).
PROVEN BEHAVIORAL EQUALITY (fixed fixture episodes from
/coc/flash7/paphiwetsa3/datasets/new_circle_3, a40 overcap alloc, srun):
(a) New permanent suite tests/test_loader_equality.py (6 tests) PASSES both
BEFORE the refactor (anchoring reference behavior captured from the
pre-collapse code at tag dedup-c3-pre) and AFTER:
- TestReferenceHashes: both loaders reproduce frozen sha256 reference
hashes of the decoded front_img_1 / state_agent_obj / actions tensors
captured from the pre-collapse code. Post-refactor hashes match
exactly:
front_img_1 68f20e3c2c5f72b0 | (290,3,96,96) f32
state_agent_obj 26a652f406d275d2 | (290,5) f32
actions 6f7a2b1ab531506b | (290,2) f32
- TestCrossLoaderEquality: padded full-window vs packed span reads are
torch.equal per-frame across 4 episodes (+ embodiment id identical).
- TestNormalizationPathEquality: MultiDataset.normalize applied to BOTH
loaders' outputs is torch.equal (proves the normalization path is
identical across loaders).
- TestPackMetadata: pack_collate emits the documented seq_lens /
cu_seqlens / max_seq_len / batch_size, and the concatenated per-frame
stream equals the per-span reads in order.
The suite auto-skips off-cluster (fixture-missing guard).
(b) Direct old-vs-new bit-identity proof (scratch/proof_old_vs_new.py,
gitignored): the PRE-collapse loader modules extracted from tag
dedup-c3-pre via
git archiveand the POST-collapse live modules are runon the same 3 episodes through BOTH the padded getitem and packed
_read_span paths. Result: 33 key comparisons across 3 episodes x 2 paths,
every output tensor torch.equal(old, new) == True (front_img_1,
state_agent_obj, actions all exact).
Regression: pytest tests/ == 128 passed / 8 failed / 4 skipped. The 8 failures
are the SAME pre-existing failures present on tag dedup-c3-pre (verified by
running the suite on a worktree of the pre tag: 122 passed / 8 failed / 4
skipped — 7 TestAlgoWiring old-HNet-signature + 1 TestInferNormFromPacked
missing-zarr-data, all in code this commit does not touch). 128 = the baseline
122 + the 6 new equality tests. ZERO new failures.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
docs: record dedup-campaign global acceptance gates (2026-06-06)
Append the dated dedup-campaign record to PORT_NOTES.md after the 3
behavior-preserving collapses (c1 f06330c pixel-DFoT outer-stage unify,
c2 32eb1fc hnet->stems encoder move, c3 c289657 zarr _common factor)
passed the global gates on alloc 3325596 (a40):
MissingConfigException on parent default, untouched by collapses).
TestAlgoWiring+InferNorm fails; +6 new c3 loader-equality tests). Zero new.
BC delta ~1e-5, within 1e-3 gate.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 4: dead-code purge (5 of 6 zero-ref symbols; HNetPolicy retained — it is NOT dead)
Pure-delete-with-grep-proof. Each symbol re-grepped over egomimic/ +
hydra_configs/ + tests/ + scripts/ (excl pycache/external/scratch/logs)
IMMEDIATELY before deletion to reconfirm zero LIVE refs.
Deleted (proven zero live refs):
diffusion/sampling.ddim_sample)
main self-invoke)
are HNetLoss()/DFoTLoss() built in code)
forward_eval -> _teacher_forced_packed)
step/init_step_state KEPT — they ARE the
live closed-loop path, called at
packed_base.py policy.init_step_state/step)
(deprecated; live wrapper is
MultiDataModuleWrapper, ref'd by 21 files)
Also dropped now-unused imports (typing.Optional in packed_outer_stage.py;
typing.Iterable/List/Optional in loss.py) and updated 2 yaml comment mirrors
that named the now-deleted CompositeLoss (dfot_pushshapes.yaml, hnet_pushshapes.yaml).
Deleted files moved to scratch/dead_code_c4/ (gitignored), not destroyed.
SCOPE CORRECTION — HNetPolicy NOT deleted: the campaign evidence claimed
HNetPolicy was zero-ref, but that grep excluded scripts/. The required
grep over scripts/ found 3 LIVE importers+instantiators:
scripts/smoke_packed_training.py (documented live tooling in CLAUDE.md L538)
scripts/test_mamba_regression.py (old-vs-new equivalence regression)
scripts/test_hnet_outer_stage.py (HNetPolicy-vs-HNetOuterStage equivalence smoke)
The grep proof-gate fails for HNetPolicy, so it is retained (its .generate
method stays with the class). All other 5 targets pass cleanly.
Proofs (run on own a40 alloc, repo's symlinked .venv):
import egomimic.algo, egomimic.models, egomimic.pl_utils-> IMPORT OKHNet+HNetPolicy, HNetOuterStage (step=True init_step_state=True generate=False)
The 8 failures are the documented pre-existing set (7x TestAlgoWiring
old-HNet-signature: "HNet.init() missing 1 required positional argument:
'outer_stage'"; 1x test_full_pipeline_collects_per_feature_stats missing-zarr).
Zero NEW failures. Deletes touch zero reachable code paths -> no torch.equal needed.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 5: hoist shared embodiment-key resolution + _build_obs + log_info onto base Algo
Three blocks of code were byte-identical (verified via
diff, rc=0) acrossthe policy algos and are collapsed into a single home on the base
Algo:for emb in self.domains: ...resolving resolved_ac_keys / proprio_keys / lang_keys / camera_keys via
norm_stats) -- HNet packed_base.py:489-514 == DFoT diffusion/algo.py:169-194
== WindowedBC bc.py:699-724. Now
Algo._resolve_embodiment_keys(norm_stats);each subclass calls it from init.
_build_obs-- HNet packed_base.py:615-624 == DFoT diffusion/algo.py:246-255(WindowedBC already inherited HNet's). Now defined once on
Algo; the HNetand DFoT overrides are deleted (inherited).
log_info-- HNet packed_base.py:778-783 == DFoT diffusion/algo.py:419-425.The base
Algo.log_info(formerly a NotImplementedError stub) now carriesthis exact body as the shared default; the HNet/DFoT overrides are deleted.
Net -116/+83 lines. Behaviour is preserved by construction: the moved text is
identical, so every subclass resolves the same function object from
Algo(no subclass re-introduces a private copy). bc.py drops its now-unused
get_embodiment_idimport; packed_base/DFoT keep theirs (still used elsewhere).Proofs (run on a40 alloc 3325792, fixed seeds):
to the pre-c5 baseline (all 17 digits; this is the deterministic, 0.0-jitter
path per the dedup_baseline manifest).
baseline RE-RUN on the same node gives [1.3452528, 0.1739942] -- i.e. the BC
smoke is itself run-to-run nondeterministic at ~1e-3 (CUDA/image-encoder/
sim-eval RNG), and the c5 run lands CLOSER to the manifest values
[1.3453673, 0.1749004] than the clean-tree rerun does (row0 |c5-manifest|
=1.4e-5 vs |baseline_rerun-manifest|=1.1e-4). The refactor is within the
tree's own deterministic-replay band.
asserts HNet/WindowedBC/DFoT all resolve the SAME
Algofunction objectsfor _resolve_embodiment_keys / _build_obs / log_info (import-identity) and
produce byte-equal key sets + obs selection on a fixed norm_stats fixture.
documented pre-existing baseline (7x TestAlgoWiring old-HNet-signature +
1x missing-zarr-data); +3 passed are the new permanent tests. Zero NEW
failures vs the 128/8/4 profile.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 6: hoist 6 identical eval uint8 helpers -> eval/core/img_utils.img_chw_to_uint8
Six evaluators each carried a byte-equivalent (C,H,W) float-in-[0,1] ->
(H,W,C) uint8-in-[0,255] converter under three private names:
spatial_video_rollout, policy_action
Four were the 4-line clip->*255->transpose form; two (policy_action, _u8)
were the same logic as a 2-liner. All six now delegate to a single canonical
egomimic.eval.core.img_utils.img_chw_to_uint8; the local defs are removed and
each call site renamed.
EXCLUDED (genuinely different, left untouched): eval_dfot_self_rollout
._img_chw_to_uint8 uses an x.max()<=1.5 auto-scale heuristic + clip(0,255).
PROOFS (a40 job 3325794, fixed seeds):
test_canonical_matches_every_original_body PASSED
np.array_equal(canonical, orig_4liner) and ==orig_2liner on a fixed
torch.manual_seed(0) float tensor spanning out-of-[0,1] (rand*1.4-0.2).
test_touched_eval_modules_import PASSED (import-smoke of all 7 modules)
post-c6 = 8 failed / 133 passed / 4 skipped. SAME 8 pre-existing failures
(7x TestAlgoWiring old-HNet-signature + 1x TestInferNormFromPacked
missing-zarr-data); +2 passed = the 2 new equality/import tests. Zero NEW
failures. Behaviour-preserving: the 6 bodies were already identical.
Deleted-helper provenance saved to scratch/c6_deleted_helpers/ (gitignored).
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
dedup collapse 7: delegate image-only frame sampler to action-aware superset + hoist loss-reducer skeleton onto Algo
Two structural near-duplicates removed:
Frame sampler. PixelSpatialDFoTOuterStage._sample_frames_packed (image
only) duplicated the fixed_window/start_to_end/random_subsample branch
logic of PixelObsActionDFoTOuterStage._sample_windows_packed (image+
action). The action version is a strict superset: its image cropping and
cu_seqlens are byte-identical whether or not actions are supplied (the
action branch performs zero extra RNG draws). Hoisted the superset onto the
parent PixelSpatialDFoTOuterStage, made it accept actions=None, and reduced
_sample_frames_packed to a thin delegate
(_sample_windows_packed(images, None, cu)). Deleted the duplicate copy from
the PixelObsAction subclass (now inherited).
Loss reducer. DFoT.compute_losses was the pure sum-per-embodiment
{emb}_action_loss -> action_loss skeleton; promoted it to the Algo base as
the default compute_losses and deleted DFoT's byte-identical override.
VAE / HNet / HPT / PI keep their own overrides because they are genuine
SUPERSETS (recon/kl/lpips, ratio_loss, domain-count division) — NOT folded.
PROOFS (a40 alloc 3325796, fixed seeds):
torch.equal image crops + cu_seqlens across all 3 modes (img_equal=True,
cu_equal=True for fixed_window / start_to_end / random_subsample).
modes and all episode-length regimes (<n, ==n, >n);
fixed predictions;
(catches a future wrong fold of the superset overrides).
frame_sampling=fixed_window, + the inherited reducer): Train/Loss =
Train/action_loss = Train/emb15_action_loss = 0.28878551721572876, BIT-
IDENTICAL (all 17 digits) to the pre-collapse baseline.
documented pre-existing set (7x TestAlgoWiring old-HNet-signature + 1x
missing-zarr-data InferNorm), ZERO new failures; suite grew by the 3 new
permanent equality tests.
No hydra configs reference the touched methods (no config mirror needed).
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
models/ hierarchy pass: relocate 7 loose files to role homes (cores/heads/stems/diffusion) + utils
End state:
ls egomimic/models/*.pyshows ONLY init.py. Every move is agit mv (R-status); the 2 splits git-mv the file to its primary home first
(history-preserving) then extract the other-role classes into new files in this
same commit. All moved class bodies are byte-identical modulo import lines
(verified by class-body diff vs HEAD). Every importer + config target updated
in this commit; grep confirms zero remaining old-module dotted refs.
WHOLE-FILE MOVES (git mv):
fm_policy.py -> heads/fm_policy.py (policy output head)
denoising_policy.py -> heads/denoising_policy.py (diffusion policy head base)
denoising_nets.py -> diffusion/denoising_nets.py (legacy diffusion nets)
image_vae.py -> diffusion/image_vae.py (DFoT pixel<->latent codec)
preprocess_pi_obs.py-> utils/preprocess_pi_obs.py (data preprocessing, OUT of models/)
SPLITS (git mv to primary home + extract):
act_nets.py -> stems/resnet_conv.py (primary: Module/ConvBase/CoordConv2d/ResNet18Conv)
+ cores/act_transformer.py (PositionalEncoding/Transformer/StyleEncoder)
hpt_nets.py -> stems/hpt_stems.py (primary: PolicyStem/MLPPolicyStem/ResNet)
+ cores/hpt_transformer.py (CrossAttention/Attention/MLP/BlockWithMasking/
MultiheadAttention/SimpleTransformer)
+ heads/hpt_heads.py (PolicyHead/MLPPolicyHead/TransformerDecoderBlock/
MultiBlockTransformerDecoder)
DEAD-CODE PRUNE (grep-proven 0 external refs):
hpt_nets: STPolicyStem, AttentivePooling, vit_base_patch16, T5TokenizerWrapper,
T5Encoder, L2Norm (also drops the heavy timm/transformers T5/ViT imports)
denoising_nets: ConditionalClassifier1D, CrossTransformerCfg2, CrossTransformerProj
VERIFY (on a40 alloc, PYTHONPATH=repo working tree, Python 3.11):
import egomimicOK; every touched module imports OK (pi/rollout fail only onpre-existing missing optional deps openpi/robot_utils, BEFORE the moved import lines).
it into PORT_NOTES).
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
algo/ hierarchy pass: rename packed_base.HNet -> PackedAlgoBase
In-place class rename (no file moved): the packed-sequence policy Algo base
was misleadingly named
HNetwhile it is actually the shared base for thepacked-sequence algos (WindowedBC subclasses it; the inner H-Net stage tree
is supplied via
outer_stageand is a separate, correctly-named concept inmodels/hnet). Renamed the class to
PackedAlgoBaseto reflect its role.Changes (all in one commit):
class HNet(Algo)->class PackedAlgoBase(Algo);docstring updated; kept
HNet = PackedAlgoBasecompat alias at modulebottom (commented) so OLD ckpts/configs whose resolved target still names
egomimic.algo.packed_base.HNetkeep resolving.class WindowedBC(PackedAlgoBase)+ base-referringdocstring/comments updated.
egomimic.algo.packed_base.PackedAlgoBase.
Algo-method docstrings updated.
test_training_recipe}.py: import + class refs updated.
OUT OF SCOPE (untouched per task): models/hnet (architecture HNet), HNetCore /
HNetOuterStage / HNetLoss / HNetSimEval (architecture/stage names), HNetPolicy
(landmine: proven alive), algo/obs_transforms.py (landmine: designed extension
point).
Verified on A40 alloc:
import egomimicOK; PackedAlgoBase + HNet-aliasidentity OK (HNet is PackedAlgoBase); WindowedBC subclass OK; all 7 hnet
configs compose with new target; old dotted path resolves via compat alias.
Tests: test_c7 + test_embodiment_key_resolution_shared + test_config_compose
all pass; test_training_recipe shows the SAME 7 pre-existing TestAlgoWiring
failures (old constructor signature, missing
outer_stage) and ZERO newfailures vs the 136/8/4 baseline.
Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
utils/ hierarchy pass: junk-drawer split into pl_utils/vendored/models + dead-file purge
Relocate the misplaced contents of egomimic/utils/ to semantic homes and delete
grep-proven dead files. One commit per the hierarchy-pass group rule; revertible
via tag pre-utils-hier.
RENAMES (git mv, R-status):
utils/timing_callback.py -> pl_utils/callbacks/timing_callback.py
utils/instantiators.py -> pl_utils/instantiators.py
utils/logging_utils.py -> pl_utils/logging_utils.py
utils/rich_utils.py -> pl_utils/rich_utils.py
utils/utils.py -> pl_utils/utils.py
utils/tensor_utils.py -> vendored/robomimic_tensor_utils.py (988-line verbatim
robomimic vendor; +vendored/README.md provenance note)
egomimicUtils.py SPLIT (source file STAYS in utils/ as the generic remainder —
constants ARIA/EXTRINSICS/INTRINSICS, geometry, str2bool, interpolate_*,
CameraTransforms, download_from_huggingface, STD_SCALE):
model helpers -> models/cores/model_utils.py (NEW): get_sinusoid_encoding_table,
reverse_kl_from_samples, frechet_gaussian_over_time, EinOpsRearrange, AlohaFK
drawing fns merged into utils/viz_utils.py (dependency FLIPPED — viz_utils now
OWNS the drawing fns and pulls only INTRINSICS/cam_frame_to_cam_pixels/
ee_pose_to_cam_frame from egomimicUtils): draw_actions, draw_dot_on_frame,
draw_rotation_text, draw_annotation_text, miniviewer (+fmt helper).
All 11 moved bodies are byte-identical to originals (verified via AST diff
vs HEAD); only import lines differ.
model_utils.py placed in models/cores/ (not loose in models/) to keep the
models-group gate "models/ has only role dirs + init.py" intact.
DELETED dead (grep-proven 0 importers; scratch copies in scratch/utils_hier_deleted/):
utils/memory_utils.py, utils/real_utils.py, utils/obs_utils.py (only keep_keys,
0 refs), egomimic/init.pyc, egomimic/keypoints.jpeg.
Importers + config target updated in this same commit (grep-exhaustive over
egomimic/ tests/ scripts/ Tsimulation/ hydra_configs/):
callbacks/defaults.yaml wandb_profiler.target -> pl_utils.callbacks.timing_callback
trainHydra.py (instantiators/logging_utils/utils), norm_stats.py (utils),
pl_model.py (tensor_utils->vendored), hpt_heads.py / eval_hpt.py / algo/zoo/hpt.py
(model helpers), pushshapes.py / eval_act.py / robot/rollout.py /
data_visualization.py (drawing fns). pl_utils/utils.py internal rich_utils ref flipped.
Added pl_utils/init.py (was implicit ns pkg) so find_packages discovers it.
VERIFY (alloc 3325801, a40, pact-2 .venv): import egomimic + all 15 touched
modules import clean from THIS tree; pytest tests/ = 136 passed / 8 failed
(all pre-existing: 7 TestAlgoWiring old-HNet-sig + 1 missing-zarr) / 4 skipped —
ZERO new failures vs baseline; test_config_compose 25/25; parent config composes
and moved callbacks target resolves to the new class.
Path-map (old->new dotted) appended to scratch/hierarchy_path_map.txt for the
gates phase to fold into PORT_NOTES.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
rldb/ hierarchy pass: strays out + dead-file purge
Move (R-status rename, byte-identical modulo 2 fixed import lines):
(HDF5->zarr conversion CLI, not a test; already targets eva_process.
Fixed two stale imports as part of the move:
egomimic.rldb.zarr.ZarrWriter -> egomimic.rldb.zarr.zarr_writer.ZarrWriter (empty init)
egomimic.scripts.eva_process.zarr_utils -> egomimic.scripts.eva_process.eva_utils (file renamed earlier))
Delete dead (grep-proven zero importers; scratch backups in scratch/rldb_deleted_backup/):
Verified on a40 alloc:
import egomimicOK, moved module imports OK,deleted modules raise ModuleNotFoundError, pytest tests/ = 136 passed / 8 failed
(pre-existing) / 4 skipped (zero new failures), rldb test_dataset_filter 8 passed.
No config target referenced any touched path.
Co-Authored-By: Claude Opus 4.8 noreply@anthropic.com
eval+pl_utils hierarchy pass: test_model_wrapper -> tests/, pushshapes sim-glue -> embodiment
Group: eval + pl_utils strays (hierarchy pass).
(R-status rename; 94% similar). tests/ has no init.py so pytest imp