Pim/feat/g1 groot wbc part 2#2300
Conversation
…a separate process
…k to dev (1500 LOC)
…y-over from old quadruped branch)
…era_pose, _publish_tf refactor, spec.py docstring expansion)
The DDS callback registered via ChannelSubscriber.Init(handler, 10) doesn't fire reliably on macOS, so the adapter timed out waiting for the first LowState to capture mode_machine. Switch to passive-mode Init(None, 0) and Read() per tick. mode_machine is now hardcoded to the static value for the 29-DOF G1 instead of read-back-then-echoed. Also adds has_motor_states() to satisfy the post-refactor WholeBodyAdapter Protocol, makes _release_sport_mode null-tolerant (CheckMode returns (status, None) once nothing is active), drops the now-unused _on_low_state callback. WebsocketVisModule: restore the arm / disarm / set_dry_run socket.io handlers and the matching activate / dry_run Out[Bool] ports that an earlier cleanup pass dropped — the dashboard buttons now publish on the LCM topics the coordinator already subscribes to. TransportWholeBodyAdapter: trivial Protocol fix — implement read_odom returning None so isinstance(adapter, WholeBodyAdapter) passes the runtime_checkable type check in ControlCoordinator._setup_hardware.
The earlier cleanup pass over-deleted: openarm manipulator support, the dimos/utils/workspace.py module, the audio STT node_whisper edits, CI configs, README, command-center package-lock churn, and two go2 LFS DBs. None of those are part of the GR00T WBC story. Restore each to origin/dev's content. Drop one path that was added on this branch but doesn't exist on dev (api/requirements.txt). PR diff is now 24 files / +2623 / -210, all GR00T-WBC-relevant.
Both Mustafa's #1954 (already on dev) and the GR00T-WBC PR added this method. After the merge, both definitions live in coordinator.py — Python uses the second, the first becomes dead code, and the transport adapter (which depends on hardware_id being passed) silently picks up its default ``"wholebody"`` topic prefix instead of the component's id. Collapse to one definition that passes the union of kwargs both adapters need: dof, hardware_id, network_interface, domain_id, address, plus **adapter_kwargs. Adapters tolerate extras via their constructor's **_ catch-all.
Switch unitree-g1-groot-wbc to the architecture Mustafa landed in #1954 (G1WholeBodyConnection Module + TransportWholeBodyAdapter via LCM bridge), matching unitree-g1-coordinator. Single G1 low-level pattern in the codebase now. * Delete dimos/hardware/whole_body/unitree/g1/adapter.py — the UnitreeG1LowLevelAdapter (single-process DDS) is gone. * Rewrite unitree_g1_groot_wbc.py to compose G1WholeBodyConnection + ControlCoordinator(adapter_type="transport_lcm") + dashboard via autoconnect. * Patch wholebody_connection.py to apply the macOS-cyclonedds fixes the dropped monolith was carrying: - Init(None, 0) instead of Init(callback, 10) - Hardcode mode_machine = 5 (the static value for 29-DOF G1) - publish_loop polls subscriber.Read() per tick - Drop the now-unused _on_low_state callback
_MJCF_PATH was a relative path "data/mujoco_sim/g1_gear_wbc.xml"
which only resolved when dimos was launched from the repo root.
The mujoco viewer subprocess inherits CWD from the launching shell,
so running ``dimos run unitree-g1-groot-wbc-sim`` from any other
directory tripped FileNotFoundError on viewer startup.
get_data("mujoco_sim") resolves the absolute install path and
auto-extracts the LFS tarball on first run.
1. unitree_g1_groot_wbc_sim.py: resolve _MJCF_PATH via get_data so
both MujocoSimModule and the DIMOS_MUJOCO_VIEW=1 subprocess open
the file regardless of shell CWD. Earlier the relative path
"data/mujoco_sim/g1_gear_wbc.xml" only worked from the repo root.
2. wholebody_connection.py: hardcoded mode_machine = 5 has no
fallback if a future G1 firmware revision changes the value.
Add a one-shot warning the first time we read a LowState whose
mode_machine doesn't match. Self-disables after the first log
line so it doesn't spam the operator.
3. test_unitree_g1_groot_wbc.py: composition smoke test verifying
the three deployed modules, bridge adapter binding (transport_lcm,
confirming the migration off the deleted unitree_g1 monolith),
real-hw safety profile (auto_arm=False, auto_dry_run=True,
default_ramp_seconds=10.0), servo_arms defaults, and all bridge
+ dashboard LCM topics. No DDS, no hardware — just module
imports.
Two safety fixes for back-to-back ``dimos run`` cycles on real
hardware:
1. ``stop()`` now sends a final ``LowCmd_`` with ``mode=0x00``
(motors disabled) for every motor slot before closing the DDS
publisher. Previously the last commanded pose lingered in the
motor controllers — when the next process opened a publisher
and re-released sport mode, those stale stiff commands fought
the new participant during the handoff window, producing
audible gearbox stress.
2. ``_release_sport_mode()`` bails immediately if the first
``CheckMode()`` reports nothing active (``result is None`` or
``{"name": ""}``). A clean second start now logs "Sport mode
already released — skipping ReleaseMode" and skips the
loop-and-poll dance entirely, removing the handoff window.
``dimos/project/test_no_sections.py`` forbids ``# -----`` separator banners and ``# --- text ---`` inline section headers as a project style rule. Three files in this PR carried the pattern from earlier drafts; convert inline-section to plain ``# text`` and drop pure separators. No code-behaviour change.
* tick_loop.py / groot_wbc_task.py: import the class
(``from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped``,
same for ``Twist``), not the module — module-as-type tripped
mypy ``[valid-type]``.
* coordinator.py: when constructing ``GrootWBCTask``, narrow ``hw``
to ``ConnectedWholeBody`` via isinstance before passing
``hw.adapter`` (else mypy unions the manipulator + whole-body
adapter types). Also raises a clear TypeError if a non-whole-body
component is referenced by a groot_wbc task.
* test_unitree_g1_groot_wbc.py: type ``_coordinator_kwargs() ->
dict[str, Any]`` so list/iter usages on the values typecheck.
``mypy dimos`` is clean (670 files).
Phase 1 — trivial cleanups:
- Delete test_unitree_g1_groot_wbc.py (the "tests don't run anything"
blueprint test the reviewer asked to drop).
- MujocoEngine.data public property; mujoco_sim_module stops reaching
into engine._data.
- JointServoTask.start() resets _last_update_time so a non-zero-timeout
caller doesn't time out on the first tick.
- Split wholebody_connection's two-tasks-in-one publish loop into
_drain_low_state, _verify_mode_machine_once, _snapshot_motor_imu,
_publish_motor_state_and_imu helpers; dropped the `sub is not None`
guard.
- MujocoSimModuleConfig.imu_gyro_sensor_names + imu_accel_sensor_names
configurable instead of G1-hardcoded; default list still covers the
common humanoid pelvis names.
Phase 2 — renames and moves:
- groot_wbc_task.py -> g1_groot_wbc_task.py; GrootWBCTask ->
G1GrootWBCTask (Paul: G1-specific values, name it).
- Fold _groot_wbc_common.py (joint lists, gain tables, ARM_DEFAULT_POSE)
into g1_groot_wbc_task.py and delete the common file.
- Move dimos/hardware/whole_body/mujoco/g1/adapter.py ->
dimos/simulation/adapters/whole_body/g1.py. Adapter registry now
scans both roots.
Phase 3 — safety: G1GrootWBCTask no longer falls back to zero
when a joint is missing from CoordinatorState. Adds last-known-good
caches (_cached_q_29, _cached_dq_29, _cached_q_15) and a _state_seen
guard that refuses to emit a command until at least one fully-
populated state has been observed. Stops the "packet drop -> snap
to zero pose -> robot falls" failure mode.
Phase 4 — task registry: new dimos/control/tasks/registry.py.
Each task module exposes a register(registry) hook; coordinator's
_create_task_from_config drops 100+ lines of if/elif and becomes
a single control_task_registry.create(cfg.type, cfg, hardware=...)
call. Task type "g1_groot_wbc" is the new canonical name;
"groot_wbc" kept as an alias.
Phase 5 — activate / dry_run from streams to RPC:
- Drop the `activate: In[Bool]` and `dry_run: In[Bool]` ports on
ControlCoordinator. Replaced with @rpc set_activated(engaged) and
@rpc set_dry_run(enabled) -- these are one-shot configuration
events, not continuous signals.
- WebsocketVisModule's arm/disarm/set_dry_run socket.io handlers
call the coordinator's RPC directly via RPCClient(None,
ControlCoordinator); the matching Out[DimosBool] ports go away.
- Blueprints lose their now-dead `("activate", DimosBool)` /
`("dry_run", DimosBool)` LCM transport entries.
Phase 6 — hardware addressing: _create_whole_body_adapter passes a
single canonical `address` (matching the manipulator/twist-base
pattern) instead of overloaded address + network_interface.
Adapter __init__ signatures updated accordingly.
Phase 7 — MuJoCo subprocess hygiene:
- Move the standalone viewer's logic from mujoco_view_subprocess.py
into mujoco_engine.view_main() with `python -m
dimos.simulation.engines.mujoco_engine <mjcf>` entry point.
One MuJoCo entry-point file, not two. Adds the missing lock
around `latest_*` shared state, try-finally for LCM teardown,
and tick-aware render sleep so 60 Hz holds under load.
- New MujocoViewerModule wraps the subprocess lifecycle as a
proper dimos Module (start() spawns, stop() terminates, atexit
fallback). Blueprints declare it instead of running
subprocess.Popen at import time. Configurable via -o
mujocoviewermodule.enabled=true; no more DIMOS_MUJOCO_VIEW env
var.
Phase 8 — one blueprint, --simulation flag:
- Merge unitree_g1_groot_wbc.py + unitree_g1_groot_wbc_sim.py into
a single unitree_g1_groot_wbc.py that branches on
global_config.simulation. Real-hw path: G1WholeBodyConnection +
transport_lcm adapter, 500 Hz, unarmed + dry-run, servo_arms.
Sim path: MujocoSimModule + sim_mujoco_g1 adapter, 50 Hz,
auto-arm, optional MujocoViewerModule.
- Drop the unitree-g1-groot-wbc-sim registry entry; CLI runs
`dimos --simulation run unitree-g1-groot-wbc`.
- Replace get_data() at import time with LfsPath() (Paul's
don't-block-imports note).
- Drop ROBOT_INTERFACE / GROOT_MODEL_DIR / DIMOS_DDS_DOMAIN /
DIMOS_MUJOCO_VIEW env-var reads from blueprints; users override
via `-o module.field=value` like every other dimos blueprint.
Deferred (Mustafa flagged for separate PRs, captured as TODOs):
- CoordinatorState should carry IMU data so the task doesn't have to
reach into the adapter for read_imu (TODO in g1_groot_wbc_task.py).
- Drop read_odom from WholeBodyAdapter Protocol -- most adapters
shouldn't be a source of base pose (TODO in whole_body/spec.py).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Protocol Promoted the two follow-up TODOs from the PR #2033 review (Mustafa :381 and :91) into this PR after re-reading the actual scope: 1) IMU on CoordinatorState (Mustafa :381): - Add ``imu: dict[hardware_id, IMUState]`` field to CoordinatorState. - TickLoop polls every ConnectedWholeBody's read_imu() each tick via a new _read_all_imu() helper and stuffs the dict on the state. - G1GrootWBCTask reads from state.imu first, falls back to self._adapter.read_imu() only when state.imu is empty (e.g. unit tests that build a bare CoordinatorState). The state path is the non-coupled one going forward. 2) Drop read_odom from WholeBodyAdapter Protocol (Mustafa :91): - Remove read_odom from spec.py — the WholeBodyAdapter Protocol no longer promises base pose. Most adapters never did (real-hw G1's was hardcoded None, transport_lcm's likewise); the few that did (sim_mujoco_g1) reported pose the engine module already publishes itself, so the adapter polling was redundant. - Drop the read_odom impls from SimMujocoG1WholeBodyAdapter and TransportWholeBodyAdapter. - Drop ControlCoordinatorConfig.publish_odom + ControlCoordinator's ``odom: Out[PoseStamped]`` port + TickLoop._publish_odom + odom_callback wiring. ~50 lines deleted. - Sim path: MujocoSimModule's ``odom`` Out is now the sole producer of ``/odom``. Removed the previous ``/sim/odom`` topic override since the autoconnect default for ``(odom, PoseStamped)`` is ``/odom``. - Real-hw path: nothing publishes /odom (matches current behaviour — read_odom returned None). A future estimator Module subscribes to motor_states + imu and publishes to /odom directly, which is the architecturally correct seam. Not addressed in this PR (genuinely pre-existing, not introduced here): Paul's "ideal solution" of converting mujoco_engine.py to run as the main thread in its own process (so mujoco_process.py and mujoco_engine.py can collapse to one). That's a refactor of the pre-existing two-MuJoCo-implementations situation; the PR review only flagged "don't add a third", which Phase 7 of the previous commit already addressed by folding the standalone viewer into mujoco_engine.view_main(). Unifying mujoco_process.py with mujoco_engine.py touches the manipulator stack (xarm/piper) and is out of scope for this G1-WBC PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Paul's "ideal solution" from the PR #2033 review (paraphrased): "convert mujoco_engine.py to work as the main thread in a new process if possible." Done. WholeBodySimHooks (new): - Extracts the per-step SHM-bridge logic out of MujocoSimModule into a standalone class: pre_step pulls motor commands from SHM and applies PD-with-feedforward; post_step writes motor state + gripper back to SHM. The Module's _apply_shm_commands + _publish_shm_state methods, the latched _latest_pd_* state, and _gripper_joint_to_ctrl all move here. Same hook bodies now run regardless of whether the engine lives in a thread or a subprocess. mujoco_engine.engine_main (new entry point): - Standalone "whole-body sim subprocess" loaded with `python -m dimos.simulation.engines.mujoco_engine <mjcf> <shm_key> <dof> [--view]`. Owns: MujocoEngine, WholeBodySimHooks, an LCM publisher for /odom and /imu so non-adapter consumers (viser viewer etc.) still get those topics, plus signal handlers that tear down SHM + LCM cleanly on SIGTERM/SIGINT. - Replaces the previous view-only view_main(): the subprocess now runs full physics + viewer instead of just mirroring state from LCM. MujocoSimModule: - New config field `engine_mode: Literal["thread", "subprocess"]` (default "thread"). Thread mode: today's behaviour, untouched except for delegating SHM hooks to WholeBodySimHooks instead of inline methods. Subprocess mode: skips in-process engine entirely, spawns `engine_main` via subprocess.Popen, terminates it cleanly on stop(). macOS uses mjpython for the subprocess (Cocoa main- thread requirement); Linux uses sys.executable. - Subprocess + cameras combo is intentionally rejected with a clear error — no cross-process frame buffer yet; users either disable cameras or pick thread mode. MujocoViewerModule deleted: - Now redundant. The engine subprocess can launch its own viewer directly with viewer.launch_passive (on its own main thread) when the Module is configured headless=False + engine_mode="subprocess". Blueprint loses its standalone _viewer module entry; the engine subprocess handles viewing. What this does NOT do (deliberately): - Touch the manipulator stack (dimos/simulation/mujoco/mujoco_process.py). Paul didn't ask for that — he asked specifically for mujoco_engine.py to support subprocess+main-thread, which is what landed. The pre- existing two-MuJoCo-implementations situation can collapse to one in a separate refactor that addresses xarm + piper too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t import
Blueprints can inspect ``global_config`` at module top level to pick
between real-hw and sim backends (``unitree_g1_groot_wbc.py`` branches
on ``global_config.simulation`` to decide which WholeBodyAdapter to
wire). Until now, the ``run`` command only handed the overrides off to
``ModuleCoordinator.build`` *after* the blueprint was already imported —
so the import-time branch read the stale defaults and the resulting
BlueprintConfig was missing the modules the user was trying to
configure via ``-o module.field=value``. Repro before this fix:
$ dimos --simulation run unitree-g1-groot-wbc \\
-o mujocosimmodule.engine_mode=subprocess
ValidationError: mujocosimmodule
Extra inputs are not permitted
(Real-hw branch was chosen at import; ``mujocosimmodule`` slot doesn't
exist in the resulting config.)
Apply the same ``global_config.update(**cli_config_overrides)`` the
``show_config`` command already uses, but earlier — before
``get_by_name_or_exit`` triggers the blueprint module import.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 99eb7aa.
Strip self-instruction markers and verbose block comments; replace private-attribute mutation in MujocoSimModule with a public MujocoEngine.set_step_hooks API.
…c-part-2 # Conflicts: # dimos/control/tasks/registry.py # dimos/hardware/whole_body/registry.py # dimos/robot/config.py
…-part-2 # Conflicts: # dimos/control/tasks/registry.py
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
mjOBJ_SENSOR and mjJNT_FREE exist at runtime but aren't in the stubs mypy resolves, breaking the lint job.
Greptile SummaryThis PR lands the G1 GR00T whole-body-control walking policy as a first-class
Confidence Score: 3/5The simulation path is well-constructed and the policy task logic looks correct, but the hardcoded The GR00T task observation vector (86-dim × 6 frames), ramp lifecycle, SHM command sequencing, and dimos/robot/unitree/g1/wholebody_connection.py — Important Files Changed
Sequence DiagramsequenceDiagram
participant Twist as /g1/cmd_vel
participant CC as ControlCoordinator
participant Task as G1GrootWBCTask
participant ONNX as ONNX (balance/walk)
participant Adapter as WholeBodyAdapter
participant HW as Hardware / MuJoCo SHM
Note over CC,Task: Tick loop (500 Hz real / 50 Hz sim)
CC->>Task: compute(CoordinatorState)
Task->>Task: _refresh_state_caches(state)
Task->>ONNX: run(obs_buf [1x516])
ONNX-->>Task: action [15]
Task->>Task: "target_q = action*scale + default_15"
Task-->>CC: JointCommandOutput (15 joints)
CC->>Adapter: write_motor_commands(29 MotorCommand)
Adapter->>HW: SHM write_pd_tau_command / DDS LowCmd
Twist->>CC: Twist msg
CC->>Task: set_velocity_command(vx, vy, wz, t)
HW-->>Adapter: SHM joint_state + IMU / DDS LowState
Adapter-->>CC: read_motor_states() / read_imu()
CC-->>Task: CoordinatorState.joints + .imu (next tick)
|
| self._armed = False | ||
| logger.info( | ||
| f"G1GrootWBCTask '{self._name}' arming: " | ||
| f"ramp → default_15 over {self._arming_duration:.1f}s" | ||
| ) | ||
| else: | ||
| self._arming = False | ||
| self._armed = True | ||
| self._reset_policy_state() | ||
| logger.info(f"G1GrootWBCTask '{self._name}' armed (no ramp)") | ||
| self._arm_pending = False | ||
|
|
||
| # Unarmed & not arming: echo current joint positions. With the | ||
| # component's kp/kd applied downstream, q_tgt == q_actual yields | ||
| # pure damping (tau = -kd * dq), which mirrors the reference | ||
| # backend's inactive "hold current pose" behaviour. | ||
| if not self._armed and not self._arming: | ||
| self._last_targets = current_15.tolist() | ||
| return JointCommandOutput( | ||
| joint_names=self._joint_names_list, | ||
| positions=self._last_targets, | ||
| mode=ControlMode.SERVO_POSITION, | ||
| ) | ||
|
|
||
| # Arming: lerp ramp_start → default_15 over arming_duration. | ||
| if self._arming: | ||
| assert self._ramp_start is not None |
There was a problem hiding this comment.
Redundant double-fetch of the 15 leg/waist joints in
_refresh_state_caches
_joint_names_list (15 joints) is a prefix of _all_joint_names (29 joints), so the first loop fetches positions for each leg/waist joint for _cached_q_15, and the second loop fetches the same joints again as part of the full 29. At 500 Hz this is 30 redundant lookups per tick. The two caches can be unified: populate _cached_q_29 once over all 29 joints, then derive _cached_q_15 = _cached_q_29[:15] in one slice copy.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| op, err = errors[0] | ||
| raise RuntimeError(f"MujocoSimModule.stop() failed during {op}: {err}") from err | ||
|
|
||
| def _apply_shm_commands(self, engine: MujocoEngine) -> None: | ||
| """Pre-step hook: pull command targets from SHM into the engine.""" | ||
| def _publish_shm_and_lcm(self, engine: MujocoEngine) -> None: | ||
| """Post-step hook: SHM writes + LCM publishes. | ||
|
|
||
| This stays in the module so odom/IMU continue to flow through normal | ||
| typed ports while the whole-body adapter consumes joint state via SHM. | ||
| """ | ||
| if self._sim_hooks is not None: | ||
| self._sim_hooks.post_step(engine) | ||
| shm = self._shm | ||
| if shm is None: | ||
| return | ||
| dof = self.config.dof | ||
|
|
||
| pos_cmd = shm.read_position_command(dof) | ||
| if pos_cmd is not None: | ||
| engine.write_joint_command(JointState(position=pos_cmd.tolist())) | ||
|
|
||
| vel_cmd = shm.read_velocity_command(dof) | ||
| if vel_cmd is not None: | ||
| engine.write_joint_command(JointState(velocity=vel_cmd.tolist())) | ||
|
|
||
| if self._gripper_idx is not None: | ||
| gripper_cmd = shm.read_gripper_command() | ||
| if gripper_cmd is not None: | ||
| ctrl_value = self._gripper_joint_to_ctrl(gripper_cmd) | ||
| engine.set_position_target(self._gripper_idx, ctrl_value) | ||
| if not self._shm_ready_signaled: | ||
| shm.signal_ready(num_joints=len(engine.joint_names)) | ||
| self._shm_ready_signaled = True |
There was a problem hiding this comment.
signal_ready fires before IMU is written in the same hook
_publish_shm_and_lcm calls _sim_hooks.post_step(engine) (joint state → SHM) and then shm.signal_ready(...) before reaching shm.write_imu(...). An adapter that polls is_ready() in a tight loop could call read_imu() in the window between signal_ready and write_imu, receiving all-zeros — an invalid quaternion (0,0,0,0) rather than the identity. Reordering so write_imu precedes signal_ready would close this window.
Problem
Part 1 (#2239) added the coordinator pieces needed for whole-body control but did not include the policy. This PR lands the G1 GR00T walking policy as a first-class
ControlCoordinatortask and ships a blueprint to run it.Solution
Adds a
g1_groot_wbcControlTaskthat runs the balance and walk ONNX policies, with the arming / dry-run / ramp lifecycle wired through the coordinator.Adds the
unitree-g1-groot-wbcblueprint, which selects the backend with--simulation:G1WholeBodyConnection+transport_lcmat 500 HzMujocoSimModule+sim_mujoco_g1at 50 HzTwists on
/g1/cmd_velflow through the coordinator into the task's velocity command.How to Test
Real hardware:
In a Python shell:
Wait 10 seconds for the ramp. Then publish a
Twistto/g1/cmd_velto drive it.Simulation:
Contributor License Agreement