diff --git a/.gitignore b/.gitignore index aedee04af7..e8a05f75b0 100644 --- a/.gitignore +++ b/.gitignore @@ -86,3 +86,4 @@ htmlcov/ # Memory2 autorecord recording*.db +MUJOCO_LOG.TXT diff --git a/HACKATHON.md b/HACKATHON.md new file mode 100644 index 0000000000..2436d8ed6f --- /dev/null +++ b/HACKATHON.md @@ -0,0 +1,275 @@ +# DimOS · Hogwarts Hackathon Working Tree (`jamjam_ui` branch) + +> Branched from [`jamjamDimos/dimos:jamjam`](https://github.com/jamjamDimos/dimos/tree/jamjam), +> which itself layers YOLO-E open-vocab tracking + target-lock + bbox-distance +> follow on top of [`dimensionalOS/dimos`](https://github.com/dimensionalOS/dimos). +> +> This file documents **what `jamjam_ui` adds on top of `jamjam`** and how to +> run it. The upstream-only docs live in [`README.md`](./README.md); the +> jamjam control stack docs live in +> [`dimos/robot/custom/README.md`](./dimos/robot/custom/README.md). + +--- + +## TL;DR + +```bash +# MuJoCo simulation, no hardware needed — best for trying the UI +SIM=mujoco scripts/run-blueprint.sh +open http://localhost:7782/ + +# Real Go2 on its own AP (dimair10 / 192.168.12.1): +sudo route add -net 224.0.0.0/4 -interface lo0 # LCM multicast onto lo0 +scripts/run-blueprint.sh # default ROBOT_IP=192.168.12.1 +open http://localhost:7782/ +``` + +Logs land in `/tmp/dimos-logs/` with a `latest.log` symlink (auto-rotated; keeps newest 20). + +--- + +## Two-sentence overview of `jamjam_ui` + +`jamjam_ui` adds **a Harry-Potter "Marauder's Map" web app** (`dimos/apps/marauders_map/`) +on top of jamjam's perception/control stack. The app composes upstream +navigation + jamjam's tracking + the WebRTC connection into a single +blueprint, then renders a parchment-styled web UI on port 7782 that clients +can click to track people, click to navigate, and drive with WASD. + +Everything new lives **under `dimos/apps/`** so jamjam upstream can fast-forward +with zero merge conflict. + +--- + +## Project structure (after this branch's additions) + +``` +DimOS/ ← origin = dimensionalOS/dimos +│ ← jamjam = jamjamDimos/dimos +│ ← branch = jamjam_ui +│ +├── HACKATHON.md ★ this file +├── README.md upstream DimOS README — untouched +│ +├── dimos/apps/ ★ NEW NAMESPACE — application code lives here +│ └── marauders_map/ ★ The Wrath of Filch +│ ├── blueprint.py autoconnect composition of the full closed loop +│ ├── module.py ReidMapModule — Starlette+socket.io on :7782 +│ └── templates/ +│ ├── marauders_map.html parchment UI + intro overlay + quest chip + BGM +│ ├── hp_characters.js 100 HP characters × 10 lines, baked offline +│ ├── socketio.min.js +│ └── potion_latch.mp3 looped background music +│ +├── dimos/robot/custom/ jamjam's open-vocab control stack +│ │ (inherited from jamjam/jamjam — see custom/README.md) +│ ├── modules/ +│ │ ├── yoloe_tracking_module.py open-vocab YOLO-E + BoT-SORT tracking +│ │ ├── bbox_selection_module.py sole writer of /user_selected_bbox +│ │ ├── target_lock_module.py debouncing + spatial re-association +│ │ └── go2_startup_self_check_module.py +│ ├── tasks/ +│ │ └── bbox_distance_behavior_module.py pinhole bbox → Twist follower +│ ├── visualization/ +│ │ └── detection2d_overlay.py Detection2DArray → Rerun overlay +│ ├── blueprints/ autoconnect compositions +│ │ ├── yoloe_target_lock_distance_follow.py full closed-loop demo +│ │ ├── bbox_distance_follow.py minimal task test +│ │ ├── yoloe_keyboard_teleop.py +│ │ ├── yoloe_tracking_test.py +│ │ └── go2_startup_self_check.py +│ └── tests/ pytest-able module + task tests +│ +├── dimos/robot/unitree/ +│ ├── connection.py ★ FIX: free_walk() now also sends +│ │ SwitchJoystick(1027) — fixes the +│ │ "dog sways in place" bug. +│ └── go2/ +│ └── connection.py ★ FIX: 2 FreeWalk publishes on init + +│ 15s heartbeat thread; hasattr-guarded +│ so MuJoCo/DimSim/Replay don't crash. +│ +├── examples/ ★ HACKATHON SAMPLES +│ ├── go2_phone_control/ Phone-as-controller webapp (mock+real) +│ │ ├── server.py FastAPI + WebRTC + YOLO+OSNet +│ │ ├── index.html GameBoy keypad page +│ │ ├── people.html tracked-people map page +│ │ └── halt.py standalone E-STOP +│ ├── go2_walk_forward.py ★ minimal direct-WebRTC scripts +│ ├── go2_walk_backward.py +│ └── go2_person_aware_walk.py +│ +├── scripts/ +│ └── run-blueprint.sh ★ auto-logged runner; SIM/REPLAY/ROBOT_IP envs; +│ picks mjpython on macOS for MuJoCo's +│ main-thread GL context +│ +└── docs/research/ + ├── go2_marauders_map_design.md design rationale for the app + └── reid_capability_summary.md re-ID architecture notes +``` + +`★` marks paths added or modified on this branch. Everything else either came +from `jamjam/jamjam` or `dimensionalOS/dimos` upstream. + +--- + +## The Marauder's Map app — what the page actually does + +Open `http://localhost:7782/` after starting a blueprint. The page shows: + +| UI element | Behavior | +|---|---| +| **Opening parchment scroll** | Unfurls + per-character "magick-in" reveal of the story beats (~11s, skippable with Esc / corner link) | +| **Live floor plan** | Detected people pinned by stable HP character names; footprint trails | +| **Click a face on the map** | Confirm dialog → BBoxSelectionModule → TargetLockModule → bbox-distance follower drives `nav_cmd_vel` | +| **Click free map space** | Publishes `goal_request` → ReplanningAStarPlanner drives `nav_cmd_vel` toward that point | +| **Roster (right column) click** | Direct select-and-track (no confirm — list is a deliberate picker) | +| **`View all` gallery** | Full-screen face grid; click direct-tracks, interrupting any prior track or nav | +| **Teleop pad (bottom)** | Q/W/E/A/S/D + STOP buttons; `` = STOP; keyboard drives the same handlers as buttons | +| **Quest chip (top-left)** | Always-visible mini brief; click to expand the win conditions | +| **🔊/🔇 (top-right)** | Toggles `Potion Latch.mp3` looped BGM. Muted-autoplay on load → unmutes on first user interaction (the only autoplay shape every browser permits) | + +**Exclusivity rule.** The planner and the bbox-follower both publish to +`nav_cmd_vel`. The server-side `select` event publishes `stop_movement=True` +before adopting the new selection, and `navigate` clears any active bbox lock +first. So picking a face cancels a nav goal, and clicking a nav point cancels +a follow — both via a single user gesture. + +--- + +## What this branch adds on top of `jamjam/jamjam` + +### A. The app — `dimos/apps/marauders_map/` + +A new top-level namespace `dimos/apps/` for hackathon-flavored full-stack +applications. The marauders_map app composes: + +- `unitree_go2_basic` (connection + camera + lidar) +- `VoxelGridMapper` + `CostMapper` (world voxel map + 2D parchment walls) +- `YoloeTrackingModule` + `BBoxSelectionModule` + `TargetLockModule` (jamjam) +- `BBoxDistanceBehaviorModule` (jamjam follower) +- `MovementManager` (priority mux: nav vs teleop) +- `ReplanningAStarPlanner` (planner — consumes `goal_request`) +- `ReidMapModule` (this app's web server) + +…via `autoconnect(...)` with remappings, plus an LCM transport map. See +`dimos/apps/marauders_map/blueprint.py` for the wiring detail. + +### B. Hardware fixes — `dimos/robot/unitree/` + +1. **`free_walk()` + `SwitchJoystick(1027)`** in `connection.py`. Without the + second command, lx/ly are interpreted as body-pose lean (BalanceStand + semantics) and the dog only sways — even after FreeWalk has been issued. + The fix is modeled on `enable_rage_mode()` which already does both. +2. **FreeWalk init hardening** in `go2/connection.py`. Two FreeWalk publishes + on init (with a settle gap) plus a 15-second background heartbeat that + re-asserts FreeWalk. Guarded by `hasattr(connection, "free_walk")` so the + simulation backends (`MujocoConnection`, `DimSimConnection`, + `ReplayConnection`) don't crash. + +### C. The runner — `scripts/run-blueprint.sh` + +A thin wrapper around `dimos` that: + +- Resolves run mode by env var. Priority: **`SIM` > `REPLAY` > `ROBOT_IP`**. +- Uses `mjpython` on macOS when `SIM=mujoco` (MuJoCo's GL window has to own + the Cocoa main thread). +- Writes every run to `/tmp/dimos-logs/-.log` with a + header (git HEAD, uncommitted file count, full command). Auto-rotates to + the newest 20. +- Refreshes `/tmp/dimos-logs/latest.log` symlink on exit. + +### D. Examples — `examples/` + +- `go2_phone_control/` — standalone phone-as-controller webapp. The Mac + holds one persistent WebRTC session to the dog; a mobile-friendly page + posts button taps over HTTP. Two UIs (GameBoy keypad + people tracker) + plus an E-STOP script (`halt.py`) that opens a fresh WebRTC and forces + the dog to lie down. +- `go2_walk_forward.py`, `go2_walk_backward.py`, `go2_person_aware_walk.py` + — minimal direct-WebRTC scripts. No blueprint, no LCM. Useful as the + smallest possible reproducers for movement debugging. + +### E. Research notes — `docs/research/` + +- `go2_marauders_map_design.md` — design rationale for the app's web/control split. +- `reid_capability_summary.md` — re-ID architecture (OSNet/TorchReID). + +--- + +## Running it — picking the right mode + +| Goal | Command | +|---|---| +| MuJoCo physics simulation | `SIM=mujoco scripts/run-blueprint.sh` | +| DimSim simulation | `SIM=dimsim scripts/run-blueprint.sh` | +| Replay a recorded dataset | `REPLAY=1 scripts/run-blueprint.sh` | +| Real Go2, default IP `192.168.12.1` | `scripts/run-blueprint.sh` | +| Real Go2 elsewhere | `ROBOT_IP=192.168.123.18 scripts/run-blueprint.sh` | +| Run a different blueprint | `scripts/run-blueprint.sh yoloe-target-lock-distance-follow` | + +Anything after the blueprint name is forwarded straight to `dimos`. + +--- + +## Data flow (text wiring diagram) + +``` + ┌──────────────────────┐ + camera ───▶│ YoloeTrackingModule │── /color_image/yoloe_detections ─┐ + └──────────────────────┘ │ + ▼ + ┌──────────────────────┐ ┌────────────────────────┐ + Rerun click ▶│ BBoxSelectionModule │── /color_image/ │ TargetLockModule │ + Web click ──▶│ (sole writer of │ selected_bbox ─▶│ (debounce + reassoc.) │ + │ /user_selected_bbox)│ └────────────────────────┘ + └──────────────────────┘ │ + │ /color_image/locked_bbox + ▼ + ┌──────────────────────────────┐ + │ BBoxDistanceBehaviorModule │ + │ (bbox → Twist follower) │ + └──────────────────────────────┘ + │ + ┌──────────────────────┐ │ nav_cmd_vel + Map click ▶│ ReidMapModule │── goal_request ▼ + (free pt) │ (web :7782) │──────────────────▶ ReplanningA*Planner ──┐ + │ │── tele_cmd_vel ───────────────────┐ │ + └──────────────────────┘ ▼ ▼ + ┌──────────────────┐ + │ MovementManager │ ← priority mux + │ (nav | tele) │ + └──────────────────┘ + │ + │ cmd_vel + ▼ + ┌──────────────────┐ + │ GO2Connection │ ← WebRTC → real / MuJoCo + └──────────────────┘ +``` + +Both human-facing selectors (Rerun camera click and Marauder's Map web click) +feed the **same** `BBoxSelectionModule`. The planner and the follower share +`nav_cmd_vel`; exclusivity is enforced in `ReidMapModule`'s socket handlers. + +--- + +## E-STOP — the layered safety story + +Order of preference when the dog goes wrong: + +1. **Physical remote — `L2 + B`** (damp). Always works. +2. **Web `STOP` button or `` key** on the Marauder's Map page. +3. **`POST /api/halt`** on the phone-control server (`examples/go2_phone_control/server.py`). +4. **`halt.py`** — opens a fresh WebRTC session, sends 5× zero velocity, then `liedown`. + Survives blueprint hang/crash: + ```bash + PATH="$PWD/.venv/bin:$PATH" .venv/bin/python examples/go2_phone_control/halt.py + ``` + +--- + +## License + +Apache-2.0, inherited from upstream DimOS. diff --git a/dimos/apps/__init__.py b/dimos/apps/__init__.py new file mode 100644 index 0000000000..7d6879a0b8 --- /dev/null +++ b/dimos/apps/__init__.py @@ -0,0 +1,8 @@ +"""User-built applications layered on top of the DimOS core / robot stacks. + +Anything in this package belongs to the application authors. The core +`dimos/robot/custom/` modules (e.g. jamjam's BBox / target-lock / behavior +modules) are imported and composed here but never modified -- so the upstream +jamjam control stack can be fast-forwarded independently of the apps that +ride on top. +""" diff --git a/dimos/apps/marauders_map/__init__.py b/dimos/apps/marauders_map/__init__.py new file mode 100644 index 0000000000..c5ec421968 --- /dev/null +++ b/dimos/apps/marauders_map/__init__.py @@ -0,0 +1,19 @@ +"""Harry-Potter "Marauder's Map" web app for the Go2. + +A self-contained application built ON TOP of the jamjam target-lock / +distance-follow control stack (in ``dimos.robot.custom``). It composes those +modules into a blueprint and adds a parchment-styled web UI on port 7782 with: + +* A live floor plan of detected people, each mapped to a stable HP character +* A click-to-track flow that drives the same BBoxSelectionModule the Rerun + camera view already uses (no modifications to jamjam modules required -- + we synthesize a PointStamped on the existing ``clicked_point`` input) +* An on-page teleop pad (W/A/S/D/Q/E + STOP) that publishes Twist on + ``tele_cmd_vel`` (MovementManager's existing priority lane) +* Hover tooltips, a full-roster gallery modal, HP-themed quote bubbles, + per-person photo lightbox, and per-laptop SSID viewpoint labels + +The whole app lives under this directory and depends only on public +interfaces of ``dimos.robot.custom`` -- so jamjam upstream can be +fast-forwarded with zero merge conflict here. +""" diff --git a/dimos/apps/marauders_map/blueprint.py b/dimos/apps/marauders_map/blueprint.py new file mode 100644 index 0000000000..f8e96ea484 --- /dev/null +++ b/dimos/apps/marauders_map/blueprint.py @@ -0,0 +1,222 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Marauder's Map blueprint for the Go2. + +A *thin* composition layer on top of jamjam's target-lock / distance-follow / +selection stack. This file lives under ``dimos/apps/`` so the upstream jamjam +control modules (``dimos/robot/custom/...``) and the dimos navigation / +movement manager can be fast-forwarded with zero merge conflict here. + +How selection is wired without touching jamjam's modules: + * Rerun camera-image click → BBoxSelectionModule.clicked_point (existing) + * Web map click → ReidMapModule synthesizes a PointStamped at + the chosen person's bbox center with + ``frame_id="color_image/web_click"``, which + flows into the SAME clicked_point topic + → BBoxSelectionModule treats it identically + to a Rerun camera click. + +Other web→robot wiring (also non-invasive): + * ReidMapModule.cmd_vel → ``tele_cmd_vel`` (mux'd with priority by + MovementManager — replaces pygame KeyboardTeleop, which crashes on + macOS from a worker thread). + * ReidMapModule.goal_request → ReplanningAStarPlanner (click free map + space to publish a navigation goal). + +Run: + uv run dimos --robot-ip 192.168.12.1 --rerun-open native run go2-marauders-map + dimos --replay run go2-marauders-map # no hardware, replay data has people + +Open: http://localhost:7782/ (the map) + Rerun viewer (camera + 3D + bboxes) +""" + +from __future__ import annotations + +from typing import Any + +from dimos.apps.marauders_map.module import ReidMapModule +from dimos.core.coordination.blueprints import autoconnect +from dimos.core.coordination.module_coordinator import ModuleCoordinator +from dimos.core.global_config import global_config +from dimos.core.transport import LCMTransport +from dimos.mapping.costmapper import CostMapper +from dimos.mapping.voxels import VoxelGridMapper +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.navigation.movement_manager.movement_manager import MovementManager +from dimos.navigation.replanning_a_star.module import ReplanningAStarPlanner +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule +from dimos.robot.custom.modules.target_lock_module import TargetLockModule +from dimos.robot.custom.modules.yoloe_tracking_module import ( + YoloeTrackingModule, + _require_yoloe_lrpc_model, +) +from dimos.robot.custom.tasks.bbox_distance_behavior_module import ( + BBoxDistanceBehaviorModule, +) +from dimos.robot.custom.visualization.detection2d_overlay import ( + selected_bbox_overlay, + yoloe_overlay, +) +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, + unitree_go2_basic, +) +from dimos.robot.unitree.go2.connection import GO2Connection +from dimos.visualization.vis_module import vis_module + +# LCM topic / Rerun entity wiring (matches yoloe_target_lock_distance_follow). +_YOLOE_DETECTIONS_TOPIC = "/color_image/yoloe_detections" +_USER_SELECTED_BBOX_TOPIC = "/color_image/selected_bbox" +_LOCKED_BBOX_TOPIC = "/color_image/locked_bbox" +_NAV_CMD_VEL_TOPIC = "/nav_cmd_vel" +_TELE_CMD_VEL_TOPIC = "/tele_cmd_vel" + +_YOLOE_DETECTIONS_ENTITY = "world/color_image/yoloe_detections" +_USER_SELECTED_BBOX_ENTITY = "world/color_image/selected_bbox" +_LOCKED_BBOX_ENTITY = "world/color_image/locked_bbox" + + +def _marauders_rerun_blueprint() -> Any: + import rerun as rr + import rerun.blueprint as rrb + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + f"-{_YOLOE_DETECTIONS_ENTITY}", + f"-{_USER_SELECTED_BBOX_ENTITY}", + f"-{_LOCKED_BBOX_ENTITY}", + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D(plane=rr.components.Plane3D.XY.with_distance(0.5)), + overrides={"world/lidar": rrb.EntityBehavior(visible=False)}, + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_marauders_rerun_config = { + **go2_rerun_config, + "blueprint": _marauders_rerun_blueprint, + "visual_override": { + **go2_rerun_config["visual_override"], + _YOLOE_DETECTIONS_ENTITY: yoloe_overlay, + _USER_SELECTED_BBOX_ENTITY: selected_bbox_overlay, + _LOCKED_BBOX_ENTITY: selected_bbox_overlay, + }, +} + +_marauders_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_marauders_rerun_config, +) + + +# Composition (jamjam control stack untouched; only routed): +# * unitree_go2_basic — connection + camera + lidar + base viewer wiring +# * VoxelGridMapper — world-frame voxel map (needed for ReidMapModule's +# Detection3DPC.from_2d person localization) +# * CostMapper — 2D occupancy/cost grid, drawn as the parchment walls +# * YoloeTrackingModule — YOLO-E open-vocab detect + BoT-SORT track +# * BBoxSelectionModule — single source of truth for "which person is selected" +# (accepts Rerun click + our synthesized web click via +# the same clicked_point input) +# * TargetLockModule — locked-bbox state machine +# * BBoxDistanceBehaviorModule — distance-follow control loop +# * MovementManager — tele/nav cmd_vel mux +# * ReplanningAStarPlanner — point-to-go navigation (consumes goal_request) +# * ReidMapModule — Marauder's Map web view on :7782 (UI layer only) +go2_marauders_map = ( + autoconnect( + unitree_go2_basic, + VoxelGridMapper.blueprint(emit_every=5), + CostMapper.blueprint(), + _marauders_vis, + YoloeTrackingModule.blueprint(), + BBoxSelectionModule.blueprint(), + TargetLockModule.blueprint(), + BBoxDistanceBehaviorModule.blueprint(), + MovementManager.blueprint(), + ReplanningAStarPlanner.blueprint(), + ReidMapModule.blueprint( + camera_info=GO2Connection.camera_info_static, + enable_reid=False, # rely on YOLO-E/BoT-SORT + spatial re-association + ), + ) + .global_config( + n_workers=13, + robot_model="unitree_go2", + ) + .remappings( + [ + # ─── Selection chain ──────────────────────────────────────────── + # Both Rerun camera click and our web "click" feed BBoxSelectionModule + # through its existing `clicked_point` input. BBoxSelectionModule is + # the sole writer of /user_selected_bbox; TargetLockModule consumes + # it; the locked output is remapped to /selected_bbox for the + # follower task and the Rerun green-box overlay. + (BBoxSelectionModule, "selected_bbox", "user_selected_bbox"), + (TargetLockModule, "selected_bbox", "user_selected_bbox"), + (TargetLockModule, "locked_bbox", "selected_bbox"), + # ─── Follow task → cmd_vel mux ────────────────────────────────── + (BBoxDistanceBehaviorModule, "cmd_vel", "nav_cmd_vel"), + # ─── Web teleop → mux (priority) ──────────────────────────────── + (ReidMapModule, "cmd_vel", "tele_cmd_vel"), + # ─── Teleop activity → pause follow + clear selection ─────────── + # jamjam e9a82939: BBoxSelectionModule and TargetLockModule both gained + # `stop_movement: In[Bool]` so any source of teleop-activity wipes the + # current target and the old bbox can't auto-restart the follow on the + # next detection frame. + (MovementManager, "stop_movement", "teleop_active"), + (BBoxSelectionModule, "stop_movement", "teleop_active"), + (TargetLockModule, "stop_movement", "teleop_active"), + # ─── Marauder's Map world localization ────────────────────────── + (ReidMapModule, "pointcloud", "global_map"), + ] + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, Detection2DArray + ), + ("user_selected_bbox", Detection2DArray): LCMTransport( + _USER_SELECTED_BBOX_TOPIC, Detection2DArray + ), + ("locked_bbox", Detection2DArray): LCMTransport(_LOCKED_BBOX_TOPIC, Detection2DArray), + ("nav_cmd_vel", Twist): LCMTransport(_NAV_CMD_VEL_TOPIC, Twist), + ("tele_cmd_vel", Twist): LCMTransport(_TELE_CMD_VEL_TOPIC, Twist), + } + ) + .requirements(_require_yoloe_lrpc_model) +) + +__all__ = ["go2_marauders_map"] + + +if __name__ == "__main__": + ModuleCoordinator.build(go2_marauders_map).loop() diff --git a/dimos/apps/marauders_map/module.py b/dimos/apps/marauders_map/module.py new file mode 100644 index 0000000000..c1ab5d2c05 --- /dev/null +++ b/dimos/apps/marauders_map/module.py @@ -0,0 +1,835 @@ +#!/usr/bin/env python3 + +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Marauder's Map web view. + +A Harry-Potter "Marauder's Map" styled floor plan of the area the robot has +mapped, with every detected person plotted at their world position as an +ID-tagged footprint marker. Two display modes (toggled in the browser): + + * ID only -- just the code, e.g. ``#3`` + * ID + photo -- the code plus the person's cropped camera thumbnail + +Data sources (all already produced by ``unitree-go2-detection``): + * ``global_costmap`` (OccupancyGrid) -> the parchment floor plan + * ``color_image`` (Image) -> source for per-person crops + * ``detections`` (Detection2DArray)-> per-person id/class/bbox + * ``pointcloud`` (PointCloud2, world frame) -> world-frame localization + * ``odom`` (PoseStamped) -> the robot's own dot on the map + +World localization reuses ``Detection3DPC.from_2d`` (the same projection +``Detection3DModule`` uses) so a person's map position is the centroid of the +world points inside their detection box -- and the original track_id / class / +crop survive the projection. + +Same authoring pattern as the reID-marking feature; served on its own port +(default 7782) so it can run alongside the command center (7779) and the +marking view (7781). +""" + +import asyncio +import base64 +import math +from pathlib import Path as FilePath +import subprocess +import sys +import threading +import time +from typing import Any + +import cv2 +from dimos_lcm.std_msgs import Bool # planner stop_movement uses LCM std_msgs/Bool +import numpy as np +from reactivex.disposable import Disposable +import socketio # type: ignore[import-untyped] +from starlette.applications import Starlette +from starlette.responses import FileResponse, Response +from starlette.routing import Route +import uvicorn + +from dimos.constants import DEFAULT_THREAD_JOIN_TIMEOUT +from dimos.core.core import rpc +from dimos.core.global_config import global_config +from dimos.core.module import Module, ModuleConfig +from dimos.core.stream import In, Out +from dimos.msgs.geometry_msgs.PointStamped import PointStamped +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.msgs.nav_msgs.OccupancyGrid import OccupancyGrid +from dimos.msgs.sensor_msgs.CameraInfo import CameraInfo +from dimos.msgs.sensor_msgs.Image import Image +from dimos.msgs.sensor_msgs.PointCloud2 import PointCloud2 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.perception.detection.type.detection2d.imageDetections2D import ImageDetections2D +from dimos.perception.detection.type.detection3d.pointcloud import Detection3DPC +from dimos.utils.logging_config import setup_logger + +logger = setup_logger() + +_TEMPLATES_DIR = FilePath(__file__).parent / "templates" +_MAP_HTML = _TEMPLATES_DIR / "marauders_map.html" +_SOCKETIO_JS = _TEMPLATES_DIR / "socketio.min.js" +# 100 Harry Potter characters x 10 lines each, baked locally so the map works +# fully offline (the field network has no internet). +_HP_JS = _TEMPLATES_DIR / "hp_characters.js" +# Ambient BGM — "Potion Latch", loop-played in the page. File ships in this +# folder so the offline field network works. +_BGM_MP3 = _TEMPLATES_DIR / "potion_latch.mp3" + +PERSON_CLASS_ID = 0 # COCO "person" +DOG_CLASS_ID = 16 # COCO "dog" -- used as the heuristic for "another Go2 quadruped" +# Dog roster ids live in a separate numeric range so they never collide with +# the small long-term person ids assigned by reID. +DOG_ID_BASE = 1_000_000 + + +class ReidMapConfig(ModuleConfig): + camera_info: CameraInfo + + port: int = 7782 + # The pointcloud frame (global map is published in the world frame). + world_frame_hint: str = "world" + # Only plot people; set False to plot every detected object. + persons_only: bool = True + # Also flag other Unitree-Go2-like quadrupeds (COCO "dog") with a special icon. + detect_robot_dogs: bool = True + robot_dog_class_id: int = DOG_CLASS_ID + # Throttles (Hz). Map changes slowly; detection localization is CPU heavy. + map_emit_hz: float = 1.0 + detect_hz: float = 3.0 + # How long a person stays on the map after they were last seen (sec). + person_ttl_sec: float = 8.0 + # Re-grab a person's photo at most this often (sec). + photo_refresh_sec: float = 4.0 + photo_quality: int = 70 + photo_max_width: int = 120 + # Larger crop kept per id and served on demand for the "view large" lightbox. + photo_full_quality: int = 82 + photo_full_max_width: int = 360 + + # Per-dog label. If blank, auto-detect from the WiFi SSID this machine is + # joined to (each Go2 broadcasts its own AP, e.g. "dimair13") so a viewpoint + # is identified by which dog's network it came from. + robot_label: str = "" + wifi_interface: str = "en0" + + # reID: collapse the detector's short-term track_ids into stable long-term + # IDs (osnet appearance embeddings) so one person == one code, and the + # roster count reflects distinct people rather than track fragments. + enable_reid: bool = True + # Frames a track must accumulate before it can be matched/assigned an ID. + # Lower = snappier first appearance, higher = more reliable matching. + reid_min_embeddings: int = 3 + reid_similarity: float = 0.63 + + # Spatial-temporal re-association (used when appearance reID is OFF). Keeps + # the SAME person on one stable id by position continuity: a brand-new + # detector track that appears within `reassoc_radius_m` of where a recently + # (within `reassoc_window_s`) lost person was, and that spot is not currently + # held by another live person, inherits that person's id. Robust to BoT-SORT + # reassigning track ids across brief occlusions, without appearance merges. + reassoc_radius_m: float = 1.5 + reassoc_window_s: float = 12.0 + + # Web teleop: linear / angular speed when a movement button is pressed. + teleop_linear_speed: float = 0.5 # m/s + teleop_angular_speed: float = 0.8 # rad/s + + +class ReidMapModule(Module): + """Plots detected people on a Marauder's-Map-styled floor plan web view.""" + + config: ReidMapConfig + + global_costmap: In[OccupancyGrid] + color_image: In[Image] + detections: In[Detection2DArray] + pointcloud: In[PointCloud2] + odom: In[PoseStamped] + + # Web -> robot wiring: + # * cmd_vel: teleop Twist, remapped to tele_cmd_vel in the blueprint. + # * clicked_point: a synthesized camera-pixel click that lands inside + # the chosen person's bbox; flows into the SAME BBoxSelectionModule + # input the Rerun camera-view click uses. This lets us drive selection + # through jamjam's public API without modifying their module. + cmd_vel: Out[Twist] + clicked_point: Out[PointStamped] + # Click-to-navigate: the user clicks a free point on the parchment map and we + # publish that world-frame target on goal_request; ReplanningAStarPlanner + # picks it up and drives via nav_cmd_vel. stop_movement is the kill for that + # planner (used to make "follow person" and "navigate to point" mutually + # exclusive — only one drives nav_cmd_vel at a time). + goal_request: Out[PoseStamped] + stop_movement: Out[Bool] + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._uvicorn_server_thread: threading.Thread | None = None + self._uvicorn_server: uvicorn.Server | None = None + self.sio: socketio.AsyncServer | None = None + self.app: Any = None + self._broadcast_loop: asyncio.AbstractEventLoop | None = None + self._broadcast_thread: threading.Thread | None = None + self._stopped = False + + self._lock = threading.Lock() + self._latest_image: Image | None = None + self._latest_pc: PointCloud2 | None = None + self._last_map_emit = 0.0 + self._last_detect = 0.0 + # Keyed by stable long-term reID id (or track_id if reID disabled): + # {x, y, name, class_id, photo, photo_ts, last_seen} + self._people: dict[int, dict[str, Any]] = {} + # Cached map metadata so the frontend can map world -> pixel for markers. + self._map_meta: dict[str, Any] | None = None + # Long-term appearance reID (lazily built on start if enabled). + self._idsystem: Any = None + # Label identifying which dog/WiFi this viewpoint belongs to. + self._label: str = "go2" + # Spatial-temporal identity layer (used when appearance reID is off): + # _track_to_stable: detector track_id -> stable display id + # _stable_pos: stable id -> {x, y, last_seen} (for re-association) + self._track_to_stable: dict[int, int] = {} + self._stable_pos: dict[int, dict[str, float]] = {} + self._next_stable: int = 0 + # Per-stable-id mapping to the latest raw LCM Detection2D the tracker + # produced for that person -- used when the web client clicks a + # character to drive TargetLockModule. + self._stable_to_raw_det: dict[int, Any] = {} + self._latest_detection_header: Any = None + + # ------------------------------------------------------------------ # + # Lifecycle + # ------------------------------------------------------------------ # + @rpc + def start(self) -> None: + super().start() + self._label = self.config.robot_label.strip() or self._detect_wifi_ssid() or "go2" + logger.info("Marauder's Map: viewpoint label = %s", self._label) + self._create_server() + self._start_broadcast_loop() + self._uvicorn_server_thread = threading.Thread(target=self._run_uvicorn_server, daemon=True) + self._uvicorn_server_thread.start() + logger.info(f"Marauder's Map available at http://localhost:{self.config.port}/") + + if self.config.enable_reid: + try: + from dimos.models.embedding.treid import TorchReIDModel + from dimos.perception.detection.reid.embedding_id_system import ( + EmbeddingIDSystem, + ) + + self._idsystem = EmbeddingIDSystem( + model=TorchReIDModel, + padding=0, + similarity_threshold=self.config.reid_similarity, + min_embeddings_for_matching=self.config.reid_min_embeddings, + ) + logger.info("Marauder's Map: reID (long-term IDs) enabled") + except Exception: + logger.exception("Marauder's Map: reID unavailable, falling back to track_id") + self._idsystem = None + + for stream, handler in ( + (self.global_costmap, self._on_costmap), + (self.color_image, self._on_color_image), + (self.pointcloud, self._on_pointcloud), + (self.detections, self._on_detections), + (self.odom, self._on_odom), + ): + try: + unsub = stream.subscribe(handler) + self.register_disposable(Disposable(unsub)) + except Exception: + logger.exception("Marauder's Map: subscribe failed for %s", handler.__name__) + + @rpc + def stop(self) -> None: + if self._stopped: + return + self._stopped = True + if self._uvicorn_server: + self._uvicorn_server.should_exit = True + if self._broadcast_loop and not self._broadcast_loop.is_closed(): + self._broadcast_loop.call_soon_threadsafe(self._broadcast_loop.stop) + if self._broadcast_thread and self._broadcast_thread.is_alive(): + self._broadcast_thread.join(timeout=DEFAULT_THREAD_JOIN_TIMEOUT) + if self._uvicorn_server_thread and self._uvicorn_server_thread.is_alive(): + self._uvicorn_server_thread.join(timeout=DEFAULT_THREAD_JOIN_TIMEOUT) + super().stop() + + # ------------------------------------------------------------------ # + # Server + # ------------------------------------------------------------------ # + def _create_server(self) -> None: + self.sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*") + + async def serve_index(request): # type: ignore[no-untyped-def] + if _MAP_HTML.exists(): + return FileResponse(_MAP_HTML, media_type="text/html") + return Response(content="marauders_map.html not found", status_code=503) + + async def serve_socketio_js(request): # type: ignore[no-untyped-def] + if _SOCKETIO_JS.exists(): + return FileResponse(_SOCKETIO_JS, media_type="application/javascript") + return Response(content="socketio.min.js not found", status_code=503) + + async def serve_hp_js(request): # type: ignore[no-untyped-def] + if _HP_JS.exists(): + return FileResponse(_HP_JS, media_type="application/javascript") + return Response(content="hp_characters.js not found", status_code=503) + + async def serve_bgm(request): # type: ignore[no-untyped-def] + """Background ambient music (loop-played in the page). + + Served from this folder so the page works on the offline field + network. Audio kept under /vendor/* for consistency with the + other static asset routes. + """ + if _BGM_MP3.exists(): + return FileResponse(_BGM_MP3, media_type="audio/mpeg") + return Response(content="bgm.mp3 not found", status_code=503) + + async def serve_house_icon(request): # type: ignore[no-untyped-def] + """Hogwarts house badge for a character; only 4 known files.""" + name = request.path_params.get("house", "") + if name not in ("gryffindor", "hufflepuff", "ravenclaw", "slytherin"): + return Response(content="unknown house", status_code=404) + path = _TEMPLATES_DIR / "icons" / f"{name}.png" + if path.exists(): + return FileResponse(path, media_type="image/png") + return Response(content="icon not found", status_code=404) + + starlette_app = Starlette( + routes=[ + Route("/", serve_index), + Route("/vendor/socketio.js", serve_socketio_js), + Route("/vendor/hp_characters.js", serve_hp_js), + Route("/vendor/bgm.mp3", serve_bgm), + Route("/icons/{house}.png", serve_house_icon), + ] + ) + self.app = socketio.ASGIApp(self.sio, starlette_app) + + @self.sio.event # type: ignore[untyped-decorator] + async def connect(sid, environ) -> None: # type: ignore[no-untyped-def] + logger.info(f"Marauder's Map client connected: {sid}") + # Send the cached map + current people so a fresh client is not blank. + if self._map_meta is not None: + await self.sio.emit("map", self._map_meta, room=sid) # type: ignore[union-attr] + await self.sio.emit("people", {"people": self._people_payload()}, room=sid) # type: ignore[union-attr] + + @self.sio.event # type: ignore[untyped-decorator] + async def get_photo(sid, data) -> None: # type: ignore[no-untyped-def] + """Client asked for a full-size portrait to view large.""" + try: + ident = int(data.get("id")) + except Exception: + return + entry = self._people.get(ident) + full = entry.get("photo_full") if entry else None + await self.sio.emit( # type: ignore[union-attr] + "photo_full", + {"id": ident, "img": full, "name": entry.get("name") if entry else None}, + room=sid, + ) + + @self.sio.event # type: ignore[untyped-decorator] + async def teleop(sid, data) -> None: # type: ignore[no-untyped-def] + """Web button held -> emit Twist on cmd_vel (->tele_cmd_vel topic). + + Payload is a Twist in robot units; the button handler in the page + scales the axes by the configured linear/angular speed so this + module stays a thin pipe. An all-zero Twist arrives on button + release and tells MovementManager to hand back to nav. + """ + try: + lx = float(data.get("linear_x", 0.0)) + ly = float(data.get("linear_y", 0.0)) + az = float(data.get("angular_z", 0.0)) + except Exception: + return + try: + self.cmd_vel.publish( + Twist(linear=Vector3(lx, ly, 0.0), angular=Vector3(0.0, 0.0, az)) + ) + except Exception: + logger.exception("Marauder's Map: teleop publish failed") + + @self.sio.event # type: ignore[untyped-decorator] + async def select(sid, data) -> None: # type: ignore[no-untyped-def] + """Web click on a character -> drive TargetLockModule. + Also cancels any active nav goal — only one driver of nav_cmd_vel + at a time (planner OR bbox-follow).""" + try: + ident = int(data.get("id")) + except Exception: + return + # Cancel any nav-to-point goal first so the planner stops driving + # before bbox-follow takes over. + try: + self.stop_movement.publish(Bool(data=True)) + except Exception: + logger.exception("Marauder's Map: stop_movement publish failed (on select)") + self._publish_selection(ident) + + @self.sio.event # type: ignore[untyped-decorator] + async def deselect(sid, data) -> None: # type: ignore[no-untyped-def] + """Web clears the selection and interrupts any active task.""" + try: + self.stop_movement.publish(Bool(data=True)) + except Exception: + logger.exception("Marauder's Map: stop_movement publish failed (on deselect)") + self._publish_selection(None) + + @self.sio.event # type: ignore[untyped-decorator] + async def navigate(sid, data) -> None: # type: ignore[no-untyped-def] + """Web click on a free point of the map -> drive the planner. + Payload: {"wx": , "wy": }. + Cancels any active follow (planner & bbox-follow share nav_cmd_vel + — exclusivity is enforced here).""" + try: + wx = float(data.get("wx")) + wy = float(data.get("wy")) + except Exception: + return + # Stop any active task before handing nav_cmd_vel back to the planner. + try: + self.stop_movement.publish(Bool(data=True)) + self._publish_selection(None) + except Exception: + logger.exception("Marauder's Map: deselect on navigate failed") + try: + self.goal_request.publish( + PoseStamped( + position=(wx, wy, 0.0), + orientation=(0.0, 0.0, 0.0, 1.0), + frame_id=self.config.world_frame_hint, + ) + ) + # Echo back so the UI can render a goal pin without waiting + # for an odom round-trip. + await self.sio.emit( # type: ignore[union-attr] + "goal", {"x": wx, "y": wy} + ) + except Exception: + logger.exception("Marauder's Map: goal_request publish failed") + + @self.sio.event # type: ignore[untyped-decorator] + async def cancel_nav(sid, data) -> None: # type: ignore[no-untyped-def] + """Stop the planner (publishes Bool(True) on stop_movement).""" + try: + self.stop_movement.publish(Bool(data=True)) + await self.sio.emit("goal", None) # type: ignore[union-attr] + except Exception: + logger.exception("Marauder's Map: cancel_nav publish failed") + + def _start_broadcast_loop(self) -> None: + def loop_runner() -> None: + self._broadcast_loop = asyncio.new_event_loop() + asyncio.set_event_loop(self._broadcast_loop) + try: + self._broadcast_loop.run_forever() + finally: + self._broadcast_loop.close() + + self._broadcast_thread = threading.Thread(target=loop_runner, daemon=True) + self._broadcast_thread.start() + + def _run_uvicorn_server(self) -> None: + config = uvicorn.Config( + self.app, + host=global_config.listen_host, + port=self.config.port, + log_level="error", + ) + self._uvicorn_server = uvicorn.Server(config) + self._uvicorn_server.run() + + # ------------------------------------------------------------------ # + # Stream handlers + # ------------------------------------------------------------------ # + def _on_color_image(self, image: Image) -> None: + with self._lock: + self._latest_image = image + + def _on_pointcloud(self, pc: PointCloud2) -> None: + with self._lock: + self._latest_pc = pc + + def _on_odom(self, msg: PoseStamped) -> None: + self._emit( + "robot", + {"x": float(msg.position.x), "y": float(msg.position.y), "label": self._label}, + ) + + def _on_costmap(self, grid: OccupancyGrid) -> None: + now = time.monotonic() + if now - self._last_map_emit < 1.0 / max(self.config.map_emit_hz, 0.1): + return + self._last_map_emit = now + try: + meta = self._encode_map(grid) + except Exception: + logger.exception("Marauder's Map: failed to encode occupancy grid") + return + self._map_meta = meta + self._emit("map", meta) + + def _on_detections(self, msg: Detection2DArray) -> None: + # Header is stashed unconditionally so the web select event can still + # emit a valid Detection2DArray even on frames we throttle past. + self._latest_detection_header = msg.header + + # Build raw-detection lookup (by LCM id string == track_id) so a web + # click for stable_id N can construct a single-detection array from + # the *current* frame, not the throttled-stale one. + raw_by_id = {str(getattr(r, "id", "") or ""): r for r in msg.detections} + + now = time.monotonic() + if now - self._last_detect < 1.0 / max(self.config.detect_hz, 0.1): + self._refresh_stable_raw(msg.detections, raw_by_id) + return + self._last_detect = now + + with self._lock: + image = self._latest_image + pc = self._latest_pc + if image is None or pc is None: + return + + # World <- camera_optical transform, same lookup Detection3DModule uses. + try: + transform = self.tf.get("camera_optical", pc.frame_id, image.ts, 1.0) + except Exception: + transform = None + if not transform: + return + + try: + parsed = ImageDetections2D.from_ros_detection2d_array(image, msg) + except Exception: + logger.exception("Marauder's Map: failed to parse detections") + return + + # People (class 0) plus, optionally, other Go2-like quadrupeds (dog). + persons = [det for det in parsed if int(det.class_id) == PERSON_CLASS_ID] + dogs = ( + [det for det in parsed if int(det.class_id) == self.config.robot_dog_class_id] + if self.config.detect_robot_dogs + else [] + ) + + # Same-frame co-occurrence = different people; feed this to reID so two + # people standing together never collapse into one ID. + if self._idsystem is not None and len(persons) > 1: + try: + self._idsystem.add_negative_constraints([int(d.track_id) for d in persons]) + except Exception: + logger.exception("Marauder's Map: reID negative-constraint update failed") + + seen_now = time.time() + for det in persons: + center = self._project_center(det, pc, transform) + if center is None: + continue + x, y = center + ident = self._resolve_identity(det, x, y, seen_now) + if ident is None: + continue # appearance reID not confident yet + self._write_entry(ident, det, x, y, "person", seen_now) + raw = raw_by_id.get(str(int(det.track_id))) + if raw is not None: + self._stable_to_raw_det[ident] = raw + + for det in dogs: + center = self._project_center(det, pc, transform) + if center is None: + continue + ident = DOG_ID_BASE + int(det.track_id) + self._write_entry(ident, det, center[0], center[1], "dog", seen_now) + raw = raw_by_id.get(str(int(det.track_id))) + if raw is not None: + self._stable_to_raw_det[ident] = raw + + self._expire_people() + self._prune_stable(seen_now) + self._emit("people", {"people": self._people_payload()}) + + def _refresh_stable_raw( + self, raw_dets: list[Any], raw_by_id: dict[str, Any] + ) -> None: + """Keep _stable_to_raw_det fresh on frames we throttled past. + + Doesn't run reID / projection — only re-binds the stable ids we + already know to whatever raw detection currently carries the same + track id, so a web click in this frame still produces a usable + single-detection array. + """ + for sid, prev_raw in list(self._stable_to_raw_det.items()): + track_id = str(getattr(prev_raw, "id", "") or "") + if not track_id: + continue + cur = raw_by_id.get(track_id) + if cur is not None: + self._stable_to_raw_det[sid] = cur + + def _project_center( + self, det: Any, pc: PointCloud2, transform: Any + ) -> tuple[float, float] | None: + """Project a 2D detection to a world-frame (x, y) via the lidar map.""" + try: + det3d = Detection3DPC.from_2d( + det, + world_pointcloud=pc, + camera_info=self.config.camera_info, + world_to_optical_transform=transform, + ) + except Exception: + det3d = None + if det3d is None: + return None + return float(det3d.center.x), float(det3d.center.y) + + def _resolve_identity(self, det: Any, x: float, y: float, now: float) -> int | None: + """Stable display id for a person. + + With appearance reID enabled, defer to it. Otherwise use spatial-temporal + re-association on top of the detector's track id so the SAME person keeps + one id even when the tracker hands out a fresh track id after a gap. + """ + if self._idsystem is not None: + try: + lid = int(self._idsystem.register_detection(det)) + except Exception: + logger.exception("Marauder's Map: reID register failed") + return int(det.track_id) + return None if lid < 0 else lid + + tid = int(det.track_id) + sid = self._track_to_stable.get(tid) + if sid is None: + sid = self._reassociate(x, y, now) + if sid is None: + sid = self._next_stable + self._next_stable += 1 + self._track_to_stable[tid] = sid + self._stable_pos[sid] = {"x": x, "y": y, "last_seen": now} + return sid + + def _reassociate(self, x: float, y: float, now: float) -> int | None: + """Find a recently-lost (not currently active) stable id near (x, y).""" + best: int | None = None + best_d = self.config.reassoc_radius_m + active_cut = now - 1.0 # an id seen within 1s is live -> never steal it + for sid, info in self._stable_pos.items(): + last = info["last_seen"] + if last > active_cut: + continue + if now - last > self.config.reassoc_window_s: + continue + d = math.hypot(x - info["x"], y - info["y"]) + if d < best_d: + best_d, best = d, sid + return best + + def _prune_stable(self, now: float) -> None: + cutoff = now - self.config.reassoc_window_s + dead = [s for s, i in self._stable_pos.items() if i["last_seen"] < cutoff] + for s in dead: + self._stable_pos.pop(s, None) + for t in [t for t, s in self._track_to_stable.items() if s in dead]: + self._track_to_stable.pop(t, None) + + def _write_entry( + self, ident: int, det: Any, x: float, y: float, kind: str, seen_now: float + ) -> None: + """Update a person's roster/map entry.""" + entry = self._people.get(ident, {}) + entry.update( + { + "x": x, + "y": y, + "name": str(det.name), + "class_id": int(det.class_id), + "kind": kind, + "last_seen": seen_now, + } + ) + # Refresh thumbnail + full-size crop occasionally (crop + jpeg is not free). + if seen_now - entry.get("photo_ts", 0.0) > self.config.photo_refresh_sec: + try: + crop = det.cropped_image(padding=8) + entry["photo"] = crop.to_base64( + quality=self.config.photo_quality, + max_width=self.config.photo_max_width, + ) + # Larger version kept server-side; served on demand (not broadcast). + entry["photo_full"] = crop.to_base64( + quality=self.config.photo_full_quality, + max_width=self.config.photo_full_max_width, + ) + entry["photo_ts"] = seen_now + except Exception: + pass + self._people[ident] = entry + + # ------------------------------------------------------------------ # + # Helpers + # ------------------------------------------------------------------ # + def _detect_wifi_ssid(self) -> str: + """Best-effort current WiFi SSID, used as this dog's label.""" + iface = self.config.wifi_interface + cmds: list[list[str]] = [] + if sys.platform == "darwin": + cmds = [ + ["networksetup", "-getairportnetwork", iface], + ["ipconfig", "getsummary", iface], + ] + else: # linux + cmds = [["iwgetid", "-r"], ["nmcli", "-t", "-f", "active,ssid", "dev", "wifi"]] + + for cmd in cmds: + try: + out = subprocess.run(cmd, capture_output=True, text=True, timeout=5).stdout + except Exception: + continue + if not out: + continue + # networksetup: "Current Wi-Fi Network: dimair13" + if "Current Wi-Fi Network:" in out: + return out.split("Current Wi-Fi Network:", 1)[1].strip().splitlines()[0] + # ipconfig getsummary: a line " SSID : dimair13" + for line in out.splitlines(): + s = line.strip() + if s.upper().startswith("SSID") and ":" in s: + return s.split(":", 1)[1].strip() + # nmcli "yes:dimair13" + if s.startswith("yes:"): + return s.split(":", 1)[1].strip() + # iwgetid -r prints the bare SSID + if cmd[0] == "iwgetid" and out.strip(): + return out.strip().splitlines()[0] + return "" + + def _expire_people(self) -> None: + cutoff = time.time() - self.config.person_ttl_sec + for tid in [t for t, e in self._people.items() if e.get("last_seen", 0) < cutoff]: + del self._people[tid] + self._stable_to_raw_det.pop(tid, None) + + def _people_payload(self) -> list[dict[str, Any]]: + # Note: the large `photo_full` is intentionally excluded here -- it is + # served on demand via the `get_photo` event to keep this broadcast lean. + out = [] + for tid, e in self._people.items(): + out.append( + { + "id": tid, + "x": e.get("x"), + "y": e.get("y"), + "name": e.get("name"), + "kind": e.get("kind", "person"), + "seen_by": self._label, + "photo": e.get("photo"), + "age": round(time.time() - e.get("last_seen", time.time()), 1), + } + ) + return out + + def _encode_map(self, grid: OccupancyGrid) -> dict[str, Any]: + """Render the occupancy grid to an RGBA PNG of ink walls (transparent + elsewhere) and return it plus the world->pixel mapping metadata.""" + g = np.asarray(grid.grid) + h, w = g.shape + rgba = np.zeros((h, w, 4), dtype=np.uint8) + + occupied = g >= 50 + free = (g >= 0) & (g < 50) + # Ink walls (dark sepia), faint inked floor, transparent unknown. + rgba[occupied] = (40, 30, 22, 255) + rgba[free] = (90, 70, 45, 40) + + ok, buf = cv2.imencode(".png", cv2.cvtColor(rgba, cv2.COLOR_RGBA2BGRA)) + img_b64 = base64.b64encode(buf.tobytes()).decode("utf-8") if ok else "" + + origin = grid.info.origin + return { + "img": img_b64, + "width": int(w), + "height": int(h), + "resolution": float(grid.info.resolution), + "origin": [float(origin.position.x), float(origin.position.y)], + "label": self._label, + } + + def _publish_selection(self, ident: int | None) -> None: + """Drive BBoxSelectionModule by synthesizing a camera-pixel click. + + Selecting: publish a ``PointStamped`` at the *center* of the chosen + person's bbox with ``frame_id="color_image/web_click"``. + BBoxSelectionModule's ``clicked_point`` handler accepts + any frame_id containing ``color_image``, hit-tests the + pixel against current detections, and locks onto the + matching bbox -- the exact same path Rerun camera clicks + take, with no upstream code touched. + + Deselecting: publish a ``PointStamped`` far outside any detection + box. BBoxSelectionModule's "click did not hit any bbox" + branch then clears the selection naturally. + """ + ts = time.time() + if ident is None: + # A guaranteed miss; finite numbers so the non-finite filter passes + # but the hit-test rejects -> clears selection. + click = PointStamped( + ts=ts, frame_id="color_image/web_click", x=-1.0e6, y=-1.0e6, z=0.0 + ) + try: + self.clicked_point.publish(click) + except Exception: + logger.exception("Marauder's Map: deselect publish failed") + return + + raw = self._stable_to_raw_det.get(ident) + if raw is None: + logger.info("Marauder's Map: no live detection for id=%s", ident) + return + try: + center = raw.bbox.center.position + click = PointStamped( + ts=ts, + frame_id="color_image/web_click", + x=float(center.x), + y=float(center.y), + z=0.0, + ) + self.clicked_point.publish(click) + except Exception: + logger.exception("Marauder's Map: select publish failed") + + def _emit(self, event: str, data: Any) -> None: + if ( + self.sio is not None + and self._broadcast_loop is not None + and not self._broadcast_loop.is_closed() + ): + asyncio.run_coroutine_threadsafe(self.sio.emit(event, data), self._broadcast_loop) + + +__all__ = ["ReidMapConfig", "ReidMapModule"] diff --git a/dimos/apps/marauders_map/templates/hp_characters.js b/dimos/apps/marauders_map/templates/hp_characters.js new file mode 100644 index 0000000000..939401a1cf --- /dev/null +++ b/dimos/apps/marauders_map/templates/hp_characters.js @@ -0,0 +1,1702 @@ +window.HP_CHARACTERS = [ +{ +"name": "Harry Potter", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I solemnly swear that I am up to no good.", +"I don't go looking for trouble. Trouble usually finds me.", +"Expelliarmus!", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Hermione Granger", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"It's leviOsa, not levioSA!", +"When in doubt, go to the library.", +"Books! And cleverness!", +"I read about this. Twice.", +"Logic will get us out of here.", +"Honestly, it's quite simple.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Ron Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Bloody hell.", +"She needs to sort out her priorities.", +"Why is it always me?", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"I—I'm not sure we should." +] +}, +{ +"name": "Albus Dumbledore", +"house": null, +"color": "#7a1f1f", +"quotes": [ +"Happiness can be found even in the darkest of times.", +"It is our choices that show what we truly are.", +"After all this time? Always... no, that's Severus.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Severus Snape", +"house": null, +"color": "#1f5a2f", +"quotes": [ +"Turn to page three hundred and ninety-four.", +"Always.", +"Obviously.", +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends." +] +}, +{ +"name": "Draco Malfoy", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"My father will hear about this.", +"Scared, Potter?", +"Wait till my father hears.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"How frightfully common.", +"I don't associate with your sort.", +"Mudbloods, honestly.", +"Curious. Most curious." +] +}, +{ +"name": "Luna Lovegood", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"You're just as sane as I am.", +"The Nargles are behind it.", +"I hope there's pudding.", +"The Wrackspurts are thick today.", +"I think it's perfectly lovely.", +"Reality is overrated, don't you think?", +"Things are often stranger than they seem.", +"I rather like the upside-down.", +"Do you hear them too?", +"All is as it should be." +] +}, +{ +"name": "Rubeus Hagrid", +"house": null, +"color": "#7a1f1f", +"quotes": [ +"Yer a wizard, Harry.", +"I should not have said that.", +"What's comin' will come.", +"There, there, no harm done.", +"Be kind, it costs nothing.", +"Easy now, easy.", +"Misunderstood, they are, not monsters.", +"Treat 'em right an' they'll treat you right.", +"Beautiful creature, ain't she?", +"Curious. Most curious." +] +}, +{ +"name": "Minerva McGonagall", +"house": null, +"color": "#7a1f1f", +"quotes": [ +"Why is it always you three?", +"Have a biscuit, Potter.", +"Hooligans!", +"Rules exist for a reason.", +"I expect nothing less than excellence.", +"Detention. Now.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Curious. Most curious." +] +}, +{ +"name": "Sirius Black", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"The ones that love us never really leave us.", +"We've all got both light and dark inside us.", +"I did my waiting! Twelve years of it!", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Curious. Most curious." +] +}, +{ +"name": "Lord Voldemort", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"There is no good and evil, only power.", +"Harry Potter, the boy who lived.", +"I can touch you now.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious." +] +}, +{ +"name": "Dobby", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Dobby is free.", +"Dobby has no master!", +"Harry Potter must not go back to Hogwarts!", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Curious. Most curious." +] +}, +{ +"name": "Neville Longbottom", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I—I'm not sure we should.", +"Please don't make me.", +"Oh dear, oh dear.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"Curious. Most curious." +] +}, +{ +"name": "Bellatrix Lestrange", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Hee hee hee!", +"Shall I do it slowly?", +"The Dark Lord is RISING!", +"Crucio is such fun.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"The Dark Lord rewards the faithful.", +"I'd die for him, gladly.", +"Blood traitors deserve nothing." +] +}, +{ +"name": "Ginny Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Try me and find out.", +"I fight my own battles.", +"Underestimate me again.", +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Fred Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Care for a Skiving Snackbox?", +"We'd like to test it, purely for science.", +"Fancy a Canary Cream?", +"Honestly, mate, lighten up.", +"Boom. You're welcome.", +"Honestly, lighten up.", +"We're the good kind of trouble.", +"Pull my wand, go on.", +"I'll not back down, not today.", +"Point me at the danger." +] +}, +{ +"name": "George Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Care for a Skiving Snackbox?", +"We'd like to test it, purely for science.", +"Fancy a Canary Cream?", +"Honestly, mate, lighten up.", +"Boom. You're welcome.", +"Honestly, lighten up.", +"We're the good kind of trouble.", +"Pull my wand, go on.", +"I read about this. Twice.", +"Logic will get us out of here." +] +}, +{ +"name": "Remus Lupin", +"house": null, +"color": "#7a1f1f", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious." +] +}, +{ +"name": "Nymphadora Tonks", +"house": "H", +"color": "#7a1f1f", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Curious. Most curious." +] +}, +{ +"name": "Alastor Moody", +"house": null, +"color": "#3a5a3a", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Kingsley Shacklebolt", +"house": null, +"color": "#3a5a3a", +"quotes": [ +"Hmph.", +"I do not waste words.", +"We shall see.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Arthur Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Molly Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Come in, come in, you're freezing.", +"There's always room for one more.", +"Family is everything.", +"Try me and find out.", +"I fight my own battles.", +"Underestimate me again.", +"You'll eat something first.", +"NOT MY CHILDREN!", +"Wear a scarf, it's cold.", +"Sit, I'll mend it." +] +}, +{ +"name": "Percy Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I must insist on proper procedure.", +"As Head Boy, I cannot condone this.", +"Standards, people, standards.", +"I take this very seriously.", +"You can count on me.", +"I won't let you down.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Bill Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Hmph.", +"I do not waste words.", +"We shall see.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious." +] +}, +{ +"name": "Charlie Weasley", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Cedric Diggory", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Rules exist for a reason.", +"I expect nothing less than excellence.", +"Detention. Now.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Cho Chang", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Cornelius Fudge", +"house": null, +"color": "#5a4a6a", +"quotes": [ +"Have you seen my smile?", +"Fame, my dear, is a full-time job.", +"Do get my good side.", +"I—I'm not sure we should.", +"Please don't make me.", +"Oh dear, oh dear.", +"The Ministry has it well in hand.", +"Let us not cause a panic.", +"Procedure must be followed.", +"Curious. Most curious." +] +}, +{ +"name": "Dolores Umbridge", +"house": null, +"color": "#5a4a6a", +"quotes": [ +"I will have order!", +"Naughty children deserve to be punished.", +"Hem, hem.", +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Curious. Most curious." +] +}, +{ +"name": "Gilderoy Lockhart", +"house": null, +"color": "#2f4a8a", +"quotes": [ +"Have you seen my smile?", +"Fame, my dear, is a full-time job.", +"Do get my good side.", +"Fame is a full-time job.", +"Allow me to demonstrate. Carefully.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go." +] +}, +{ +"name": "Sybill Trelawney", +"house": null, +"color": "#2f4a8a", +"quotes": [ +"I sense a disturbance...", +"The Grim! I see the Grim!", +"Your aura is most troubled today.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Filius Flitwick", +"house": null, +"color": "#2f4a8a", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Pomona Sprout", +"house": null, +"color": "#9a7a1f", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Horace Slughorn", +"house": null, +"color": "#1f5a2f", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"How frightfully common.", +"I don't associate with your sort.", +"Beneath me, all of it.", +"Hmph.", +"I do not waste words.", +"We shall see.", +"Curious. Most curious." +] +}, +{ +"name": "Argus Filch", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Students. Filth, the lot.", +"I'll have you in chains.", +"No running in the corridors!", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Madam Pomfrey", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Drink this, all of it.", +"Out of my hospital wing!", +"You'll live. Probably.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Lucius Malfoy", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"How frightfully common.", +"I don't associate with your sort.", +"Beneath me, all of it.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Narcissa Malfoy", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"You'll eat something first.", +"NOT MY CHILDREN!", +"Wear a scarf, it's cold.", +"Sit, I'll mend it.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Peter Pettigrew", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I—I'm not sure we should.", +"Please don't make me.", +"Oh dear, oh dear.", +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "James Potter", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"Curious. Most curious." +] +}, +{ +"name": "Lily Potter", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Fleur Delacour", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Viktor Krum", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"Hmph.", +"I do not waste words.", +"We shall see.", +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Dean Thomas", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"Things are often stranger than they seem.", +"I rather like the upside-down.", +"Do you hear them too?", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"Curious. Most curious." +] +}, +{ +"name": "Seamus Finnigan", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Lavender Brown", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I sense a disturbance...", +"The Grim! I see the Grim!", +"Your aura is most troubled today.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Parvati Patil", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"I sense a disturbance...", +"The Grim! I see the Grim!", +"Your aura is most troubled today.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Padma Patil", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"I read about this. Twice.", +"Logic will get us out of here.", +"Honestly, it's quite simple.", +"Hmph.", +"I do not waste words.", +"We shall see.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Oliver Wood", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I take this very seriously.", +"You can count on me.", +"I won't let you down.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Angelina Johnson", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"I take this very seriously.", +"You can count on me.", +"I won't let you down.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Katie Bell", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Try me and find out.", +"I fight my own battles.", +"Underestimate me again.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Lee Jordan", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Colin Creevey", +"house": "G", +"color": "#7a1f1f", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Ernie Macmillan", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"I must insist on proper procedure.", +"As Head Boy, I cannot condone this.", +"Standards, people, standards.", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Curious. Most curious." +] +}, +{ +"name": "Hannah Abbott", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"There, there, no harm done.", +"Be kind, it costs nothing.", +"Easy now, easy.", +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"There's good in everyone, mostly.", +"Let me help you up.", +"Curious. Most curious.", +"We'll see about that." +] +}, +{ +"name": "Justin Finch-Fletchley", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"I take this very seriously.", +"You can count on me.", +"I won't let you down.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Susan Bones", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"Rules exist for a reason.", +"I expect nothing less than excellence.", +"Detention. Now.", +"Curious. Most curious." +] +}, +{ +"name": "Zacharias Smith", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Terry Boot", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"I read about this. Twice.", +"Logic will get us out of here.", +"Honestly, it's quite simple.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Michael Corner", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"I read about this. Twice.", +"Logic will get us out of here.", +"Honestly, it's quite simple.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Marcus Flint", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Vincent Crabbe", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Gregory Goyle", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Pansy Parkinson", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace." +] +}, +{ +"name": "Blaise Zabini", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Hmph.", +"I do not waste words.", +"We shall see.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Theodore Nott", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"I read about this. Twice.", +"Logic will get us out of here.", +"Honestly, it's quite simple.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious." +] +}, +{ +"name": "Fenrir Greyback", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Antonin Dolohov", +"house": "S", +"color": "#2a2a3a", +"quotes": [ +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Rodolphus Lestrange", +"house": "S", +"color": "#2a2a3a", +"quotes": [ +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Rabastan Lestrange", +"house": "S", +"color": "#2a2a3a", +"quotes": [ +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Walden Macnair", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"You will regret crossing me.", +"Beg. It changes nothing.", +"Pain is such a fine teacher.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Barty Crouch Jr", +"house": "S", +"color": "#2a2a3a", +"quotes": [ +"Hee hee hee!", +"Shall I do it slowly?", +"The Dark Lord is RISING!", +"Crucio is such fun.", +"The Dark Lord rewards the faithful.", +"I'd die for him, gladly.", +"Blood traitors deserve nothing.", +"All hail the Dark Lord.", +"I read about this. Twice.", +"Logic will get us out of here." +] +}, +{ +"name": "Barty Crouch Sr", +"house": null, +"color": "#5a4a6a", +"quotes": [ +"I must insist on proper procedure.", +"As Head Boy, I cannot condone this.", +"Standards, people, standards.", +"How frightfully common.", +"I don't associate with your sort.", +"Beneath me, all of it.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Igor Karkaroff", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"I—I'm not sure we should.", +"Please don't make me.", +"Oh dear, oh dear.", +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Madame Maxime", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"There, there, no harm done.", +"Be kind, it costs nothing.", +"Easy now, easy.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Grawp", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"There, there, no harm done.", +"Be kind, it costs nothing.", +"Easy now, easy.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Firenze", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Mars is bright tonight.", +"The heavens do not lie.", +"We do not meddle in human folly.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Bane", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Mars is bright tonight.", +"The heavens do not lie.", +"We do not meddle in human folly.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Aragog", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Buckbeak", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Fawkes", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"All is as it should be.", +"I'm not worried, are you?", +"Everything floats, eventually.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Kreacher", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace." +] +}, +{ +"name": "Winky", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Griphook", +"house": null, +"color": "#5a3a1a", +"quotes": [ +"Gold remembers its owner.", +"Wizards. Always cheating.", +"A deal is a deal.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Ollivander", +"house": "R", +"color": "#4a4a4a", +"quotes": [ +"The wand chooses the wizard.", +"Curious... very curious.", +"I remember every wand I sold.", +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Mr Borgin", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish." +] +}, +{ +"name": "Tom the Innkeeper", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Be kind, it costs nothing.", +"There's good in everyone, mostly.", +"Let me help you up.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Stan Shunpike", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Well, this is going splendidly.", +"Mischief, as ever, managed.", +"Hold my butterbeer.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Ernie Prang", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Things are often stranger than they seem.", +"I rather like the upside-down.", +"Do you hear them too?", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Rita Skeeter", +"house": "S", +"color": "#4a4a4a", +"quotes": [ +"Care to comment, dear?", +"My readers will adore this.", +"A quick quote, just a teensy one.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Xenophilius Lovegood", +"house": "R", +"color": "#2f4a8a", +"quotes": [ +"Things are often stranger than they seem.", +"I rather like the upside-down.", +"Do you hear them too?", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Mad Auntie Muriel", +"house": null, +"color": "#7a1f1f", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"How original.", +"Charming. Truly.", +"Was that meant to impress?", +"Pathetic.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can." +] +}, +{ +"name": "Andromeda Tonks", +"house": "S", +"color": "#7a1f1f", +"quotes": [ +"Come in, come in, you're freezing.", +"There's always room for one more.", +"Family is everything.", +"Do you know who I am?", +"I do not lower myself for anyone.", +"Mediocrity offends me.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Regulus Black", +"house": "S", +"color": "#1f5a2f", +"quotes": [ +"I'll not back down, not today.", +"Point me at the danger.", +"Fear is just the start of courage.", +"I'm getting too old for this.", +"Another full moon, another scar.", +"Rest, while you can.", +"Hmph.", +"I do not waste words.", +"We shall see.", +"Curious. Most curious." +] +}, +{ +"name": "Aberforth Dumbledore", +"house": "G", +"color": "#3a5a3a", +"quotes": [ +"Bah.", +"Speak plainly or not at all.", +"Don't make me repeat myself.", +"I'm with you, to the end.", +"You'll not face it alone.", +"I don't abandon my friends.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Gellert Grindelwald", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"For the greater good.", +"The old ways must end.", +"Power belongs to the bold.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +}, +{ +"name": "Newt Scamander", +"house": "H", +"color": "#9a7a1f", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"There, there, no harm done.", +"Be kind, it costs nothing.", +"Easy now, easy.", +"Misunderstood, they are, not monsters.", +"Treat 'em right an' they'll treat you right.", +"Beautiful creature, ain't she?", +"Curious. Most curious." +] +}, +{ +"name": "Nicolas Flamel", +"house": null, +"color": "#4a4a4a", +"quotes": [ +"Patience reveals what haste hides.", +"Every choice carves the soul.", +"Listen more than you speak.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets.", +"Mind how you go.", +"Magic always leaves a trace.", +"Trouble follows the unwary." +] +}, +{ +"name": "Quirinus Quirrell", +"house": null, +"color": "#2a2a3a", +"quotes": [ +"Oh—sorry, didn't mean to—", +"I'd rather not, if that's alright.", +"They're really quite friendly, I promise.", +"The Dark Lord rewards the faithful.", +"Light makes such weak men.", +"Soon, none will oppose us.", +"Curious. Most curious.", +"We'll see about that.", +"As you wish.", +"The castle keeps its secrets." +] +} +]; diff --git a/dimos/apps/marauders_map/templates/icons/gryffindor.png b/dimos/apps/marauders_map/templates/icons/gryffindor.png new file mode 100644 index 0000000000..af64fa44ba --- /dev/null +++ b/dimos/apps/marauders_map/templates/icons/gryffindor.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5e1d8aea79d363e473581b05473a819261f202d016f4f7d14008177c2d079dac +size 3921 diff --git a/dimos/apps/marauders_map/templates/icons/hufflepuff.png b/dimos/apps/marauders_map/templates/icons/hufflepuff.png new file mode 100644 index 0000000000..214e439892 --- /dev/null +++ b/dimos/apps/marauders_map/templates/icons/hufflepuff.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:dc8c8e95ce01e06ba83e37e1f34bead9bb5116988725306f76fa8efb69fe77e9 +size 3803 diff --git a/dimos/apps/marauders_map/templates/icons/ravenclaw.png b/dimos/apps/marauders_map/templates/icons/ravenclaw.png new file mode 100644 index 0000000000..59adcf5432 --- /dev/null +++ b/dimos/apps/marauders_map/templates/icons/ravenclaw.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:887f432c99af28a6719ffc0d53993e7f099e922b20cb6e5698a3fb354b4e24a0 +size 3763 diff --git a/dimos/apps/marauders_map/templates/icons/slytherin.png b/dimos/apps/marauders_map/templates/icons/slytherin.png new file mode 100644 index 0000000000..2567fe83fa --- /dev/null +++ b/dimos/apps/marauders_map/templates/icons/slytherin.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:3577f17193e01e4851dea9d9b57c86160be6f62b6342aa262743ff8396061896 +size 3719 diff --git a/dimos/apps/marauders_map/templates/marauders_map.html b/dimos/apps/marauders_map/templates/marauders_map.html new file mode 100644 index 0000000000..26943a9fb1 --- /dev/null +++ b/dimos/apps/marauders_map/templates/marauders_map.html @@ -0,0 +1,1826 @@ + + + + + + The Marauder's Map · DimOS + + + + +
+ +
+
The Wrath of Filch
+
A DimOS × Hogwarts Mischief
+
+
+ Night has fallen at Hogwarts. The torches gutter low. Curfew has come. +
+
+ And yet — across the castle, students still scroll their phones beneath + the covers, blue light flickering in the dorms. +
+
+ Argus Filch, hateful caretaker, prowls the corridors. Any phone he + finds is gone forever — locked away in his confiscation drawer beside + the Skiving Snackboxes and Extendable Ears. +
+
+ Tonight, a Unitree Go2 is your familiar. Send the hound to each soul + before Filch arrives. Snatch their phones. Hide them. Save the night. +
+
+

Win Conditions

+
    +
  • Spot every student who appears on the parchment.
  • +
  • Click their face — send the hound to hide their phone.
  • +
  • Beat Filch to all of them before the bell tolls dawn.
  • +
+
+
+ + +
+
+
+ +
+
+

The Marauder’s Map

+
Messrs DimOS solemnly swear they are up to good
+
summoning…
+
+
+ + + +
+
+ + +
— the map is blank until footprints are spied —
+ +
+
The Wrath of Filch
+
Beat Filch — hide every student's phone.
+
    +
  • Spot every student on the parchment.
  • +
  • Click their face — send the hound.
  • +
  • Get to all of them before dawn.
  • +
+
+ +
+ tracking — + release +
+ +
+ +
+ + + + + + + +
+
+ +
+
+
+ +
+
+

Blimey!

+
A spot of mischief afoot — set the hound upon…
+
+
Speak now, lest the parchment lose its patience.
+
+ + +
+
Enter to swear · Esc to recant
+
+
+ + + + + + + + + + + + + + + diff --git a/dimos/apps/marauders_map/templates/potion_latch.mp3 b/dimos/apps/marauders_map/templates/potion_latch.mp3 new file mode 100644 index 0000000000..73613947ed Binary files /dev/null and b/dimos/apps/marauders_map/templates/potion_latch.mp3 differ diff --git a/dimos/apps/marauders_map/templates/socketio.min.js b/dimos/apps/marauders_map/templates/socketio.min.js new file mode 100644 index 0000000000..d6b2d60110 --- /dev/null +++ b/dimos/apps/marauders_map/templates/socketio.min.js @@ -0,0 +1,7 @@ +/*! + * Socket.IO v4.7.5 + * (c) 2014-2024 Guillermo Rauch + * Released under the MIT License. + */ +!function(e,t){"object"==typeof exports&&"undefined"!=typeof module?module.exports=t():"function"==typeof define&&define.amd?define(t):(e="undefined"!=typeof globalThis?globalThis:e||self).io=t()}(this,(function(){"use strict";function e(t){return e="function"==typeof Symbol&&"symbol"==typeof Symbol.iterator?function(e){return typeof e}:function(e){return e&&"function"==typeof Symbol&&e.constructor===Symbol&&e!==Symbol.prototype?"symbol":typeof e},e(t)}function t(e,t){if(!(e instanceof t))throw new TypeError("Cannot call a class as a function")}function n(e,t){for(var n=0;ne.length)&&(t=e.length);for(var n=0,r=new Array(t);n=e.length?{done:!0}:{done:!1,value:e[r++]}},e:function(e){throw e},f:i}}throw new TypeError("Invalid attempt to iterate non-iterable instance.\nIn order to be iterable, non-array objects must have a [Symbol.iterator]() method.")}var o,s=!0,a=!1;return{s:function(){n=n.call(e)},n:function(){var e=n.next();return s=e.done,e},e:function(e){a=!0,o=e},f:function(){try{s||null==n.return||n.return()}finally{if(a)throw o}}}}var v=Object.create(null);v.open="0",v.close="1",v.ping="2",v.pong="3",v.message="4",v.upgrade="5",v.noop="6";var g=Object.create(null);Object.keys(v).forEach((function(e){g[v[e]]=e}));var m,b={type:"error",data:"parser error"},k="function"==typeof Blob||"undefined"!=typeof Blob&&"[object BlobConstructor]"===Object.prototype.toString.call(Blob),w="function"==typeof ArrayBuffer,_=function(e){return"function"==typeof ArrayBuffer.isView?ArrayBuffer.isView(e):e&&e.buffer instanceof ArrayBuffer},E=function(e,t,n){var r=e.type,i=e.data;return k&&i instanceof Blob?t?n(i):A(i,n):w&&(i instanceof ArrayBuffer||_(i))?t?n(i):A(new Blob([i]),n):n(v[r]+(i||""))},A=function(e,t){var n=new FileReader;return n.onload=function(){var e=n.result.split(",")[1];t("b"+(e||""))},n.readAsDataURL(e)};function O(e){return e instanceof Uint8Array?e:e instanceof ArrayBuffer?new Uint8Array(e):new Uint8Array(e.buffer,e.byteOffset,e.byteLength)}for(var T="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",R="undefined"==typeof Uint8Array?[]:new Uint8Array(256),C=0;C<64;C++)R[T.charCodeAt(C)]=C;var B,S="function"==typeof ArrayBuffer,N=function(e,t){if("string"!=typeof e)return{type:"message",data:x(e,t)};var n=e.charAt(0);return"b"===n?{type:"message",data:L(e.substring(1),t)}:g[n]?e.length>1?{type:g[n],data:e.substring(1)}:{type:g[n]}:b},L=function(e,t){if(S){var n=function(e){var t,n,r,i,o,s=.75*e.length,a=e.length,c=0;"="===e[e.length-1]&&(s--,"="===e[e.length-2]&&s--);var u=new ArrayBuffer(s),h=new Uint8Array(u);for(t=0;t>4,h[c++]=(15&r)<<4|i>>2,h[c++]=(3&i)<<6|63&o;return u}(e);return x(n,t)}return{base64:!0,data:e}},x=function(e,t){return"blob"===t?e instanceof Blob?e:new Blob([e]):e instanceof ArrayBuffer?e:e.buffer},P=String.fromCharCode(30);function j(){return new TransformStream({transform:function(e,t){!function(e,t){k&&e.data instanceof Blob?e.data.arrayBuffer().then(O).then(t):w&&(e.data instanceof ArrayBuffer||_(e.data))?t(O(e.data)):E(e,!1,(function(e){m||(m=new TextEncoder),t(m.encode(e))}))}(e,(function(n){var r,i=n.length;if(i<126)r=new Uint8Array(1),new DataView(r.buffer).setUint8(0,i);else if(i<65536){r=new Uint8Array(3);var o=new DataView(r.buffer);o.setUint8(0,126),o.setUint16(1,i)}else{r=new Uint8Array(9);var s=new DataView(r.buffer);s.setUint8(0,127),s.setBigUint64(1,BigInt(i))}e.data&&"string"!=typeof e.data&&(r[0]|=128),t.enqueue(r),t.enqueue(n)}))}})}function q(e){return e.reduce((function(e,t){return e+t.length}),0)}function D(e,t){if(e[0].length===t)return e.shift();for(var n=new Uint8Array(t),r=0,i=0;i1?t-1:0),r=1;r1&&void 0!==arguments[1]?arguments[1]:{};return e+"://"+this._hostname()+this._port()+this.opts.path+this._query(t)}},{key:"_hostname",value:function(){var e=this.opts.hostname;return-1===e.indexOf(":")?e:"["+e+"]"}},{key:"_port",value:function(){return this.opts.port&&(this.opts.secure&&Number(443!==this.opts.port)||!this.opts.secure&&80!==Number(this.opts.port))?":"+this.opts.port:""}},{key:"_query",value:function(e){var t=function(e){var t="";for(var n in e)e.hasOwnProperty(n)&&(t.length&&(t+="&"),t+=encodeURIComponent(n)+"="+encodeURIComponent(e[n]));return t}(e);return t.length?"?"+t:""}}]),i}(U),z="0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-_".split(""),J=64,$={},Q=0,X=0;function G(e){var t="";do{t=z[e%J]+t,e=Math.floor(e/J)}while(e>0);return t}function Z(){var e=G(+new Date);return e!==K?(Q=0,K=e):e+"."+G(Q++)}for(;X0&&void 0!==arguments[0]?arguments[0]:{};return i(e,{xd:this.xd,cookieJar:this.cookieJar},this.opts),new se(this.uri(),e)}},{key:"doWrite",value:function(e,t){var n=this,r=this.request({method:"POST",data:e});r.on("success",t),r.on("error",(function(e,t){n.onError("xhr post error",e,t)}))}},{key:"doPoll",value:function(){var e=this,t=this.request();t.on("data",this.onData.bind(this)),t.on("error",(function(t,n){e.onError("xhr poll error",t,n)})),this.pollXhr=t}}]),s}(W),se=function(e){o(i,e);var n=l(i);function i(e,r){var o;return t(this,i),H(f(o=n.call(this)),r),o.opts=r,o.method=r.method||"GET",o.uri=e,o.data=void 0!==r.data?r.data:null,o.create(),o}return r(i,[{key:"create",value:function(){var e,t=this,n=F(this.opts,"agent","pfx","key","passphrase","cert","ca","ciphers","rejectUnauthorized","autoUnref");n.xdomain=!!this.opts.xd;var r=this.xhr=new ne(n);try{r.open(this.method,this.uri,!0);try{if(this.opts.extraHeaders)for(var o in r.setDisableHeaderCheck&&r.setDisableHeaderCheck(!0),this.opts.extraHeaders)this.opts.extraHeaders.hasOwnProperty(o)&&r.setRequestHeader(o,this.opts.extraHeaders[o])}catch(e){}if("POST"===this.method)try{r.setRequestHeader("Content-type","text/plain;charset=UTF-8")}catch(e){}try{r.setRequestHeader("Accept","*/*")}catch(e){}null===(e=this.opts.cookieJar)||void 0===e||e.addCookies(r),"withCredentials"in r&&(r.withCredentials=this.opts.withCredentials),this.opts.requestTimeout&&(r.timeout=this.opts.requestTimeout),r.onreadystatechange=function(){var e;3===r.readyState&&(null===(e=t.opts.cookieJar)||void 0===e||e.parseCookies(r)),4===r.readyState&&(200===r.status||1223===r.status?t.onLoad():t.setTimeoutFn((function(){t.onError("number"==typeof r.status?r.status:0)}),0))},r.send(this.data)}catch(e){return void this.setTimeoutFn((function(){t.onError(e)}),0)}"undefined"!=typeof document&&(this.index=i.requestsCount++,i.requests[this.index]=this)}},{key:"onError",value:function(e){this.emitReserved("error",e,this.xhr),this.cleanup(!0)}},{key:"cleanup",value:function(e){if(void 0!==this.xhr&&null!==this.xhr){if(this.xhr.onreadystatechange=re,e)try{this.xhr.abort()}catch(e){}"undefined"!=typeof document&&delete i.requests[this.index],this.xhr=null}}},{key:"onLoad",value:function(){var e=this.xhr.responseText;null!==e&&(this.emitReserved("data",e),this.emitReserved("success"),this.cleanup())}},{key:"abort",value:function(){this.cleanup()}}]),i}(U);if(se.requestsCount=0,se.requests={},"undefined"!=typeof document)if("function"==typeof attachEvent)attachEvent("onunload",ae);else if("function"==typeof addEventListener){addEventListener("onpagehide"in I?"pagehide":"unload",ae,!1)}function ae(){for(var e in se.requests)se.requests.hasOwnProperty(e)&&se.requests[e].abort()}var ce="function"==typeof Promise&&"function"==typeof Promise.resolve?function(e){return Promise.resolve().then(e)}:function(e,t){return t(e,0)},ue=I.WebSocket||I.MozWebSocket,he="undefined"!=typeof navigator&&"string"==typeof navigator.product&&"reactnative"===navigator.product.toLowerCase(),fe=function(e){o(i,e);var n=l(i);function i(e){var r;return t(this,i),(r=n.call(this,e)).supportsBinary=!e.forceBase64,r}return r(i,[{key:"name",get:function(){return"websocket"}},{key:"doOpen",value:function(){if(this.check()){var e=this.uri(),t=this.opts.protocols,n=he?{}:F(this.opts,"agent","perMessageDeflate","pfx","key","passphrase","cert","ca","ciphers","rejectUnauthorized","localAddress","protocolVersion","origin","maxPayload","family","checkServerIdentity");this.opts.extraHeaders&&(n.headers=this.opts.extraHeaders);try{this.ws=he?new ue(e,t,n):t?new ue(e,t):new ue(e)}catch(e){return this.emitReserved("error",e)}this.ws.binaryType=this.socket.binaryType,this.addEventListeners()}}},{key:"addEventListeners",value:function(){var e=this;this.ws.onopen=function(){e.opts.autoUnref&&e.ws._socket.unref(),e.onOpen()},this.ws.onclose=function(t){return e.onClose({description:"websocket connection closed",context:t})},this.ws.onmessage=function(t){return e.onData(t.data)},this.ws.onerror=function(t){return e.onError("websocket error",t)}}},{key:"write",value:function(e){var t=this;this.writable=!1;for(var n=function(){var n=e[r],i=r===e.length-1;E(n,t.supportsBinary,(function(e){try{t.ws.send(e)}catch(e){}i&&ce((function(){t.writable=!0,t.emitReserved("drain")}),t.setTimeoutFn)}))},r=0;rMath.pow(2,21)-1){a.enqueue(b);break}i=l*Math.pow(2,32)+f.getUint32(4),r=3}else{if(q(n)e){a.enqueue(b);break}}}})}(Number.MAX_SAFE_INTEGER,e.socket.binaryType),r=t.readable.pipeThrough(n).getReader(),i=j();i.readable.pipeTo(t.writable),e.writer=i.writable.getWriter();!function t(){r.read().then((function(n){var r=n.done,i=n.value;r||(e.onPacket(i),t())})).catch((function(e){}))}();var o={type:"open"};e.query.sid&&(o.data='{"sid":"'.concat(e.query.sid,'"}')),e.writer.write(o).then((function(){return e.onOpen()}))}))})))}},{key:"write",value:function(e){var t=this;this.writable=!1;for(var n=function(){var n=e[r],i=r===e.length-1;t.writer.write(n).then((function(){i&&ce((function(){t.writable=!0,t.emitReserved("drain")}),t.setTimeoutFn)}))},r=0;r1&&void 0!==arguments[1]?arguments[1]:{};return t(this,a),(r=s.call(this)).binaryType="arraybuffer",r.writeBuffer=[],n&&"object"===e(n)&&(o=n,n=null),n?(n=ve(n),o.hostname=n.host,o.secure="https"===n.protocol||"wss"===n.protocol,o.port=n.port,n.query&&(o.query=n.query)):o.host&&(o.hostname=ve(o.host).host),H(f(r),o),r.secure=null!=o.secure?o.secure:"undefined"!=typeof location&&"https:"===location.protocol,o.hostname&&!o.port&&(o.port=r.secure?"443":"80"),r.hostname=o.hostname||("undefined"!=typeof location?location.hostname:"localhost"),r.port=o.port||("undefined"!=typeof location&&location.port?location.port:r.secure?"443":"80"),r.transports=o.transports||["polling","websocket","webtransport"],r.writeBuffer=[],r.prevBufferLen=0,r.opts=i({path:"/engine.io",agent:!1,withCredentials:!1,upgrade:!0,timestampParam:"t",rememberUpgrade:!1,addTrailingSlash:!0,rejectUnauthorized:!0,perMessageDeflate:{threshold:1024},transportOptions:{},closeOnBeforeunload:!1},o),r.opts.path=r.opts.path.replace(/\/$/,"")+(r.opts.addTrailingSlash?"/":""),"string"==typeof r.opts.query&&(r.opts.query=function(e){for(var t={},n=e.split("&"),r=0,i=n.length;r1))return this.writeBuffer;for(var e,t=1,n=0;n=57344?n+=3:(r++,n+=4);return n}(e):Math.ceil(1.33*(e.byteLength||e.size))),n>0&&t>this.maxPayload)return this.writeBuffer.slice(0,n);t+=2}return this.writeBuffer}},{key:"write",value:function(e,t,n){return this.sendPacket("message",e,t,n),this}},{key:"send",value:function(e,t,n){return this.sendPacket("message",e,t,n),this}},{key:"sendPacket",value:function(e,t,n,r){if("function"==typeof t&&(r=t,t=void 0),"function"==typeof n&&(r=n,n=null),"closing"!==this.readyState&&"closed"!==this.readyState){(n=n||{}).compress=!1!==n.compress;var i={type:e,data:t,options:n};this.emitReserved("packetCreate",i),this.writeBuffer.push(i),r&&this.once("flush",r),this.flush()}}},{key:"close",value:function(){var e=this,t=function(){e.onClose("forced close"),e.transport.close()},n=function n(){e.off("upgrade",n),e.off("upgradeError",n),t()},r=function(){e.once("upgrade",n),e.once("upgradeError",n)};return"opening"!==this.readyState&&"open"!==this.readyState||(this.readyState="closing",this.writeBuffer.length?this.once("drain",(function(){e.upgrading?r():t()})):this.upgrading?r():t()),this}},{key:"onError",value:function(e){a.priorWebsocketSuccess=!1,this.emitReserved("error",e),this.onClose("transport error",e)}},{key:"onClose",value:function(e,t){"opening"!==this.readyState&&"open"!==this.readyState&&"closing"!==this.readyState||(this.clearTimeoutFn(this.pingTimeoutTimer),this.transport.removeAllListeners("close"),this.transport.close(),this.transport.removeAllListeners(),"function"==typeof removeEventListener&&(removeEventListener("beforeunload",this.beforeunloadEventListener,!1),removeEventListener("offline",this.offlineEventListener,!1)),this.readyState="closed",this.id=null,this.emitReserved("close",e,t),this.writeBuffer=[],this.prevBufferLen=0)}},{key:"filterUpgrades",value:function(e){for(var t=[],n=0,r=e.length;n=0&&t.num1?t-1:0),r=1;r1?n-1:0),i=1;in._opts.retries&&(n._queue.shift(),t&&t(e));else if(n._queue.shift(),t){for(var i=arguments.length,o=new Array(i>1?i-1:0),s=1;s0&&void 0!==arguments[0]&&arguments[0];if(this.connected&&0!==this._queue.length){var t=this._queue[0];t.pending&&!e||(t.pending=!0,t.tryCount++,this.flags=t.flags,this.emit.apply(this,t.args))}}},{key:"packet",value:function(e){e.nsp=this.nsp,this.io._packet(e)}},{key:"onopen",value:function(){var e=this;"function"==typeof this.auth?this.auth((function(t){e._sendConnectPacket(t)})):this._sendConnectPacket(this.auth)}},{key:"_sendConnectPacket",value:function(e){this.packet({type:Be.CONNECT,data:this._pid?i({pid:this._pid,offset:this._lastOffset},e):e})}},{key:"onerror",value:function(e){this.connected||this.emitReserved("connect_error",e)}},{key:"onclose",value:function(e,t){this.connected=!1,delete this.id,this.emitReserved("disconnect",e,t),this._clearAcks()}},{key:"_clearAcks",value:function(){var e=this;Object.keys(this.acks).forEach((function(t){if(!e.sendBuffer.some((function(e){return String(e.id)===t}))){var n=e.acks[t];delete e.acks[t],n.withError&&n.call(e,new Error("socket has been disconnected"))}}))}},{key:"onpacket",value:function(e){if(e.nsp===this.nsp)switch(e.type){case Be.CONNECT:e.data&&e.data.sid?this.onconnect(e.data.sid,e.data.pid):this.emitReserved("connect_error",new Error("It seems you are trying to reach a Socket.IO server in v2.x with a v3.x client, but they are not compatible (more information here: https://socket.io/docs/v3/migrating-from-2-x-to-3-0/)"));break;case Be.EVENT:case Be.BINARY_EVENT:this.onevent(e);break;case Be.ACK:case Be.BINARY_ACK:this.onack(e);break;case Be.DISCONNECT:this.ondisconnect();break;case Be.CONNECT_ERROR:this.destroy();var t=new Error(e.data.message);t.data=e.data.data,this.emitReserved("connect_error",t)}}},{key:"onevent",value:function(e){var t=e.data||[];null!=e.id&&t.push(this.ack(e.id)),this.connected?this.emitEvent(t):this.receiveBuffer.push(Object.freeze(t))}},{key:"emitEvent",value:function(e){if(this._anyListeners&&this._anyListeners.length){var t,n=y(this._anyListeners.slice());try{for(n.s();!(t=n.n()).done;){t.value.apply(this,e)}}catch(e){n.e(e)}finally{n.f()}}p(s(a.prototype),"emit",this).apply(this,e),this._pid&&e.length&&"string"==typeof e[e.length-1]&&(this._lastOffset=e[e.length-1])}},{key:"ack",value:function(e){var t=this,n=!1;return function(){if(!n){n=!0;for(var r=arguments.length,i=new Array(r),o=0;o0&&e.jitter<=1?e.jitter:0,this.attempts=0}Ie.prototype.duration=function(){var e=this.ms*Math.pow(this.factor,this.attempts++);if(this.jitter){var t=Math.random(),n=Math.floor(t*this.jitter*e);e=0==(1&Math.floor(10*t))?e-n:e+n}return 0|Math.min(e,this.max)},Ie.prototype.reset=function(){this.attempts=0},Ie.prototype.setMin=function(e){this.ms=e},Ie.prototype.setMax=function(e){this.max=e},Ie.prototype.setJitter=function(e){this.jitter=e};var Fe=function(n){o(s,n);var i=l(s);function s(n,r){var o,a;t(this,s),(o=i.call(this)).nsps={},o.subs=[],n&&"object"===e(n)&&(r=n,n=void 0),(r=r||{}).path=r.path||"/socket.io",o.opts=r,H(f(o),r),o.reconnection(!1!==r.reconnection),o.reconnectionAttempts(r.reconnectionAttempts||1/0),o.reconnectionDelay(r.reconnectionDelay||1e3),o.reconnectionDelayMax(r.reconnectionDelayMax||5e3),o.randomizationFactor(null!==(a=r.randomizationFactor)&&void 0!==a?a:.5),o.backoff=new Ie({min:o.reconnectionDelay(),max:o.reconnectionDelayMax(),jitter:o.randomizationFactor()}),o.timeout(null==r.timeout?2e4:r.timeout),o._readyState="closed",o.uri=n;var c=r.parser||je;return o.encoder=new c.Encoder,o.decoder=new c.Decoder,o._autoConnect=!1!==r.autoConnect,o._autoConnect&&o.open(),o}return r(s,[{key:"reconnection",value:function(e){return arguments.length?(this._reconnection=!!e,this):this._reconnection}},{key:"reconnectionAttempts",value:function(e){return void 0===e?this._reconnectionAttempts:(this._reconnectionAttempts=e,this)}},{key:"reconnectionDelay",value:function(e){var t;return void 0===e?this._reconnectionDelay:(this._reconnectionDelay=e,null===(t=this.backoff)||void 0===t||t.setMin(e),this)}},{key:"randomizationFactor",value:function(e){var t;return void 0===e?this._randomizationFactor:(this._randomizationFactor=e,null===(t=this.backoff)||void 0===t||t.setJitter(e),this)}},{key:"reconnectionDelayMax",value:function(e){var t;return void 0===e?this._reconnectionDelayMax:(this._reconnectionDelayMax=e,null===(t=this.backoff)||void 0===t||t.setMax(e),this)}},{key:"timeout",value:function(e){return arguments.length?(this._timeout=e,this):this._timeout}},{key:"maybeReconnectOnOpen",value:function(){!this._reconnecting&&this._reconnection&&0===this.backoff.attempts&&this.reconnect()}},{key:"open",value:function(e){var t=this;if(~this._readyState.indexOf("open"))return this;this.engine=new ge(this.uri,this.opts);var n=this.engine,r=this;this._readyState="opening",this.skipReconnect=!1;var i=qe(n,"open",(function(){r.onopen(),e&&e()})),o=function(n){t.cleanup(),t._readyState="closed",t.emitReserved("error",n),e?e(n):t.maybeReconnectOnOpen()},s=qe(n,"error",o);if(!1!==this._timeout){var a=this._timeout,c=this.setTimeoutFn((function(){i(),o(new Error("timeout")),n.close()}),a);this.opts.autoUnref&&c.unref(),this.subs.push((function(){t.clearTimeoutFn(c)}))}return this.subs.push(i),this.subs.push(s),this}},{key:"connect",value:function(e){return this.open(e)}},{key:"onopen",value:function(){this.cleanup(),this._readyState="open",this.emitReserved("open");var e=this.engine;this.subs.push(qe(e,"ping",this.onping.bind(this)),qe(e,"data",this.ondata.bind(this)),qe(e,"error",this.onerror.bind(this)),qe(e,"close",this.onclose.bind(this)),qe(this.decoder,"decoded",this.ondecoded.bind(this)))}},{key:"onping",value:function(){this.emitReserved("ping")}},{key:"ondata",value:function(e){try{this.decoder.add(e)}catch(e){this.onclose("parse error",e)}}},{key:"ondecoded",value:function(e){var t=this;ce((function(){t.emitReserved("packet",e)}),this.setTimeoutFn)}},{key:"onerror",value:function(e){this.emitReserved("error",e)}},{key:"socket",value:function(e,t){var n=this.nsps[e];return n?this._autoConnect&&!n.active&&n.connect():(n=new Ue(this,e,t),this.nsps[e]=n),n}},{key:"_destroy",value:function(e){for(var t=0,n=Object.keys(this.nsps);t=this._reconnectionAttempts)this.backoff.reset(),this.emitReserved("reconnect_failed"),this._reconnecting=!1;else{var n=this.backoff.duration();this._reconnecting=!0;var r=this.setTimeoutFn((function(){t.skipReconnect||(e.emitReserved("reconnect_attempt",t.backoff.attempts),t.skipReconnect||t.open((function(n){n?(t._reconnecting=!1,t.reconnect(),e.emitReserved("reconnect_error",n)):t.onreconnect()})))}),n);this.opts.autoUnref&&r.unref(),this.subs.push((function(){e.clearTimeoutFn(r)}))}}},{key:"onreconnect",value:function(){var e=this.backoff.attempts;this._reconnecting=!1,this.backoff.reset(),this.emitReserved("reconnect",e)}}]),s}(U),Me={};function Ve(t,n){"object"===e(t)&&(n=t,t=void 0);var r,i=function(e){var t=arguments.length>1&&void 0!==arguments[1]?arguments[1]:"",n=arguments.length>2?arguments[2]:void 0,r=e;n=n||"undefined"!=typeof location&&location,null==e&&(e=n.protocol+"//"+n.host),"string"==typeof e&&("/"===e.charAt(0)&&(e="/"===e.charAt(1)?n.protocol+e:n.host+e),/^(https?|wss?):\/\//.test(e)||(e=void 0!==n?n.protocol+"//"+e:"https://"+e),r=ve(e)),r.port||(/^(http|ws)$/.test(r.protocol)?r.port="80":/^(http|ws)s$/.test(r.protocol)&&(r.port="443")),r.path=r.path||"/";var i=-1!==r.host.indexOf(":")?"["+r.host+"]":r.host;return r.id=r.protocol+"://"+i+":"+r.port+t,r.href=r.protocol+"://"+i+(n&&n.port===r.port?"":":"+r.port),r}(t,(n=n||{}).path||"/socket.io"),o=i.source,s=i.id,a=i.path,c=Me[s]&&a in Me[s].nsps;return n.forceNew||n["force new connection"]||!1===n.multiplex||c?r=new Fe(o,n):(Me[s]||(Me[s]=new Fe(o,n)),r=Me[s]),i.query&&!n.query&&(n.query=i.queryKey),r.socket(i.path,n)}return i(Ve,{Manager:Fe,Socket:Ue,io:Ve,connect:Ve}),Ve})); +//# sourceMappingURL=socket.io.min.js.map diff --git a/dimos/navigation/movement_manager/movement_manager.py b/dimos/navigation/movement_manager/movement_manager.py index ed12dc93ac..20d1ffe7c7 100644 --- a/dimos/navigation/movement_manager/movement_manager.py +++ b/dimos/navigation/movement_manager/movement_manager.py @@ -82,6 +82,9 @@ def stop(self) -> None: super().stop() def _on_click(self, msg: PointStamped) -> None: + if not self._is_navigation_click(msg): + logger.debug("Ignored non-navigation click", frame_id=msg.frame_id) + return if not all(math.isfinite(v) for v in (msg.x, msg.y, msg.z)): logger.warning("Ignored invalid click", x=msg.x, y=msg.y, z=msg.z) return @@ -94,6 +97,10 @@ def _on_click(self, msg: PointStamped) -> None: return logger.debug("Goal", x=round(msg.x, 1), y=round(msg.y, 1), z=round(msg.z, 1)) + # Cancel any active task (e.g. YOLOE bbox tracking) before starting A* navigation. + # Only stop_movement is published here — not the NaN-goal cancel used by _cancel_goal — + # because the new real goal below will override the planner state immediately. + self.stop_movement.publish(Bool(data=True)) self.way_point.publish(msg) self.goal.publish(msg) @@ -140,3 +147,8 @@ def _on_teleop(self, msg: Twist) -> None: ), ) self.cmd_vel.publish(scaled) + + @staticmethod + def _is_navigation_click(msg: PointStamped) -> bool: + frame_parts = msg.frame_id.strip("/").split("/") + return "color_image" not in frame_parts diff --git a/dimos/navigation/movement_manager/test_movement_manager.py b/dimos/navigation/movement_manager/test_movement_manager.py index e266ad3e98..e764df6a33 100644 --- a/dimos/navigation/movement_manager/test_movement_manager.py +++ b/dimos/navigation/movement_manager/test_movement_manager.py @@ -68,8 +68,8 @@ def _twist(lx=0.0): return Twist(linear=Vector3(lx, 0, 0), angular=Vector3(0, 0, 0)) -def _click(x=1.0, y=2.0, z=0.0): - return PointStamped(ts=time.time(), frame_id="map", x=x, y=y, z=z) +def _click(x=1.0, y=2.0, z=0.0, frame_id="map"): + return PointStamped(ts=time.time(), frame_id=frame_id, x=x, y=y, z=z) def test_teleop_suppresses_nav_and_cancels_goal(manager_and_captured): @@ -112,6 +112,18 @@ def test_valid_click_publishes_goal(manager_and_captured): assert captured.way_point == [click] +def test_camera_click_does_not_publish_goal_or_stop(manager_and_captured): + """A 2D camera click belongs to bbox selection, not map navigation.""" + manager, captured = manager_and_captured + click = _click(x=220.0, y=120.0, z=0.0, frame_id="/world/color_image/yoloe_detections") + + manager._on_click(click) + + assert captured.goal == [] + assert captured.way_point == [] + assert captured.stop_movement == [] + + def test_invalid_clicks_rejected(manager_and_captured): """NaN, Inf, and out-of-range clicks should not publish.""" manager, captured = manager_and_captured diff --git a/dimos/robot/all_blueprints.py b/dimos/robot/all_blueprints.py index 3d101cca79..590ad2bd3b 100644 --- a/dimos/robot/all_blueprints.py +++ b/dimos/robot/all_blueprints.py @@ -17,6 +17,7 @@ all_blueprints = { "alfred-nav": "dimos.robot.diy.alfred.blueprints.alfred_nav:alfred_nav", + "bbox-distance-follow": "dimos.robot.custom.blueprints.bbox_distance_follow:bbox_distance_follow", "coordinator-basic": "dimos.control.blueprints.basic:coordinator_basic", "coordinator-cartesian-ik-mock": "dimos.control.blueprints.teleop:coordinator_cartesian_ik_mock", "coordinator-cartesian-ik-piper": "dimos.control.blueprints.teleop:coordinator_cartesian_ik_piper", @@ -58,6 +59,7 @@ "drone-agentic": "dimos.robot.drone.blueprints.agentic.drone_agentic:drone_agentic", "drone-basic": "dimos.robot.drone.blueprints.basic.drone_basic:drone_basic", "dual-xarm6-planner": "dimos.manipulation.blueprints:dual_xarm6_planner", + "go2-marauders-map": "dimos.apps.marauders_map.blueprint:go2_marauders_map", "keyboard-teleop-openarm": "dimos.robot.manipulators.openarm.blueprints:keyboard_teleop_openarm", "keyboard-teleop-openarm-mock": "dimos.robot.manipulators.openarm.blueprints:keyboard_teleop_openarm_mock", "keyboard-teleop-piper": "dimos.robot.manipulators.piper.blueprints:keyboard_teleop_piper", @@ -103,10 +105,12 @@ "unitree-go2-keyboard-teleop": "dimos.robot.unitree.go2.blueprints.basic.unitree_go2_keyboard_teleop:unitree_go2_keyboard_teleop", "unitree-go2-markers": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2:unitree_go2_markers", "unitree-go2-memory": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2:unitree_go2_memory", + "unitree-go2-patrol": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2_patrol:unitree_go2_patrol", "unitree-go2-relocalization": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2:unitree_go2_relocalization", "unitree-go2-ros": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2_ros:unitree_go2_ros", "unitree-go2-security": "dimos.robot.unitree.go2.blueprints.agentic.unitree_go2_security:unitree_go2_security", "unitree-go2-spatial": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2_spatial:unitree_go2_spatial", + "unitree-go2-startup-self-check": "dimos.robot.custom.blueprints.go2_startup_self_check:unitree_go2_startup_self_check", "unitree-go2-temporal-memory": "dimos.robot.unitree.go2.blueprints.agentic.unitree_go2_temporal_memory:unitree_go2_temporal_memory", "unitree-go2-vlm-stream-test": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2_vlm_stream_test:unitree_go2_vlm_stream_test", "unitree-go2-webrtc-keyboard-teleop": "dimos.robot.unitree.go2.blueprints.basic.unitree_go2_webrtc_keyboard_teleop:unitree_go2_webrtc_keyboard_teleop", @@ -119,13 +123,20 @@ "xarm6-planner-only": "dimos.manipulation.blueprints:xarm6_planner_only", "xarm7-planner-coordinator": "dimos.manipulation.blueprints:xarm7_planner_coordinator", "xarm7-planner-coordinator-agent": "dimos.manipulation.blueprints:xarm7_planner_coordinator_agent", + "yoloe-keyboard-teleop": "dimos.robot.custom.blueprints.yoloe_keyboard_teleop:yoloe_keyboard_teleop", + "yoloe-spatial-standoff-follow": "dimos.robot.custom.blueprints.yoloe_spatial_standoff_follow:yoloe_spatial_standoff_follow", + "yoloe-spatial-standoff-follow-ui": "dimos.robot.custom.blueprints.yoloe_spatial_standoff_follow:yoloe_spatial_standoff_follow_ui", + "yoloe-target-lock-distance-follow": "dimos.robot.custom.blueprints.yoloe_target_lock_distance_follow:yoloe_target_lock_distance_follow", + "yoloe-tracking-test": "dimos.robot.custom.blueprints.yoloe_tracking_test:yoloe_tracking_test", } all_modules = { "alfred-high-level": "dimos.robot.diy.alfred.effector_high_level.AlfredHighLevel", "arm-teleop-module": "dimos.teleop.quest.quest_extensions.ArmTeleopModule", + "b-box-distance-behavior-module": "dimos.robot.custom.tasks.bbox_distance_behavior_module.BBoxDistanceBehaviorModule", "b-box-navigation-module": "dimos.navigation.bbox_navigation.BBoxNavigationModule", + "b-box-selection-module": "dimos.robot.custom.modules.bbox_selection_module.BBoxSelectionModule", "b1-connection-module": "dimos.robot.unitree.b1.connection.B1ConnectionModule", "camera-module": "dimos.hardware.sensors.camera.module.CameraModule", "cartesian-motion-controller": "dimos.manipulation.control.servo_control.cartesian_motion_controller.CartesianMotionController", @@ -152,6 +163,7 @@ "go2-connection": "dimos.robot.unitree.go2.connection.GO2Connection", "go2-fleet-connection": "dimos.robot.unitree.go2.fleet_connection.Go2FleetConnection", "go2-memory": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2.Go2Memory", + "go2-startup-self-check": "dimos.robot.custom.modules.go2_startup_self_check_module.Go2StartupSelfCheck", "google-maps-skill-container": "dimos.agents.skills.google_maps_skill_container.GoogleMapsSkillContainer", "gps-nav-skill-container": "dimos.agents.skills.gps_nav_skill.GpsNavSkillContainer", "grasp-gen-module": "dimos.manipulation.grasping.graspgen_module.GraspGenModule", @@ -182,6 +194,7 @@ "object-tracking": "dimos.perception.object_tracker.ObjectTracking", "osm-skill": "dimos.agents.skills.osm.OsmSkill", "path-follower": "dimos.navigation.nav_stack.modules.path_follower.path_follower.PathFollower", + "patrol-module": "dimos.robot.unitree.go2.blueprints.smart.unitree_go2_patrol.PatrolModule", "patrolling-module": "dimos.navigation.patrolling.module.PatrollingModule", "perceive-loop-skill": "dimos.perception.perceive_loop_skill.PerceiveLoopSkill", "person-follow-skill-container": "dimos.agents.skills.person_follow.PersonFollowSkillContainer", @@ -194,6 +207,7 @@ "real-sense-camera": "dimos.hardware.sensors.camera.realsense.camera.RealSenseCamera", "receiver-module": "dimos.utils.demo_image_encoding.ReceiverModule", "recorder": "dimos.memory2.module.Recorder", + "reid-map-module": "dimos.apps.marauders_map.module.ReidMapModule", "reid-module": "dimos.perception.detection.reid.module.ReidModule", "relocalization-module": "dimos.mapping.relocalization.module.RelocalizationModule", "replanning-a-star-planner": "dimos.navigation.replanning_a_star.module.ReplanningAStarPlanner", @@ -204,8 +218,11 @@ "simple-phone-teleop": "dimos.teleop.phone.phone_extensions.SimplePhoneTeleop", "simple-planner": "dimos.navigation.nav_stack.modules.simple_planner.simple_planner.SimplePlanner", "spatial-memory": "dimos.perception.spatial_perception.SpatialMemory", + "spatial-target-lock-module": "dimos.robot.custom.modules.spatial_target_lock_module.SpatialTargetLockModule", "speak-skill": "dimos.agents.skills.speak_skill.SpeakSkill", "tare-planner": "dimos.navigation.nav_stack.modules.tare_planner.tare_planner.TarePlanner", + "target-lock-module": "dimos.robot.custom.modules.target_lock_module.TargetLockModule", + "target-standoff-behavior-module": "dimos.robot.custom.tasks.target_standoff_behavior_module.TargetStandoffBehaviorModule", "temporal-memory": "dimos.perception.experimental.temporal_memory.temporal_memory.TemporalMemory", "terrain-analysis": "dimos.navigation.nav_stack.modules.terrain_analysis.terrain_analysis.TerrainAnalysis", "terrain-map-ext": "dimos.navigation.nav_stack.modules.terrain_map_ext.terrain_map_ext.TerrainMapExt", @@ -219,5 +236,6 @@ "wavefront-frontier-explorer": "dimos.navigation.frontier_exploration.wavefront_frontier_goal_selector.WavefrontFrontierExplorer", "web-input": "dimos.agents.web_human_input.WebInput", "websocket-vis-module": "dimos.web.websocket_vis.websocket_vis_module.WebsocketVisModule", + "yoloe-tracking-module": "dimos.robot.custom.modules.yoloe_tracking_module.YoloeTrackingModule", "zed-camera": "dimos.hardware.sensors.camera.zed.camera.ZEDCamera", } diff --git a/dimos/robot/custom/README.md b/dimos/robot/custom/README.md new file mode 100644 index 0000000000..d46ef471ae --- /dev/null +++ b/dimos/robot/custom/README.md @@ -0,0 +1,285 @@ +# Custom Robot Blueprints + +## 目录结构 + +```text +dimos/robot/custom/ +├── modules/ # 纯业务逻辑,无 blueprint / vis 代码 +│ ├── bbox_selection_module.py # BBoxSelectionModule, BBoxSelectionConfig +│ ├── target_lock_module.py # TargetLockModule, TargetLockConfig +│ ├── spatial_target_lock_module.py # SpatialTargetLockModule, SpatialTargetLockConfig +│ ├── yoloe_tracking_module.py # YoloeTrackingModule, YoloeTrackingConfig +│ └── go2_startup_self_check_module.py # Go2StartupSelfCheck, Go2StartupSelfCheckConfig +├── tasks/ # 任务实现:每个 task 自带状态机 +│ ├── bbox_distance_behavior_module.py # BBoxDistanceBehaviorModule, BBoxDistanceBehaviorConfig +│ └── target_standoff_behavior_module.py # TargetStandoffBehaviorModule, TargetStandoffBehaviorConfig +├── visualization/ # Detection2DArray -> Rerun 2D overlay 适配 +│ └── detection2d_overlay.py # detection_array_to_rerun / detections_overlay / +│ # selected_bbox_overlay / yoloe_overlay +├── blueprints/ # autoconnect 组装 + rerun config + requirements +│ ├── bbox_distance_follow.py # 最小距离任务蓝图 +│ ├── yoloe_spatial_standoff_follow.py # YOLOE + 空间锁定 + 近身/后撤/返回 +│ ├── yoloe_target_lock_distance_follow.py # 推荐闭环示例:YOLOE + selection + target lock + task +│ ├── yoloe_keyboard_teleop.py # 键盘遥控 + YOLOE(非本文重点) +│ ├── yoloe_tracking_test.py # 仅检测/跟踪验证 +│ └── go2_startup_self_check.py # 开机自检蓝图 +└── tests/ + ├── test_bbox_distance_behavior_module.py + ├── test_spatial_target_lock_module.py + ├── test_target_standoff_behavior_module.py + └── test_target_lock_module.py +``` + +依赖方向:`blueprints/` -> `modules/` + `tasks/` + `visualization/`;`tests/` -> `modules/` + `tasks/`。 + +## 如何最小化测试 + +建议按下面顺序做最小验证,能最快定位问题在哪一层。 + +### 1. 先跑 task + lock 单元测试 + +```bash +source .venv/bin/activate +pytest dimos/robot/custom/tests/test_bbox_distance_behavior_module.py dimos/robot/custom/tests/test_target_lock_module.py dimos/robot/custom/tests/test_spatial_target_lock_module.py dimos/robot/custom/tests/test_target_standoff_behavior_module.py -q +``` + +### 2. 再验证 blueprint 自动注册 + +```bash +source .venv/bin/activate +pytest dimos/robot/test_all_blueprints_generation.py +``` + +### 3. 最后跑最小闭环蓝图 + +```bash +.venv/bin/dimos --replay run yoloe-target-lock-distance-follow +.venv/bin/dimos --replay run yoloe-spatial-standoff-follow +``` + +这个顺序可以把问题快速归类到: +- 单模块逻辑 +- 蓝图注册/装配 +- 运行时流转 + +## 当前 blueprint 状态机 + +### 1) yoloe-spatial-standoff-follow(空间锁定 + near/far waypoint 任务) + +组成: +- `unitree_go2` +- `YoloeTrackingModule.blueprint()` +- `BBoxSelectionModule.blueprint()` +- `SpatialTargetLockModule.blueprint()` +- `TargetStandoffBehaviorModule.blueprint()` +- `KeyboardTeleop.blueprint(publish_only_when_active=True)` +- `MovementManager.blueprint()` + +核心行为: +- 用户在 Camera 视图点击 YOLOE bbox。 +- `SpatialTargetLockModule` 把选中 bbox 投影到 lidar 世界点云,缓存 `target_pose`。 +- YOLOE 丢失时继续发布最后一次 `target_pose`,状态为 `using_memory`。 +- 当前任务策略里 YOLOE 只负责提供初始 bbox;空间点锁定后不再用 YOLOE 重匹配刷新目标点。 +- `TargetStandoffBehaviorModule` 固定保存三个点:`target`、离 target 约 `0.5m` 的 `near`、离 target 约 `1.5m` 的 `far`。 +- 行为模块只编排 waypoint,不直接发布速度;实际移动通过 `ReplanningAStarPlanner.set_goal()` 交给现有路径导航。 +- 默认流程:导航到 `near` -> 倒计时停留 `10s`(每 `2s` 打一次 countdown log)-> 导航到 `far` -> 倒计时停留 `10s` -> 导航回 `near` -> 完成并清空选择。 + +状态机: + +```mermaid +stateDiagram-v2 + [*] --> idle + idle --> navigating_near: target_pose -> set_goal(near) + navigating_near --> dwelling_near: planner goal_reached + dwelling_near --> navigating_far: 10s -> set_goal(far) + navigating_far --> dwelling_far: planner goal_reached + dwelling_far --> returning_near: 10s -> set_goal(near) + returning_near --> done: planner goal_reached + done --> idle: clear_selection / new run +``` + +模块消息流: + +```mermaid +flowchart LR + C[color_image] --> Y[YoloeTrackingModule] + Y --> D[detections] + D --> S[BBoxSelectionModule] + S --> USB[user_selected_bbox] + D --> STL[SpatialTargetLockModule] + USB --> STL + L[lidar] --> STL + I[camera_info] --> STL + STL --> LB[locked_bbox] + STL --> TP[target_pose] + TP --> B[TargetStandoffBehaviorModule] + B -->|set_goal near/far/near| P[ReplanningAStarPlanner] + P -->|nav_cmd_vel| MM[MovementManager] + KB[KeyboardTeleop] -->|tele_cmd_vel| MM + MM -->|cmd_vel| V[Go2Connection] + MM -->|stop_movement| STL + MM -->|stop_movement / teleop_active| B +``` + +### 2) yoloe-target-lock-distance-follow(2D lock + one-shot distance follow) + +组成: +- `unitree_go2_basic` +- `YoloeTrackingModule.blueprint()` +- `BBoxSelectionModule.blueprint()` +- `TargetLockModule.blueprint()` +- `BBoxDistanceBehaviorModule.blueprint(approach_distance=0.2)` +- `KeyboardTeleop.blueprint(publish_only_when_active=True)` +- `MovementManager.blueprint()` + +键盘控制(pygame 窗口需要焦点): +- W / S — 前进 / 后退 +- A / D — 左转 / 右转 +- Q / E — 横移 +- Shift — 加速 (2×) | Ctrl — 慢速 (0.5×) +- Space — 紧急停止 | Esc / Q — 退出 + +TargetLock 状态机: + +```mermaid +stateDiagram-v2 + [*] --> unselected + unselected --> locked: selected_bbox 有效 + locked --> searching: 当前帧 id 不匹配 + searching --> locked: 重找成功 + searching --> lost: 超时未找回 + lost --> locked: 用户重新选择 + locked --> unselected: clear_selection / selected_bbox 为空 + searching --> unselected: clear_selection / selected_bbox 为空 +``` + +模块消息流: + +```mermaid +flowchart LR + C[color_image] --> Y[YoloeTrackingModule] + Y --> D[detections] + D --> S[BBoxSelectionModule] + S --> USB[user_selected_bbox] + D --> T[TargetLockModule] + USB --> T + T --> LB[locked_bbox / selected_bbox] + LB --> B[BBoxDistanceBehaviorModule] + L[lidar] --> B + I[camera_info] --> B + B -->|nav_cmd_vel| MM[MovementManager] + KB[KeyboardTeleop] -->|tele_cmd_vel| MM + MM -->|cmd_vel| V[Go2Connection] + MM -->|stop_movement = teleop_active| B +``` + +速度优先级(MovementManager): +- 键盘有输入时:`tele_cmd_vel` 优先,`nav_cmd_vel` 被压制(冷却 1 s) +- 冷却结束后:`nav_cmd_vel`(任务)恢复控制 +- 键盘输入同时触发 `stop_movement → teleop_active` → 任务重置为 idle + +Visualization 点击闭环: + +```mermaid +flowchart LR + A[dimos-view Camera click] --> B[RerunWebSocketServer.clicked_point] + B --> C[BBoxSelectionModule._on_clicked_point] + C --> D[user_selected_bbox] + D --> E[TargetLockModule._on_selected_bbox] + F[detections] --> G[TargetLockModule._on_detections] + E --> H[locked_bbox] + G --> H + H --> K[/color_image/locked_bbox] +``` + +### 2) bbox-distance-follow(最小任务链路) + +组成: +- `unitree_go2_basic` +- `Detection2DModule.blueprint(camera_info=GO2Connection.camera_info_static, publish_detection_images=False)` +- `BBoxSelectionModule.blueprint()` +- `BBoxDistanceBehaviorModule.blueprint()` + +任务状态机: + +```text +idle -> approaching -> done +``` + +状态转移规则: +- 选中 bbox(非空)-> `approaching` +- 到达 `0.2m ± 0.05m` -> `done` +- 选择清空(空 bbox)-> `idle` + +## Task Module + +`BBoxDistanceBehaviorModule` 位于 `tasks/`,职责是“执行任务”,不是“做检测或选择”。 + +职责: +- 输入:`selected_bbox` + `lidar` + `camera_info` +- 输出:`cmd_vel` + `behavior_status` +- 行为目标:点中目标后自动靠近到 0.2m 并结束 + +RPC: +- `start_bbox_distance_behavior(approach_distance=None) -> str` +- `stop_bbox_distance_behavior() -> str` + +默认参数: +- `command_hz = 20.0` +- `approach_distance = 0.2` +- `depth_percentile = 25.0` +- `max_linear_speed = 0.45` +- `max_angular_speed = 0.8` + +点击和深度调参(实机/仿真常用): +- `BBoxSelectionConfig.click_hit_padding_px`:点击命中 bbox 的边缘扩张像素。 +- `BBoxSelectionConfig.click_snap_distance_px`:未命中时吸附到最近 bbox 的最大像素距离。 +- `BBoxDistanceBehaviorConfig.depth_bbox_padding_px`:深度采样时 bbox 初始扩张像素。 +- `BBoxDistanceBehaviorConfig.depth_bbox_max_padding_px`:深度采样允许的最大扩张像素。 +- `BBoxDistanceBehaviorConfig.depth_bbox_padding_step_px`:深度采样逐级扩张步长。 + +推荐调参顺序: +1. 点不中 bbox:先增大 `click_hit_padding_px`,再增大 `click_snap_distance_px`。 +2. 点中了但不前进(常见 `no_depth_for_target`):先增大 `depth_bbox_padding_px`,再增大 `depth_bbox_max_padding_px`。 +3. 深度跳变明显:降低 `depth_bbox_max_padding_px` 或增大 `depth_percentile`。 + +边界行为: +- bbox 为空:发布 `Twist.zero()` 并回到 `idle` +- 深度无效:发布 `Twist.zero()` 并等待 +- 达到目标或 stop:发布 `Twist.zero()` + +## 排查故障 + +### 1. 看不到框 +- 检查检测流是否有输出(`detections` 是否持续更新)。 +- 检查 overlay 绑定是否正确(`/color_image/yoloe_detections`、`/color_image/selected_bbox`、`/color_image/locked_bbox`)。 +- replay 场景下确认模型文件存在。 + +### 2. 点击后不进入 approaching +- 检查 `clicked_point` 是否到达 `BBoxSelectionModule`。 +- 检查 `selected_bbox` 是否非空。 +- 检查 `behavior_status` 是否切到 `approaching`。 + +### 3. 锁定后很快丢失 +- 检查 tracking id 是否稳定。 +- 检查 `TargetLockModule.search_timeout_sec` 是否过短。 +- 检查 `reacquire_by_class` 是否开启。 +- 观察 `lock_status` 是否进入 `searching/lost`。 + +### 4. 不前进但没 done +- 检查点云是否正确投影到 bbox。 +- 检查 `camera_info` 是否匹配当前相机。 +- 检查 bbox 是否过小或位置异常导致深度采样失败。 + +### 5. 注册/运行异常 +- 重新执行: + +```bash +source .venv/bin/activate +pytest dimos/robot/test_all_blueprints_generation.py +``` + +### 6. 建议排查顺序 +1. 单测:`test_bbox_distance_behavior_module.py` + `test_target_lock_module.py` +2. 注册:`test_all_blueprints_generation.py` +3. 运行:`dimos --replay run yoloe-target-lock-distance-follow` diff --git a/dimos/robot/custom/__init__.py b/dimos/robot/custom/__init__.py new file mode 100644 index 0000000000..e69de29bb2 diff --git a/dimos/robot/custom/blueprints/__init__.py b/dimos/robot/custom/blueprints/__init__.py new file mode 100644 index 0000000000..bc1a2ce5cc --- /dev/null +++ b/dimos/robot/custom/blueprints/__init__.py @@ -0,0 +1,13 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/dimos/robot/custom/blueprints/bbox_distance_follow.py b/dimos/robot/custom/blueprints/bbox_distance_follow.py new file mode 100644 index 0000000000..8b92554911 --- /dev/null +++ b/dimos/robot/custom/blueprints/bbox_distance_follow.py @@ -0,0 +1,123 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +from pathlib import Path +from typing import Any + +from dimos.core.coordination.blueprints import autoconnect # 导入蓝图组合函数 +from dimos.core.coordination.module_coordinator import ModuleCoordinator # 导入直接运行蓝图所需协调器 +from dimos.core.global_config import global_config # 导入全局配置,用于复用 viewer backend 选择 +from dimos.core.transport import LCMTransport # 导入 LCM transport,用于固定 topic 名称 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray # 导入 2D 检测数组消息类型 +from dimos.perception.detection.module2D import Detection2DModule # 导入多 bbox 检测模块 +from dimos.robot.custom.tasks.bbox_distance_behavior_module import BBoxDistanceBehaviorModule # 导入距离任务模块 +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule # 导入 bbox 选择模块 +from dimos.robot.custom.visualization.detection2d_overlay import ( + detections_overlay, # 黄色候选 bbox overlay + selected_bbox_overlay, # 绿色 selected bbox overlay +) +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, # 导入 Go2 默认 rerun 配置,用于局部扩展 + unitree_go2_basic, # 导入 Go2 基础蓝图 +) +from dimos.robot.unitree.go2.connection import GO2Connection # 导入 Go2 连接类,复用静态 camera_info +from dimos.utils.data import get_data_dir # 导入数据目录解析函数,用于定位本地模型文件 +from dimos.visualization.vis_module import vis_module # 导入 viewer 模块工厂 + +_YOLO_MODEL_NAME = "yolo11n.pt" # Detection2DModule 默认需要的 YOLO 模型文件名 +_YOLO_MODEL_PATH = get_data_dir("models_yolo") / _YOLO_MODEL_NAME # 本地预下载模型路径 +_YOLO_MODEL_URL = "https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo11n.pt" # 官方权重下载地址 + + +def _require_yolo11n_model() -> str | None: # 定义 blueprint 启动前的本地 YOLO 权重检查 + if _YOLO_MODEL_PATH.exists(): # 如果本地权重已经预下载完成 + return None # 返回 None 表示 requirement 通过 + return _format_missing_yolo_model_error(_YOLO_MODEL_PATH) # 返回明确的预下载指令 + + +def _format_missing_yolo_model_error(model_path: Path) -> str: # 定义缺失模型时的错误消息构造函数 + return ( + f"Missing local YOLO model: {model_path}. " + "Real Go2 tests are usually offline, so pre-download it while online: " + f"mkdir -p {model_path.parent} && curl -L -o {model_path} {_YOLO_MODEL_URL}" + ) + + +def _bbox_distance_rerun_blueprint() -> Any: # 定义 bbox-distance-follow 专用 Rerun 布局 + import rerun as rr # 延迟导入 Rerun,避免非 viewer 路径承担额外导入成本 + import rerun.blueprint as rrb # 延迟导入 Rerun blueprint API + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + "-world/color_image/detections", # 在 3D 视图中隐藏 YOLO 2D bbox overlay + "-world/color_image/selected_bbox", # 在 3D 视图中隐藏 selected 2D bbox overlay + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D( + plane=rr.components.Plane3D.XY.with_distance(0.5), + ), + overrides={ + "world/lidar": rrb.EntityBehavior(visible=False), + }, + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_bbox_distance_rerun_config = { # bbox-distance-follow 专用 rerun 配置 + **go2_rerun_config, # 继承 Go2 默认 Camera/3D 布局和限频设置 + "blueprint": _bbox_distance_rerun_blueprint, # 使用专用布局,避免 3D view 渲染 2D selected bbox + "visual_override": { + **go2_rerun_config["visual_override"], # 保留 Go2 默认 camera_info、map 和 costmap 转换逻辑 + "world/color_image/detections": detections_overlay, # 对 YOLO 多 bbox topic 做黄色 Camera overlay + "world/color_image/selected_bbox": selected_bbox_overlay, # 对 selected bbox topic 做绿色 2D overlay + }, +} + +_bbox_distance_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_bbox_distance_rerun_config, +) + + +bbox_distance_follow = autoconnect( # 定义 CLI 可运行的 bbox-distance-follow 蓝图 + unitree_go2_basic, # Go2 基础连接和 viewer + _bbox_distance_vis, # 替换 Go2 默认 viewer 配置,增加 bbox overlay + Detection2DModule.blueprint( + camera_info=GO2Connection.camera_info_static, # 复用 Go2 静态相机内参 + publish_detection_images=False, # 关闭 cropped detected_image,避免 3D view 无 Pinhole 警告 + ), + BBoxSelectionModule.blueprint(), # 从多 bbox 中选择单个 bbox + BBoxDistanceBehaviorModule.blueprint(approach_distance=0.2), # 点选后自动靠近到 0.2m 并结束 +).global_config( + n_workers=6, # 给 Go2、viewer、detector、selection 和 behavior 留足 worker + robot_model="unitree_go2", +).transports( + { + ("detections", Detection2DArray): LCMTransport("/color_image/detections", Detection2DArray), # 固定 YOLO bbox topic + ("selected_bbox", Detection2DArray): LCMTransport("/color_image/selected_bbox", Detection2DArray), # 固定 selected bbox topic + } +).requirements( + _require_yolo11n_model, # 检查 YOLO 权重是否已经预下载,避免真机断网时隐式下载失败 +) + + +__all__ = [ + "bbox_distance_follow", +] + + +if __name__ == "__main__": + ModuleCoordinator.build(bbox_distance_follow).loop() diff --git a/dimos/robot/custom/blueprints/go2_startup_self_check.py b/dimos/robot/custom/blueprints/go2_startup_self_check.py new file mode 100644 index 0000000000..cf50d8f49d --- /dev/null +++ b/dimos/robot/custom/blueprints/go2_startup_self_check.py @@ -0,0 +1,20 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +from dimos.core.coordination.blueprints import autoconnect # 导入蓝图组合函数 +from dimos.robot.custom.modules.go2_startup_self_check_module import Go2StartupSelfCheck # 导入自检模块 +from dimos.robot.unitree.go2.blueprints.smart.unitree_go2 import unitree_go2 # 导入现有 Go2 完整蓝图 + +# 复用现有 Go2 完整蓝图,再额外加上一次性启动自检发布者。 +unitree_go2_startup_self_check = autoconnect( # 定义 CLI 可运行的组合蓝图 + unitree_go2, # 第一部分:现有 Go2 连接、viewer、导航和 MovementManager + Go2StartupSelfCheck.blueprint(), # 第二部分:新增的启动自检模块 +).remappings( # 把自检速度命令接入现有 MovementManager 的手动控制入口 + [ + (Go2StartupSelfCheck, "cmd_vel", "tele_cmd_vel"), # 自检发布到 tele_cmd_vel,由 MovementManager 转成 cmd_vel + ] +) + + +__all__ = [ + "unitree_go2_startup_self_check", +] diff --git a/dimos/robot/custom/blueprints/yoloe_keyboard_teleop.py b/dimos/robot/custom/blueprints/yoloe_keyboard_teleop.py new file mode 100644 index 0000000000..9bd18ee2d1 --- /dev/null +++ b/dimos/robot/custom/blueprints/yoloe_keyboard_teleop.py @@ -0,0 +1,154 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""YOLOE + Go2 keyboard teleop blueprint. + +Combines YOLOE open-vocabulary detection with keyboard-controlled movement so +you can manually drive the Go2 (in simulation or on real hardware) while +verifying YOLOE detection quality in the Rerun viewer. + +Usage: + # Simulation (recommended for capability checking) + dimos --simulation run yoloe-keyboard-teleop + + # Replay (detection-only, robot does not move) + dimos --replay run yoloe-keyboard-teleop + + # Real Go2 + dimos --robot-ip 192.168.123.161 --rerun-open native run yoloe-keyboard-teleop + +Keyboard controls (pygame window must have focus): + W / S — forward / backward + A / D — rotate left / right + Shift — speed boost (2×) + Ctrl — slow mode (0.5×) + Space — publish zero Twist (stop) + Esc / Q — quit keyboard window +""" + +from __future__ import annotations + +from typing import Any + +from dimos.core.coordination.blueprints import autoconnect +from dimos.core.global_config import global_config +from dimos.core.transport import LCMTransport +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.navigation.movement_manager.movement_manager import MovementManager +from dimos.robot.custom.modules.yoloe_tracking_module import ( + YoloeTrackingModule, + _require_yoloe_lrpc_model, +) +from dimos.robot.custom.visualization.detection2d_overlay import yoloe_overlay +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, + unitree_go2_basic, +) +from dimos.robot.unitree.keyboard_teleop import KeyboardTeleop +from dimos.visualization.vis_module import vis_module + +_YOLOE_DETECTIONS_TOPIC = "/color_image/yoloe_detections" +_YOLOE_DETECTIONS_ENTITY = "world/color_image/yoloe_detections" + + +def _yoloe_keyboard_rerun_blueprint() -> Any: + import rerun as rr + import rerun.blueprint as rrb + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + f"-{_YOLOE_DETECTIONS_ENTITY}", # 在 3D 视图中隐藏 YOLOE 2D bbox overlay + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D( + plane=rr.components.Plane3D.XY.with_distance(0.5), + ), + overrides={ + "world/lidar": rrb.EntityBehavior(visible=False), + }, + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_yoloe_keyboard_rerun_config = { + **go2_rerun_config, + "blueprint": _yoloe_keyboard_rerun_blueprint, + "visual_override": { + **go2_rerun_config["visual_override"], + _YOLOE_DETECTIONS_ENTITY: yoloe_overlay, # 对 YOLOE detections 做青色 Camera overlay + }, +} + +_yoloe_keyboard_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_yoloe_keyboard_rerun_config, +) + +# 在 simulation 模式下 MuJoCo 的 locomotion policy 已占用 CoreMLExecutionProvider, +# 强制 YOLOE 使用 CPU 避免 Metal/MPS 资源竞争导致 worker 崩溃。 +# 在 replay / 真机模式下保持 device=None(自动选择 GPU)。 +# _yoloe_device = "cpu" if global_config.simulation else None +_yoloe_device = None + +yoloe_keyboard_teleop = ( + autoconnect( + unitree_go2_basic, + _yoloe_keyboard_vis, + YoloeTrackingModule.blueprint(device=_yoloe_device), + # MovementManager 作为 velocity hub: + # tele_cmd_vel (来自 KeyboardTeleop) → MovementManager → cmd_vel → GO2Connection + # 与 unitree_go2 的控制路径保持一致,simulation 模式下 MujocoConnection 期望此路径。 + MovementManager.blueprint(), + # publish_only_when_active=True: 松开按键后只发一次零 Twist 再静默, + # 不持续刷 cmd_vel,不干扰 MovementManager 的 nav/tele 优先级逻辑。 + KeyboardTeleop.blueprint(publish_only_when_active=True), + ) + .global_config( + n_workers=8, # basic(4) + YoloeTracking + MovementManager + KeyboardTeleop + vis + robot_model="unitree_go2", + ) + .remappings( + [ + # KeyboardTeleop 输出 cmd_vel,需重命名为 tele_cmd_vel 才能连接到 MovementManager + (KeyboardTeleop, "cmd_vel", "tele_cmd_vel"), + ] + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, + Detection2DArray, + ), + } + ) + .requirements( + _require_yoloe_lrpc_model, + ) +) + +__all__ = ["yoloe_keyboard_teleop"] diff --git a/dimos/robot/custom/blueprints/yoloe_spatial_standoff_follow.py b/dimos/robot/custom/blueprints/yoloe_spatial_standoff_follow.py new file mode 100644 index 0000000000..116ee73171 --- /dev/null +++ b/dimos/robot/custom/blueprints/yoloe_spatial_standoff_follow.py @@ -0,0 +1,233 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from typing import Any + +from dimos.apps.marauders_map.module import ReidMapModule +from dimos.core.coordination.blueprints import autoconnect +from dimos.core.coordination.module_coordinator import ModuleCoordinator +from dimos.core.global_config import global_config +from dimos.core.transport import LCMTransport +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.navigation.replanning_a_star.module import ReplanningAStarPlanner +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule +from dimos.robot.custom.modules.spatial_target_lock_module import SpatialTargetLockModule +from dimos.robot.custom.modules.yoloe_tracking_module import ( + YoloeTrackingModule, + _require_yoloe_lrpc_model, +) +from dimos.robot.custom.tasks.target_standoff_behavior_module import ( + TargetStandoffBehaviorModule, +) +from dimos.robot.custom.visualization.detection2d_overlay import ( + selected_bbox_overlay, + yoloe_overlay, +) +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, +) +from dimos.robot.unitree.go2.blueprints.smart.unitree_go2 import unitree_go2 +from dimos.robot.unitree.go2.connection import GO2Connection +from dimos.robot.unitree.keyboard_teleop import KeyboardTeleop +from dimos.visualization.vis_module import vis_module + +_YOLOE_DETECTIONS_TOPIC = "/color_image/yoloe_detections" +_USER_SELECTED_BBOX_TOPIC = "/color_image/selected_bbox" +_LOCKED_BBOX_TOPIC = "/color_image/locked_bbox" +_TARGET_POSE_TOPIC = "/target_pose" +_NAV_CMD_VEL_TOPIC = "/nav_cmd_vel" +_TELE_CMD_VEL_TOPIC = "/tele_cmd_vel" + +_YOLOE_DETECTIONS_ENTITY = "world/color_image/yoloe_detections" +_USER_SELECTED_BBOX_ENTITY = "world/color_image/selected_bbox" +_LOCKED_BBOX_ENTITY = "world/color_image/locked_bbox" + + +def _spatial_standoff_rerun_blueprint() -> Any: + import rerun as rr + import rerun.blueprint as rrb + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + f"-{_YOLOE_DETECTIONS_ENTITY}", + f"-{_USER_SELECTED_BBOX_ENTITY}", + f"-{_LOCKED_BBOX_ENTITY}", + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D( + plane=rr.components.Plane3D.XY.with_distance(0.5), + ), + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_spatial_standoff_rerun_config = { + **go2_rerun_config, + "blueprint": _spatial_standoff_rerun_blueprint, + "visual_override": { + **go2_rerun_config["visual_override"], + _YOLOE_DETECTIONS_ENTITY: yoloe_overlay, + _USER_SELECTED_BBOX_ENTITY: selected_bbox_overlay, + _LOCKED_BBOX_ENTITY: selected_bbox_overlay, + }, +} + +_spatial_standoff_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_spatial_standoff_rerun_config, +) + + +yoloe_spatial_standoff_follow = ( + autoconnect( + unitree_go2, + _spatial_standoff_vis, + YoloeTrackingModule.blueprint(), + BBoxSelectionModule.blueprint(), + SpatialTargetLockModule.blueprint(), + TargetStandoffBehaviorModule.blueprint(), + KeyboardTeleop.blueprint(publish_only_when_active=True), + ) + .global_config( + n_workers=16, + robot_model="unitree_go2", + ) + .remappings( + [ + (BBoxSelectionModule, "selected_bbox", "user_selected_bbox"), + (SpatialTargetLockModule, "selected_bbox", "user_selected_bbox"), + (TargetStandoffBehaviorModule, "teleop_active", "stop_movement"), + (KeyboardTeleop, "cmd_vel", "tele_cmd_vel"), + (ReplanningAStarPlanner, "clicked_point", "planner_clicked_point"), + ] + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, + Detection2DArray, + ), + ("user_selected_bbox", Detection2DArray): LCMTransport( + _USER_SELECTED_BBOX_TOPIC, + Detection2DArray, + ), + ("locked_bbox", Detection2DArray): LCMTransport( + _LOCKED_BBOX_TOPIC, + Detection2DArray, + ), + ("target_pose", PoseStamped): LCMTransport( + _TARGET_POSE_TOPIC, + PoseStamped, + ), + ("nav_cmd_vel", Twist): LCMTransport( + _NAV_CMD_VEL_TOPIC, + Twist, + ), + ("tele_cmd_vel", Twist): LCMTransport( + _TELE_CMD_VEL_TOPIC, + Twist, + ), + } + ) + .requirements( + _require_yoloe_lrpc_model, + ) +) + +yoloe_spatial_standoff_follow_ui = ( + autoconnect( + unitree_go2, + _spatial_standoff_vis, + YoloeTrackingModule.blueprint(), + BBoxSelectionModule.blueprint(), + SpatialTargetLockModule.blueprint(), + TargetStandoffBehaviorModule.blueprint(), + ReidMapModule.blueprint( + camera_info=GO2Connection.camera_info_static, + enable_reid=False, + ), + ) + .global_config( + n_workers=16, + robot_model="unitree_go2", + ) + .remappings( + [ + (BBoxSelectionModule, "selected_bbox", "user_selected_bbox"), + (SpatialTargetLockModule, "selected_bbox", "user_selected_bbox"), + (TargetStandoffBehaviorModule, "teleop_active", "stop_movement"), + (ReplanningAStarPlanner, "clicked_point", "planner_clicked_point"), + (ReidMapModule, "cmd_vel", "tele_cmd_vel"), + (ReidMapModule, "pointcloud", "global_map"), + ] + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, + Detection2DArray, + ), + ("user_selected_bbox", Detection2DArray): LCMTransport( + _USER_SELECTED_BBOX_TOPIC, + Detection2DArray, + ), + ("locked_bbox", Detection2DArray): LCMTransport( + _LOCKED_BBOX_TOPIC, + Detection2DArray, + ), + ("target_pose", PoseStamped): LCMTransport( + _TARGET_POSE_TOPIC, + PoseStamped, + ), + ("nav_cmd_vel", Twist): LCMTransport( + _NAV_CMD_VEL_TOPIC, + Twist, + ), + ("tele_cmd_vel", Twist): LCMTransport( + _TELE_CMD_VEL_TOPIC, + Twist, + ), + } + ) + .requirements( + _require_yoloe_lrpc_model, + ) +) + +__all__ = [ + "yoloe_spatial_standoff_follow", + "yoloe_spatial_standoff_follow_ui", +] + + +if __name__ == "__main__": + ModuleCoordinator.build(yoloe_spatial_standoff_follow).loop() diff --git a/dimos/robot/custom/blueprints/yoloe_target_lock_distance_follow.py b/dimos/robot/custom/blueprints/yoloe_target_lock_distance_follow.py new file mode 100644 index 0000000000..b8a2c10b4a --- /dev/null +++ b/dimos/robot/custom/blueprints/yoloe_target_lock_distance_follow.py @@ -0,0 +1,160 @@ +from __future__ import annotations + +from typing import Any + +from dimos.core.coordination.blueprints import autoconnect +from dimos.core.coordination.module_coordinator import ModuleCoordinator +from dimos.core.global_config import global_config +from dimos.core.transport import LCMTransport +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.navigation.movement_manager.movement_manager import MovementManager +from dimos.robot.custom.tasks.bbox_distance_behavior_module import BBoxDistanceBehaviorModule +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule +from dimos.robot.custom.modules.target_lock_module import TargetLockModule +from dimos.robot.custom.modules.yoloe_tracking_module import ( + YoloeTrackingModule, + _require_yoloe_lrpc_model, +) +from dimos.robot.custom.visualization.detection2d_overlay import ( + selected_bbox_overlay, + yoloe_overlay, +) +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, +) +from dimos.robot.unitree.go2.blueprints.smart.unitree_go2 import unitree_go2 +from dimos.robot.unitree.keyboard_teleop import KeyboardTeleop +from dimos.visualization.vis_module import vis_module + +_YOLOE_DETECTIONS_TOPIC = "/color_image/yoloe_detections" +_USER_SELECTED_BBOX_TOPIC = "/color_image/selected_bbox" +_LOCKED_BBOX_TOPIC = "/color_image/locked_bbox" +_NAV_CMD_VEL_TOPIC = "/nav_cmd_vel" +_TELE_CMD_VEL_TOPIC = "/tele_cmd_vel" + +_YOLOE_DETECTIONS_ENTITY = "world/color_image/yoloe_detections" +_USER_SELECTED_BBOX_ENTITY = "world/color_image/selected_bbox" +_LOCKED_BBOX_ENTITY = "world/color_image/locked_bbox" + + +def _target_lock_rerun_blueprint() -> Any: + import rerun as rr + import rerun.blueprint as rrb + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + f"-{_YOLOE_DETECTIONS_ENTITY}", + f"-{_USER_SELECTED_BBOX_ENTITY}", + f"-{_LOCKED_BBOX_ENTITY}", + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D( + plane=rr.components.Plane3D.XY.with_distance(0.5), + ), + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_target_lock_rerun_config = { + **go2_rerun_config, + "blueprint": _target_lock_rerun_blueprint, + "visual_override": { + **go2_rerun_config["visual_override"], + _YOLOE_DETECTIONS_ENTITY: yoloe_overlay, + _USER_SELECTED_BBOX_ENTITY: selected_bbox_overlay, + _LOCKED_BBOX_ENTITY: selected_bbox_overlay, + }, +} + +_target_lock_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_target_lock_rerun_config, +) + + +yoloe_target_lock_distance_follow = ( + autoconnect( + unitree_go2, + _target_lock_vis, + YoloeTrackingModule.blueprint(), + BBoxSelectionModule.blueprint(), + TargetLockModule.blueprint(), + BBoxDistanceBehaviorModule.blueprint(), + # Keyboard teleop: publishes tele_cmd_vel when keys held; silent otherwise. + # MovementManager (from unitree_go2) muxes tele_cmd_vel (priority) + nav_cmd_vel + # (task) → cmd_vel. + # MovementManager emits stop_movement for keyboard/map control. The bbox + # selection + target lock modules consume that signal and clear the active + # target so the old bbox cannot restart the one-shot task on the next frame. + # BBoxDistanceBehaviorModule also emits clear_selection_request when the + # one-shot task completes or is stopped by RPC. + KeyboardTeleop.blueprint(publish_only_when_active=True), + ) + .global_config( + n_workers=16, + robot_model="unitree_go2", + ) + .remappings( + [ + (BBoxSelectionModule, "selected_bbox", "user_selected_bbox"), + (TargetLockModule, "selected_bbox", "user_selected_bbox"), + (TargetLockModule, "locked_bbox", "selected_bbox"), + # Task cmd_vel → MovementManager nav_cmd_vel (lower priority than keyboard) + (BBoxDistanceBehaviorModule, "cmd_vel", "nav_cmd_vel"), + # Keyboard cmd_vel → MovementManager tele_cmd_vel (higher priority) + (KeyboardTeleop, "cmd_vel", "tele_cmd_vel"), + # MovementManager stop_movement → task teleop_active (interrupt on teleop) + (MovementManager, "stop_movement", "teleop_active"), + ] + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, + Detection2DArray, + ), + ("user_selected_bbox", Detection2DArray): LCMTransport( + _USER_SELECTED_BBOX_TOPIC, + Detection2DArray, + ), + ("locked_bbox", Detection2DArray): LCMTransport( + _LOCKED_BBOX_TOPIC, + Detection2DArray, + ), + ("nav_cmd_vel", Twist): LCMTransport( + _NAV_CMD_VEL_TOPIC, + Twist, + ), + ("tele_cmd_vel", Twist): LCMTransport( + _TELE_CMD_VEL_TOPIC, + Twist, + ), + } + ) + .requirements( + _require_yoloe_lrpc_model, + ) +) + +__all__ = [ + "yoloe_target_lock_distance_follow", +] + + +if __name__ == "__main__": + ModuleCoordinator.build(yoloe_target_lock_distance_follow).loop() diff --git a/dimos/robot/custom/blueprints/yoloe_tracking_test.py b/dimos/robot/custom/blueprints/yoloe_tracking_test.py new file mode 100644 index 0000000000..1a9a8ff568 --- /dev/null +++ b/dimos/robot/custom/blueprints/yoloe_tracking_test.py @@ -0,0 +1,103 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +from typing import Any + +from dimos.core.coordination.blueprints import autoconnect # 导入蓝图组合函数 +from dimos.core.coordination.module_coordinator import ModuleCoordinator # 导入直接运行蓝图所需协调器 +from dimos.core.global_config import global_config # 导入全局配置,用于复用 viewer backend 选择 +from dimos.core.transport import LCMTransport # 导入 LCM transport,用于固定 topic 名称 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray # 导入 2D 检测数组消息类型 +from dimos.robot.custom.modules.yoloe_tracking_module import ( + YoloeTrackingModule, # 导入 YOLOE tracking 模块 + _require_yoloe_lrpc_model, # 导入模型文件预检查函数 +) +from dimos.robot.custom.visualization.detection2d_overlay import yoloe_overlay # 导入青色 YOLOE overlay +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import ( + rerun_config as go2_rerun_config, # 导入 Go2 默认 rerun 配置,用于局部扩展 + unitree_go2_basic, # 导入 Go2 基础蓝图 +) +from dimos.visualization.vis_module import vis_module # 导入 viewer 模块工厂 + +_YOLOE_DETECTIONS_TOPIC = "/color_image/yoloe_detections" # YOLOE detections LCM topic +_YOLOE_DETECTIONS_ENTITY = "world/color_image/yoloe_detections" # Rerun entity path + + +def _yoloe_tracking_rerun_blueprint() -> Any: # 定义 yoloe-tracking-test 专用 Rerun 布局 + import rerun as rr # 延迟导入 Rerun,避免非 viewer 路径承担额外导入成本 + import rerun.blueprint as rrb # 延迟导入 Rerun blueprint API + + return rrb.Blueprint( + rrb.Horizontal( + rrb.Spatial2DView( + origin="world/color_image", + contents=["world/color_image/**"], + name="Camera", + ), + rrb.Spatial3DView( + origin="world", + contents=[ + "world/**", + f"-{_YOLOE_DETECTIONS_ENTITY}", # 在 3D 视图中隐藏 YOLOE 2D bbox overlay + ], + name="3D", + background=rrb.Background(kind="SolidColor", color=[0, 0, 0]), + line_grid=rrb.LineGrid3D( + plane=rr.components.Plane3D.XY.with_distance(0.5), + ), + overrides={ + "world/lidar": rrb.EntityBehavior(visible=False), + }, + ), + column_shares=[1, 2], + ), + rrb.TimePanel(state="hidden"), + rrb.SelectionPanel(state="hidden"), + ) + + +_yoloe_tracking_rerun_config = { # yoloe-tracking-test 专用 rerun 配置 + **go2_rerun_config, # 继承 Go2 默认 Camera/3D 布局和限频设置 + "blueprint": _yoloe_tracking_rerun_blueprint, + "visual_override": { + **go2_rerun_config["visual_override"], + _YOLOE_DETECTIONS_ENTITY: yoloe_overlay, # 对 YOLOE detections 做青色 Camera overlay + }, +} + +_yoloe_tracking_vis = vis_module( + viewer_backend=global_config.viewer, + rerun_config=_yoloe_tracking_rerun_config, +) + + +yoloe_tracking_test = ( + autoconnect( + unitree_go2_basic, + _yoloe_tracking_vis, + YoloeTrackingModule.blueprint(), + ) + .global_config( + n_workers=6, + robot_model="unitree_go2", + ) + .transports( + { + ("detections", Detection2DArray): LCMTransport( + _YOLOE_DETECTIONS_TOPIC, + Detection2DArray, + ), + } + ) + .requirements( + _require_yoloe_lrpc_model, # 检查 YOLOE 模型文件是否已经预下载 + ) +) + + +__all__ = [ + "yoloe_tracking_test", +] + + +if __name__ == "__main__": + ModuleCoordinator.build(yoloe_tracking_test).loop() diff --git a/dimos/robot/custom/modules/__init__.py b/dimos/robot/custom/modules/__init__.py new file mode 100644 index 0000000000..bc1a2ce5cc --- /dev/null +++ b/dimos/robot/custom/modules/__init__.py @@ -0,0 +1,13 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/dimos/robot/custom/modules/bbox_selection_module.py b/dimos/robot/custom/modules/bbox_selection_module.py new file mode 100644 index 0000000000..0ec672a01c --- /dev/null +++ b/dimos/robot/custom/modules/bbox_selection_module.py @@ -0,0 +1,410 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +import math # 导入数学工具,用于检查点击像素坐标是否有效 +import threading # 导入线程锁,保护最新 detections 和选择状态 +import time # 导入时间工具,用于空检测消息时间戳 +from typing import Any # 导入通用类型,兼容 LCM 生成消息字段 + +from dimos_lcm.std_msgs import ( + Bool, # type: ignore[import-untyped] # 导入 Bool,用于响应任务中断/完成后的清除请求 +) +from reactivex.disposable import Disposable # 导入 Disposable,用于注册输入流订阅 + +from dimos.core.core import rpc # 导入 rpc 装饰器,让方法可通过 DimOS RPC 调用 +from dimos.core.module import Module, ModuleConfig # 导入模块基类和模块配置基类 +from dimos.core.stream import In, Out # 导入输入输出流类型 +from dimos.msgs.geometry_msgs.PointStamped import ( + PointStamped, # 导入 viewer 点击点类型,用于相机 bbox 点击选择 +) +from dimos.msgs.std_msgs.Header import Header # 导入 DimOS Header 便捷构造,用于空检测头 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray # 导入 2D 检测数组消息类型 +from dimos.utils.logging_config import setup_logger + +_DEFAULT_FRAME_ID = "camera_optical" # 定义空 Detection2DArray 使用的默认坐标系 + +logger = setup_logger() + + +class BBoxSelectionConfig(ModuleConfig): # 定义 bbox 选择模块配置,当前 MVP 暂无额外参数 + click_hit_padding_px: float = 6.0 + click_snap_distance_px: float = 14.0 + + +class BBoxSelectionModule(Module): # 定义 bbox 选择模块,只负责从多 bbox 中转发单个 bbox + config: BBoxSelectionConfig # 声明本模块使用的配置类型 + detections: In[Detection2DArray] # 输入现有检测模块发布的多 bbox Detection2DArray + clicked_point: In[PointStamped] # 输入 dimos-viewer 点击事件,用于把相机像素点击映射到 bbox + stop_movement: In[Bool] # 输入用户接管/地图规划中断信号,用于清空当前 bbox 选择 + clear_selection_request: In[Bool] # 输入 task 完成或被主动停止后的清除请求 + selected_bbox: Out[Detection2DArray] # 输出只包含当前选中 bbox 的 Detection2DArray + + def __init__(self, **kwargs: Any) -> None: # 定义构造函数,接收框架传入的配置参数 + super().__init__(**kwargs) # 调用父类构造函数,让 DimOS 初始化模块和流 + self._lock = threading.RLock() # 创建递归锁,保护最新 detections 和选择状态 + self._latest_detections: Detection2DArray | None = None # 保存最新一帧多 bbox 检测结果 + self._selected_index: int | None = None # 保存通过 index 选择的目标序号 + self._selected_id: str | None = None # 保存通过 id 选择的目标 id + self._last_logged_selection_signature: tuple[str, str | None, int | None] | None = None + + @rpc # 标记 start() 是框架生命周期 RPC + def start(self) -> None: # 定义模块启动逻辑 + super().start() # 启动父类逻辑,包括 RPC 和自动绑定 + self.register_disposable( + Disposable(self.detections.subscribe(self._on_detections)) + ) # 订阅检测流 + self.register_disposable( + Disposable(self.clicked_point.subscribe(self._on_clicked_point)) + ) # 订阅 viewer 点击流 + self.register_disposable( + Disposable(self.stop_movement.subscribe(self._on_stop_movement)) + ) # 订阅用户接管/规划中断信号 + self.register_disposable( + Disposable(self.clear_selection_request.subscribe(self._on_clear_selection_request)) + ) # 订阅 task 完成清除信号 + + @rpc # 标记 list_candidates() 可通过 DimOS RPC 调用 + def list_candidates(self) -> list[dict[str, Any]]: # 返回最新一帧候选 bbox 列表 + with self._lock: # 加锁读取最新 detections,避免和订阅回调并发冲突 + detections = self._latest_detections # 复制引用,缩短锁内逻辑 + + if detections is None: # 如果还没有收到任何检测帧 + return [] # 返回空候选列表 + + return [ + self._candidate_to_dict(index, detection) + for index, detection in enumerate(detections.detections) + ] # 转换每个检测为 RPC 友好的字典 + + @rpc # 标记 select_bbox() 可通过 DimOS RPC 调用 + def select_bbox( + self, index: int | None = None, id: str | None = None + ) -> str: # 保存用户选择的 bbox 条件 + if index is None and id is None: # 如果调用方没有提供 index 或 id + return "select_bbox requires index or id" # 返回可读错误,不改变当前选择 + + if index is not None and id is not None: # 如果调用方同时提供两种选择条件 + return "select_bbox accepts only one of index or id" # 返回可读错误,避免选择语义不明确 + + with self._lock: # 加锁更新选择状态 + self._selected_index = index # 保存 index 选择,未使用时为 None + self._selected_id = str(id) if id is not None else None # 保存 id 选择,统一转为字符串 + latest = self._latest_detections # 取出最新 detections,用于立即刷新 viewer + + if latest is not None: # 如果已经收到过检测帧 + self._publish_selected(latest) # 立即按当前选择发布 selected_bbox,方便 viewer 立刻反馈 + + if index is not None: # 如果用户按 index 选择 + return f"selected bbox index={index}" # 返回确认信息 + + return f"selected bbox id={id}" # 返回按 id 选择的确认信息 + + @rpc # 标记 clear_selection() 可通过 DimOS RPC 调用 + def clear_selection(self) -> str: # 清除当前 bbox 选择 + self._clear_selection(reason="rpc") # 复用统一清除逻辑,保证 RPC 和流事件行为一致 + return "cleared bbox selection" # 返回确认信息 + + def _on_stop_movement(self, msg: Bool) -> None: + if not bool(getattr(msg, "data", False)): + return + self._clear_selection(reason="stop_movement") + + def _on_clear_selection_request(self, msg: Bool) -> None: + if not bool(getattr(msg, "data", False)): + return + self._clear_selection(reason="clear_selection_request") + + def _clear_selection(self, reason: str) -> None: + with self._lock: + had_selection = self._selected_index is not None or self._selected_id is not None + self._selected_index = None + self._selected_id = None + latest = self._latest_detections + + self.selected_bbox.publish( + self._empty_detection_array(latest) + ) # 发布空数组,清除 viewer 中的旧选框 + if had_selection: + logger.info(f"BBoxSelectionModule: cleared selection reason={reason}") + + def _on_detections( + self, detections: Detection2DArray + ) -> None: # 处理检测模块发布的新一帧多 bbox + with self._lock: # 加锁更新最新检测结果 + self._latest_detections = detections # 保存最新一帧 detections + + self._publish_selected(detections) # 每帧根据当前选择转发单个 bbox 或空 bbox + + def _on_clicked_point(self, point: PointStamped) -> None: # 处理 dimos-viewer 发回来的点击点 + logger.debug( + "BBoxSelectionModule: received click " + f"frame_id={point.frame_id!r} x={point.x} y={point.y} z={point.z}" + ) + if not self._is_color_image_click( + point + ): # 只接受 Camera/color_image 视图里的点击,避免误吃 3D 点击 + logger.info("BBoxSelectionModule: ignoring non-color-image click") + return # 非相机点击不改变当前 bbox 选择 + + if not (math.isfinite(point.x) and math.isfinite(point.y)): # 检查点击像素坐标是否可用 + logger.info("BBoxSelectionModule: ignoring click with non-finite coordinates") + return # 无效点击直接忽略,不清除当前选择 + + with self._lock: # 加锁读取最新检测帧 + detections = self._latest_detections # 复制最新 detections,后续命中测试不持锁 + + if detections is None: # 如果 detector 还没有发布过任何候选 bbox + logger.info("BBoxSelectionModule: no detections available yet for click selection") + return # 暂时无法选择,等待下一帧 detections + + logger.debug( + "BBoxSelectionModule: matching click against detections " + f"count={len(detections.detections)} frame_id={getattr(detections.header, 'frame_id', '')!r}" + ) + selected_index = ( + self._find_clicked_detection_index( # 在最新 detections 中查找被点击命中的 bbox + detections, # 传入最新一帧候选 bbox + float(point.x), # 传入 viewer 点击的 x 像素坐标 + float(point.y), # 传入 viewer 点击的 y 像素坐标 + hit_padding_px=float(self.config.click_hit_padding_px), + snap_distance_px=float(self.config.click_snap_distance_px), + ) + ) # 结束命中测试 + + if selected_index is None: + logger.info("BBoxSelectionModule: click did not hit any bbox; clearing selection") + else: + selected = detections.detections[selected_index] + x1, y1, x2, y2 = self._bbox_corners(selected) + selected_id = self._stable_detection_id(selected, selected_index) + logger.info( + "BBoxSelectionModule: click matched bbox " + f"index={selected_index} id={self._detection_id(selected, selected_index)!r} " + f"bbox=({x1:.1f}, {y1:.1f}, {x2:.1f}, {y2:.1f})" + ) + + with self._lock: # 加锁更新选择状态 + if selected_index is None: + self._selected_index = None + self._selected_id = None + else: + self._selected_index = selected_index + self._selected_id = selected_id + + if selected_index is not None: + if selected_id is not None: + logger.debug( + f"BBoxSelectionModule: tracking clicked bbox by stable id id={selected_id!r}" + ) + else: + logger.debug( + "BBoxSelectionModule: clicked bbox has no stable id; falling back to index " + f"index={selected_index}" + ) + + self._publish_selected( + detections + ) # 立即刷新 selected_bbox,让机器人和 viewer 同步看到选择结果 + + def _publish_selected( + self, detections: Detection2DArray + ) -> None: # 根据当前选择发布 selected_bbox + selected = self._find_selected_detection(detections) # 在当前帧里查找被选中的检测 + if selected is None: # 如果当前帧没有选中目标或选择还不存在 + self._log_selection_state(None, None) + self.selected_bbox.publish( + self._empty_detection_array(detections) + ) # 发布空数组,避免下游复用旧 bbox + return # 结束本帧处理 + + selected_index = next( + ( + index + for index, detection in enumerate(detections.detections) + if detection is selected + ), + -1, + ) + self._log_selection_state(selected, selected_index) + msg = Detection2DArray( # 构造只包含一个 detection 的消息 + detections_length=1, # 设置检测数量为 1 + header=detections.header, # 复用当前帧 header,保持时间和坐标系一致 + detections=[selected], # 只放入当前帧匹配到的 selected detection + ) # 结束 Detection2DArray 构造 + self.selected_bbox.publish(msg) # 发布 selected_bbox 给行为模块和 viewer + + def _find_selected_detection( + self, detections: Detection2DArray + ) -> Any | None: # 在当前帧中查找选中的 detection + with self._lock: # 加锁读取选择条件 + selected_index = self._selected_index # 复制 index 选择 + selected_id = self._selected_id # 复制 id 选择 + + if selected_id is not None: # 如果当前使用 id 选择 + for index, detection in enumerate(detections.detections): # 遍历当前帧所有 detection + if ( + self._detection_id(detection, index) == selected_id + ): # 比较真实 id 或 index fallback + return detection # 找到匹配 id 的 detection + return None # 当前帧没有匹配 id 时返回空 + + if selected_index is not None: # 如果当前使用 index 选择 + if 0 <= selected_index < len(detections.detections): # 如果 index 在当前帧范围内 + return detections.detections[selected_index] # 返回当前帧对应序号的 detection + return None # index 越界时返回空 + + return None # 没有选择时返回空 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _empty_detection_array( + source: Detection2DArray | None, + ) -> Detection2DArray: # 构造空 Detection2DArray + header = ( + source.header if source is not None else Header(time.time(), _DEFAULT_FRAME_ID) + ) # 优先复用来源 header + return Detection2DArray(detections_length=0, header=header, detections=[]) # 返回空检测数组 + + @classmethod # 声明候选转换需要复用类级工具函数 + def _candidate_to_dict( + cls, index: int, detection: Any + ) -> dict[str, Any]: # 把 detection 转成 RPC 字典 + x1, y1, x2, y2 = cls._bbox_corners(detection) # 计算 bbox 的左上和右下坐标 + confidence, class_id = cls._best_result(detection) # 读取第一条 hypothesis 的置信度和类别 + return { # 返回用户可读且 JSON 友好的候选结构 + "index": index, # 返回当前帧中的候选序号 + "id": cls._detection_id( + detection, index + ), # 返回 detection.id,没有时回退为 index 字符串 + "bbox": [x1, y1, x2, y2], # 返回 xyxy 格式 bbox + "confidence": confidence, # 返回置信度,缺失时为 0.0 + "class_id": class_id, # 返回类别 id,缺失时为 None + } # 结束候选字典 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _detection_id( + detection: Any, index: int + ) -> str: # 读取 detection id,并在缺失时回退到 index + detection_id = getattr(detection, "id", "") # 读取 detection.id,缺失时使用空字符串 + return str(detection_id) if detection_id else str(index) # 返回真实 id 或 index 字符串 + + @staticmethod + def _stable_detection_id(detection: Any, index: int) -> str | None: + detection_id = str(getattr(detection, "id", "")).strip() + if not detection_id or detection_id == "-1": + return None + return detection_id + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _is_color_image_click( + point: PointStamped, + ) -> bool: # 判断点击是否来自相机图像或其 bbox overlay + frame_parts = point.frame_id.strip("/").split( + "/" + ) # 把 entity_path 拆成路径片段,兼容有无前导斜杠 + return "color_image" in frame_parts # 只让 color_image 视图点击驱动 bbox 选择 + + @classmethod # 声明命中测试需要复用 bbox 坐标转换工具 + def _find_clicked_detection_index( # 查找包含点击像素的 detection index + cls, # 传入类本身,方便调用类方法 + detections: Detection2DArray, # 输入最新一帧候选 bbox + x: float, # 输入点击的 x 像素坐标 + y: float, # 输入点击的 y 像素坐标 + hit_padding_px: float = 0.0, + snap_distance_px: float = 0.0, + ) -> int | None: # 返回命中的 bbox index,未命中时返回 None + hits: list[tuple[float, int]] = [] # 保存命中的 bbox 面积和 index,用于重叠时选更小框 + nearest: tuple[float, int] | None = None + for index, detection in enumerate(detections.detections): # 遍历当前帧所有候选 bbox + x1, y1, x2, y2 = cls._bbox_corners(detection) # 读取当前 bbox 的 xyxy 像素范围 + left, right = sorted((x1, x2)) # 归一化左右边界,防止异常 bbox 坐标反向 + top, bottom = sorted((y1, y2)) # 归一化上下边界,防止异常 bbox 坐标反向 + padded_left = left - hit_padding_px + padded_right = right + hit_padding_px + padded_top = top - hit_padding_px + padded_bottom = bottom + hit_padding_px + + if ( + padded_left <= x <= padded_right and padded_top <= y <= padded_bottom + ): # 如果点击点落在当前 bbox 内 + area = max((right - left) * (bottom - top), 0.0) # 计算 bbox 面积,重叠时优先小框 + hits.append((area, index)) # 记录命中的候选 bbox + + # 如果没有直接命中,记录离 bbox 的最近距离,用于吸附到近邻小目标。 + dx = 0.0 + if x < left: + dx = left - x + elif x > right: + dx = x - right + + dy = 0.0 + if y < top: + dy = top - y + elif y > bottom: + dy = y - bottom + + distance_sq = dx * dx + dy * dy + if nearest is None or distance_sq < nearest[0]: + nearest = (distance_sq, index) + + if not hits: # 如果没有任何 bbox 包含点击点 + if nearest is None: + return None + if snap_distance_px <= 0.0: + return None + if nearest[0] <= snap_distance_px * snap_distance_px: + return nearest[1] + return None # 返回 None,调用方会清空当前选择 + + return min(hits)[1] # 多个 bbox 重叠时选择面积最小的那个 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _bbox_corners( + detection: Any, + ) -> tuple[float, float, float, float]: # 把中心点 bbox 转成 xyxy + bbox = detection.bbox # 读取 Detection2D 的 bbox 字段 + center = bbox.center.position # 读取 bbox 中心点位置 + half_width = float(bbox.size_x) / 2.0 # 计算 bbox 半宽 + half_height = float(bbox.size_y) / 2.0 # 计算 bbox 半高 + x1 = float(center.x) - half_width # 计算左上角 x + y1 = float(center.y) - half_height # 计算左上角 y + x2 = float(center.x) + half_width # 计算右下角 x + y2 = float(center.y) + half_height # 计算右下角 y + return x1, y1, x2, y2 # 返回 xyxy 四元组 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _best_result(detection: Any) -> tuple[float, str | None]: # 读取 detection 的首个分类结果 + results = getattr(detection, "results", []) # 读取 detection.results,缺失时使用空列表 + if not results: # 如果没有任何 hypothesis + return 0.0, None # 返回默认置信度和空类别 + + hypothesis = results[0].hypothesis # 读取第一条 ObjectHypothesis + confidence = float(getattr(hypothesis, "score", 0.0)) # 读取置信度,缺失时为 0.0 + class_id = getattr(hypothesis, "class_id", None) # 读取类别 id,缺失时为 None + return confidence, class_id # 返回置信度和类别 id + + def _log_selection_state(self, selected: Any | None, selected_index: int | None) -> None: + if selected is None: + signature = ("empty", None, None) + if signature == self._last_logged_selection_signature: + return + self._last_logged_selection_signature = signature + logger.info("BBoxSelectionModule: publishing empty selected_bbox") + return + + detection_id = self._detection_id( + selected, max(selected_index if selected_index is not None else 0, 0) + ) + signature = ("selected", detection_id, selected_index) + if signature == self._last_logged_selection_signature: + return + + self._last_logged_selection_signature = signature + logger.info( + "BBoxSelectionModule: publishing selected_bbox " + f"index={selected_index} id={detection_id!r}" + ) + + +__all__ = [ # 声明这个文件希望对外暴露的名字 + "BBoxSelectionConfig", # 暴露选择模块配置 + "BBoxSelectionModule", # 暴露选择模块 +] # 结束 __all__ 列表 diff --git a/dimos/robot/custom/modules/go2_startup_self_check_module.py b/dimos/robot/custom/modules/go2_startup_self_check_module.py new file mode 100644 index 0000000000..b6004b6e4d --- /dev/null +++ b/dimos/robot/custom/modules/go2_startup_self_check_module.py @@ -0,0 +1,147 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +import threading # 导入线程工具,用后台线程执行自检动作 +import time # 导入时间工具,用来计时和控制发布频率 +from typing import Any # 导入 Any,表示这里可以接收任意类型参数 + +from dimos.constants import DEFAULT_THREAD_JOIN_TIMEOUT # 导入线程停止等待的默认超时时间 +from dimos.core.core import rpc # 导入 rpc 装饰器,让生命周期方法可被框架调用 +from dimos.core.module import Module, ModuleConfig # 导入模块基类和模块配置基类 +from dimos.core.stream import Out # 导入输出流类型,用来发布消息 +from dimos.msgs.geometry_msgs.Twist import Twist # 导入速度命令消息类型 +from dimos.msgs.geometry_msgs.Vector3 import Vector3 # 导入三维向量类型 +from dimos.utils.logging_config import setup_logger # 导入日志初始化函数 + +logger = setup_logger() # 创建当前文件使用的日志对象 + + +class Go2StartupSelfCheckConfig(ModuleConfig): # 定义自检模块的配置类 + """Runtime knobs for the one-shot Go2 startup movement check.""" # 说明这些配置控制一次性启动自检 + + linear_speed_mps: float = 0.25 # 自检前进和后退速度,单位是米每秒 + forward_duration_sec: float = 2.0 # 前进持续时间,单位是秒 + backward_duration_sec: float = 2.0 # 后退持续时间,单位是秒 + command_publish_hz: float = 20.0 # 速度命令发布频率,单位是赫兹 + startup_delay_sec: float = 10.0 # 兜底等待时间,避免 Go2 start() 不返回时自检永远不启动 + + +class Go2StartupSelfCheck(Module): # 定义一个 DimOS 模块,用来发布启动自检速度命令 + """Publish one forward/stop/backward/stop sequence after all modules start.""" # 说明模块会在系统启动后执行前进/停止/后退/停止 + + config: Go2StartupSelfCheckConfig # 声明本模块使用上面的配置类型 + cmd_vel: Out[Twist] # 声明输出流,向外发布 Twist 速度命令 + + def __init__(self, **kwargs: Any) -> None: # 定义构造函数,接收框架传入的配置参数 + super().__init__(**kwargs) # 调用父类构造函数,让 DimOS 初始化模块和流 + # 这个事件用于通知后台线程停止,避免退出时还在发运动命令。 + self._stop_event = threading.Event() # 创建停止事件,默认状态是未停止 + # 线程先保持为空,直到 on_system_modules() 才启动,所以 start() 不会触发运动。 + self._thread: threading.Thread | None = None # 保存后台自检线程对象 + # 兜底线程用于在 on_system_modules() 没到达时延迟启动自检。 + self._fallback_thread: threading.Thread | None = None # 保存延迟兜底线程对象 + # 这个锁防止 on_system_modules() 和兜底线程同时启动两个自检线程。 + self._start_lock = threading.Lock() # 创建启动锁,保护自检启动状态 + # 这个标记防止 on_system_modules() 被重复调用时重复执行自检。 + self._started_self_check = False # 记录自检是否已经启动过 + + @rpc # 标记 start() 是可被 DimOS 框架调用的 RPC 生命周期方法 + def start(self) -> None: # 定义模块启动方法 + super().start() # 调用父类启动逻辑,启动 RPC、自动绑定等框架功能 + # start() 只重置状态;真正运动要等所有系统模块都启动完。 + self._stop_event.clear() # 清除停止事件,允许后续自检线程运行 + # 如果 GO2Connection.start() 长时间不返回,on_system_modules() 不会被调用,所以这里启动兜底等待线程。 + self._fallback_thread = threading.Thread( # 创建延迟兜底线程 + target=self._run_delayed_self_check, # 指定线程执行延迟启动函数 + name="Go2StartupSelfCheckFallback", # 给兜底线程起名,方便日志和调试 + daemon=True, # 设置为守护线程,进程退出时不会被它卡住 + ) # 结束兜底线程参数 + self._fallback_thread.start() # 启动兜底线程,但它会先等待 startup_delay_sec + + @rpc # 标记 on_system_modules() 是可被 DimOS 框架调用的 RPC 方法 + def on_system_modules(self, _modules: list[Any]) -> None: # 所有模块启动后,框架会调用这里 + # ModuleCoordinator 会在每个模块 start() 都返回后调用这里。 + self._start_self_check("all modules started") # 所有模块启动完成时,立即启动自检 + + @rpc # 标记 stop() 是可被 DimOS 框架调用的 RPC 生命周期方法 + def stop(self) -> None: # 定义模块停止方法 + # 先通知线程停止,再发零速度,保证退出时机器人不会继续动。 + self._stop_event.set() # 设置停止事件,让后台线程尽快退出循环 + self.cmd_vel.publish(Twist.zero()) # 发布零速度命令,让机器人停下 + + if self._thread is not None and self._thread.is_alive(): # 如果线程存在且仍在运行 + self._thread.join(DEFAULT_THREAD_JOIN_TIMEOUT) # 等待线程结束,但最多等默认超时时间 + + if ( + self._fallback_thread is not None and self._fallback_thread.is_alive() + ): # 如果兜底线程还在等待 + self._fallback_thread.join( + DEFAULT_THREAD_JOIN_TIMEOUT + ) # 等待兜底线程退出,但最多等默认超时时间 + + super().stop() # 调用父类停止逻辑,释放框架资源 + + def _run_delayed_self_check(self) -> None: # 定义延迟兜底启动逻辑 + logger.info("Go2 startup self-check fallback waiting") # 记录兜底等待开始日志 + if self._stop_event.wait( + self.config.startup_delay_sec + ): # 等待配置的秒数,期间如果收到停止就返回 True + return # 如果停止事件已经触发,就不再启动自检 + + self._start_self_check( + "startup delay elapsed" + ) # 等待结束后,如果还没启动过,就兜底启动自检 + + def _start_self_check(self, reason: str) -> None: # 定义统一的自检启动入口 + with self._start_lock: # 加锁,避免两个线程同时进入启动逻辑 + if self._started_self_check: # 如果自检已经启动过 + return # 直接返回,避免重复启动第二个自检线程 + + self._started_self_check = True # 标记自检已经启动 + self._thread = threading.Thread( # 创建后台线程,避免阻塞主流程 + target=self._run_self_check, # 指定线程执行自检函数 + name="Go2StartupSelfCheck", # 给线程起名,方便日志和调试 + daemon=True, # 设置为守护线程,进程退出时不会被它卡住 + ) # 结束线程参数 + logger.info("Go2 startup self-check scheduled", reason=reason) # 记录自检启动原因 + self._thread.start() # 启动后台自检线程 + + def _run_self_check(self) -> None: # 定义后台线程实际执行的自检流程 + logger.info("Go2 startup self-check started") # 记录自检开始日志 + try: # 使用 try/finally,保证异常或停止时也会发零速度 + self._publish_for_duration( # 发布一段时间的前进速度 + speed_mps=self.config.linear_speed_mps, # 前进速度使用配置里的正速度 + duration_sec=self.config.forward_duration_sec, # 前进时间使用配置里的前进时长 + ) # 结束前进发布调用 + self.cmd_vel.publish(Twist.zero()) # 前进结束后发布零速度,先停一下 + + self._publish_for_duration( # 发布一段时间的后退速度 + speed_mps=-self.config.linear_speed_mps, # 后退速度使用配置速度的负数 + duration_sec=self.config.backward_duration_sec, # 后退时间使用配置里的后退时长 + ) # 结束后退发布调用 + self.cmd_vel.publish(Twist.zero()) # 后退结束后发布零速度,让机器人停下 + finally: # 不管上面是否正常结束,最后都执行这里 + # 最后再发一次零速度,确保中断或异常时最终状态也是停止。 + self.cmd_vel.publish(Twist.zero()) # 发布最终零速度命令 + logger.info("Go2 startup self-check finished") # 记录自检结束日志 + + def _publish_for_duration( + self, speed_mps: float, duration_sec: float + ) -> None: # 按固定时间持续发布速度 + period_sec = 1.0 / self.config.command_publish_hz # 根据频率计算每次发布之间的间隔 + end_time = time.monotonic() + duration_sec # 计算这段动作应该结束的时间点 + twist = Twist( # 创建一个 Twist 速度命令 + linear=Vector3(speed_mps, 0.0, 0.0), # 设置 x 方向线速度,y/z 为 0 + angular=Vector3(0.0, 0.0, 0.0), # 设置角速度全为 0,不转向 + ) # 结束 Twist 创建 + + while ( + not self._stop_event.is_set() and time.monotonic() < end_time + ): # 未停止且未超时时继续发布 + self.cmd_vel.publish(twist) # 发布当前速度命令 + time.sleep(period_sec) # 睡眠一个周期,控制发布频率 + + +__all__ = [ # 声明这个文件希望对外暴露的名字 + "Go2StartupSelfCheck", # 暴露自检模块类 + "Go2StartupSelfCheckConfig", # 暴露自检配置类 +] # 结束 __all__ 列表 diff --git a/dimos/robot/custom/modules/spatial_target_lock_module.py b/dimos/robot/custom/modules/spatial_target_lock_module.py new file mode 100644 index 0000000000..3b8f855920 --- /dev/null +++ b/dimos/robot/custom/modules/spatial_target_lock_module.py @@ -0,0 +1,680 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import json +import math +import threading +import time +from typing import Any, Literal + +from dimos_lcm.sensor_msgs import CameraInfo as DimosLcmCameraInfo # type: ignore[import-untyped] +from dimos_lcm.std_msgs import Bool, String # type: ignore[import-untyped] +from reactivex.disposable import Disposable + +from dimos.core.core import rpc +from dimos.core.module import Module, ModuleConfig +from dimos.core.stream import In, Out +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.msgs.sensor_msgs.CameraInfo import CameraInfo +from dimos.msgs.sensor_msgs.PointCloud2 import PointCloud2 +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.perception.detection.type.detection2d.bbox import Detection2DBBox +from dimos.perception.detection.type.detection3d.pointcloud import Detection3DPC +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule +from dimos.utils.logging_config import setup_logger + +logger = setup_logger() + +_DEFAULT_FRAME_ID = "camera_optical" +_WORLD_FRAME_ID = "world" + +LockState = Literal["unselected", "waiting_for_pose", "locked", "using_memory"] +PoseSource = Literal["none", "selected_bbox", "spatial_reacquire", "memory"] + + +def _safe_track_id(raw: Any) -> int: + try: + return int(raw) + except (TypeError, ValueError): + return 0 + + +class SpatialTargetLockConfig(ModuleConfig): + tf_time_tolerance: float = 0.5 + reacquire_distance_m: float = 0.75 + reacquire_max_z_delta_m: float = 0.35 + reacquire_by_class: bool = True + preserve_identity_on_spatial_reacquire: bool = True + + +class SpatialTargetLockModule(Module): + """Keep a selected bbox locked as a world-frame target pose.""" + + config: SpatialTargetLockConfig + + detections: In[Detection2DArray] + selected_bbox: In[Detection2DArray] + lidar: In[PointCloud2] + camera_info: In[CameraInfo] + stop_movement: In[Bool] + clear_selection_request: In[Bool] + + locked_bbox: Out[Detection2DArray] + target_pose: Out[PoseStamped] + lock_status: Out[String] + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._lock = threading.RLock() + self._state: LockState = "unselected" + self._target_id: str | None = None + self._stable_target_id: str | None = None + self._target_class_id: str | None = None + self._last_pose: PoseStamped | None = None + self._pose_source: PoseSource = "none" + self._last_seen_at: float | None = None + self._latest_lidar: PointCloud2 | None = None + self._latest_camera_info: CameraInfo | None = None + self._last_selected_detection: Any | None = None + self._last_header: Header | None = None + self._last_projection_block_reason: str | None = None + + @rpc + def start(self) -> None: + super().start() + logger.info( + "SpatialTargetLockModule: started " + f"tf_time_tolerance={self.config.tf_time_tolerance:.3f}s " + f"reacquire_distance_m={self.config.reacquire_distance_m:.3f} " + f"reacquire_max_z_delta_m={self.config.reacquire_max_z_delta_m:.3f} " + f"reacquire_by_class={self.config.reacquire_by_class} " + "preserve_identity_on_spatial_reacquire=" + f"{self.config.preserve_identity_on_spatial_reacquire}" + ) + self.register_disposable(Disposable(self.selected_bbox.subscribe(self._on_selected_bbox))) + self.register_disposable(Disposable(self.detections.subscribe(self._on_detections))) + self.register_disposable(Disposable(self.lidar.subscribe(self._on_lidar))) + self.register_disposable(Disposable(self.camera_info.subscribe(self._on_camera_info))) + self.register_disposable(Disposable(self.stop_movement.subscribe(self._on_stop_movement))) + self.register_disposable( + Disposable(self.clear_selection_request.subscribe(self._on_clear_selection_request)) + ) + self._publish_status(force=True) + + @rpc + def clear_lock(self) -> str: + self._clear_lock() + return "spatial target lock cleared" + + @rpc + def get_lock_state(self) -> dict[str, Any]: + with self._lock: + return { + "state": self._state, + "target_id": self._target_id, + "target_class_id": self._target_class_id, + "last_pose": self._pose_to_dict(self._last_pose), + "pose_source": self._pose_source, + "last_seen_at": self._last_seen_at, + } + + def _on_lidar(self, lidar: PointCloud2) -> None: + with self._lock: + self._latest_lidar = lidar + should_log = self._state == "waiting_for_pose" + self._try_project_waiting_selection() + if should_log: + logger.debug( + "SpatialTargetLockModule: received lidar while waiting " + f"frame_id={lidar.frame_id!r} ts={float(lidar.ts or 0.0):.3f}" + ) + + def _on_camera_info(self, camera_info: CameraInfo) -> None: + with self._lock: + self._latest_camera_info = camera_info + should_log = self._state == "waiting_for_pose" + self._try_project_waiting_selection() + if should_log: + logger.debug( + "SpatialTargetLockModule: received camera_info while waiting " + f"width={camera_info.width} height={camera_info.height}" + ) + + def _on_selected_bbox(self, selected_bbox: Detection2DArray) -> None: + with self._lock: + self._last_header = selected_bbox.header + + detection = self._extract_single_detection(selected_bbox) + if detection is None: + with self._lock: + has_active_lock = self._target_id is not None or self._last_pose is not None + if not has_active_lock: + logger.debug( + "SpatialTargetLockModule: received empty selected_bbox with no active lock; " + "publishing empty locked_bbox" + ) + self.locked_bbox.publish(self._empty_detection_array(selected_bbox.header)) + self._set_state("unselected", pose_source="none") + else: + logger.debug( + "SpatialTargetLockModule: received empty selected_bbox but keeping spatial " + "memory target active" + ) + return + + with self._lock: + has_frozen_pose = self._last_pose is not None + if has_frozen_pose: + logger.debug( + "SpatialTargetLockModule: ignoring selected_bbox update because target pose " + "is already frozen" + ) + return + + self._set_selected_target(detection, selected_bbox.header) + logger.info( + "SpatialTargetLockModule: selected bbox " + f"{self._detection_summary(detection)} " + f"header_frame={selected_bbox.header.frame_id!r} " + f"header_ts={self._header_timestamp(selected_bbox.header):.3f}" + ) + pose = self._project_detection_pose(detection) + if pose is None: + with self._lock: + last_pose = self._last_pose + self.locked_bbox.publish(self._single_detection_array(detection, selected_bbox.header)) + if last_pose is not None: + logger.info( + "SpatialTargetLockModule: selected bbox projection failed; " + f"using memory pose {self._pose_summary(last_pose)}" + ) + self._publish_memory(selected_bbox.header) + else: + logger.info( + "SpatialTargetLockModule: selected bbox projection failed; " + "waiting for lidar/camera_info/tf" + ) + self._set_state("waiting_for_pose", pose_source="none") + return + + self._accept_detection_pose( + detection, + selected_bbox.header, + pose, + pose_source="selected_bbox", + ) + + def _on_detections(self, detections: Detection2DArray) -> None: + with self._lock: + self._last_header = detections.header + target_id = self._target_id + last_pose = self._last_pose + + if target_id is None and last_pose is None: + self.locked_bbox.publish(self._empty_detection_array(detections.header)) + self._set_state("unselected", pose_source="none") + return + + # First pass: YOLO is only used to provide the initial selected bbox. Once a + # 3D target pose exists, keep publishing that frozen memory pose and do not + # spatially re-match detections. + self._publish_memory(detections.header) + + def _on_stop_movement(self, msg: Bool) -> None: + if bool(getattr(msg, "data", False)): + logger.info("SpatialTargetLockModule: stop_movement requested; clearing lock") + self._clear_lock() + + def _on_clear_selection_request(self, msg: Bool) -> None: + if bool(getattr(msg, "data", False)): + logger.info("SpatialTargetLockModule: clear_selection_request received; clearing lock") + self._clear_lock() + + def _try_project_waiting_selection(self) -> None: + with self._lock: + detection = self._last_selected_detection + header = self._last_header + should_try = detection is not None and self._last_pose is None + + if not should_try: + return + + pose = self._project_detection_pose(detection) + if pose is None: + logger.debug("SpatialTargetLockModule: waiting selection projection retry failed") + return + + logger.info( + "SpatialTargetLockModule: waiting selection projection recovered " + f"pose={self._pose_summary(pose)}" + ) + self._accept_detection_pose(detection, header, pose, pose_source="selected_bbox") + + def _set_selected_target(self, detection: Any, header: Header | None) -> None: + stable_id = self._stable_detection_id(detection) + with self._lock: + self._target_id = stable_id or self._detection_id(detection, fallback_index=0) + self._stable_target_id = stable_id + self._target_class_id = self._detection_class_id(detection) + self._last_selected_detection = detection + self._last_header = header + logger.debug( + "SpatialTargetLockModule: target identity set " + f"target_id={self._target_id!r} stable_id={stable_id!r} " + f"class_id={self._target_class_id!r}" + ) + + def _accept_detection_pose( + self, + detection: Any, + header: Header | None, + pose: PoseStamped, + pose_source: PoseSource, + *, + preserve_identity: bool = False, + ) -> None: + now = time.monotonic() + stable_id = self._stable_detection_id(detection) + with self._lock: + previous_target_id = self._target_id + previous_stable_target_id = self._stable_target_id + previous_target_class_id = self._target_class_id + if not preserve_identity: + self._target_id = stable_id or self._detection_id(detection, fallback_index=0) + self._stable_target_id = stable_id + self._target_class_id = self._detection_class_id(detection) + self._last_pose = pose + self._pose_source = pose_source + self._last_seen_at = now + self._last_selected_detection = detection + self._last_header = header + target_id = self._target_id + target_class_id = self._target_class_id + + self.locked_bbox.publish(self._single_detection_array(detection, header)) + self.target_pose.publish(pose) + if preserve_identity: + logger.info( + "SpatialTargetLockModule: spatial reacquire preserving selected identity " + f"target_id={previous_target_id!r} stable_id={previous_stable_target_id!r} " + f"class_id={previous_target_class_id!r} " + f"matched_detection_id={stable_id!r}" + ) + logger.info( + "SpatialTargetLockModule: accepted target pose " + f"source={pose_source} target_id={target_id!r} " + f"class_id={target_class_id!r} pose={self._pose_summary(pose)}" + ) + self._set_state("locked", pose_source=pose_source) + + def _publish_memory(self, header: Header | None) -> None: + with self._lock: + last_pose = self._last_pose + if last_pose is None: + logger.info( + "SpatialTargetLockModule: no memory pose available; waiting for pose" + ) + self.locked_bbox.publish(self._empty_detection_array(header)) + self._set_state("waiting_for_pose", pose_source="none") + return + + self.locked_bbox.publish(self._empty_detection_array(header)) + self.target_pose.publish(last_pose) + logger.debug( + "SpatialTargetLockModule: publishing memory target pose " + f"{self._pose_summary(last_pose)}" + ) + self._set_state("using_memory", pose_source="memory") + + def _find_spatial_reacquire( + self, + detections: Detection2DArray, + target_class_id: str | None, + last_pose: PoseStamped | None, + ) -> tuple[Any, PoseStamped] | None: + if last_pose is None or not detections.detections: + if last_pose is not None: + logger.debug( + "SpatialTargetLockModule: cannot spatially reacquire; " + "detections are empty" + ) + return None + + candidates = list(detections.detections) + original_count = len(candidates) + if self.config.reacquire_by_class and target_class_id is not None: + class_matches = [ + detection + for detection in candidates + if self._detection_class_id(detection) == target_class_id + ] + if class_matches: + candidates = class_matches + logger.debug( + "SpatialTargetLockModule: spatial reacquire candidates " + f"original_count={original_count} filtered_count={len(candidates)} " + f"target_class_id={target_class_id!r}" + ) + + best: tuple[float, Any, PoseStamped] | None = None + projected_count = 0 + rejected_by_distance = 0 + rejected_by_z = 0 + for detection in candidates: + pose = self._project_detection_pose(detection) + if pose is None: + continue + projected_count += 1 + z_delta = abs(float(pose.position.z) - float(last_pose.position.z)) + if z_delta > self.config.reacquire_max_z_delta_m: + rejected_by_z += 1 + logger.debug( + "SpatialTargetLockModule: spatial reacquire rejected by z delta " + f"z_delta={z_delta:.3f} gate_m={self.config.reacquire_max_z_delta_m:.3f} " + f"{self._detection_summary(detection)} pose={self._pose_summary(pose)} " + f"memory_pose={self._pose_summary(last_pose)}" + ) + continue + distance = self._xyz_distance(pose.position, last_pose.position) + if distance > self.config.reacquire_distance_m: + rejected_by_distance += 1 + continue + if best is None or distance < best[0]: + best = (distance, detection, pose) + + if best is None: + logger.debug( + "SpatialTargetLockModule: spatial reacquire failed " + f"candidates={len(candidates)} projected={projected_count} " + f"rejected_by_distance={rejected_by_distance} " + f"rejected_by_z={rejected_by_z} " + f"distance_gate_m={self.config.reacquire_distance_m:.3f} " + f"z_gate_m={self.config.reacquire_max_z_delta_m:.3f}", + ) + return None + logger.debug( + "SpatialTargetLockModule: spatial reacquire selected " + f"distance_to_memory={best[0]:.3f} " + f"{self._detection_summary(best[1])} pose={self._pose_summary(best[2])}" + ) + return best[1], best[2] + + def _project_detection_pose(self, detection: Any) -> PoseStamped | None: + with self._lock: + lidar = self._latest_lidar + camera_info = self._latest_camera_info + + if lidar is None or camera_info is None: + missing = [] + if lidar is None: + missing.append("lidar") + if camera_info is None: + missing.append("camera_info") + self._set_projection_block_reason("missing_" + "_and_".join(missing)) + return None + + x1, y1, x2, y2 = BBoxSelectionModule._bbox_corners(detection) + class_id = self._detection_class_id(detection) + det2d = Detection2DBBox( + bbox=(x1, y1, x2, y2), + track_id=_safe_track_id(self._stable_detection_id(detection)), + class_id=_safe_track_id(class_id), + confidence=self._detection_confidence(detection), + name=class_id or "", + ts=float(lidar.ts or 0.0), + image=None, # type: ignore[arg-type] + ) + + ts = float(lidar.ts or 0.0) + try: + world_to_optical = self.tf.get( + "camera_optical", + lidar.frame_id, + ts, + time_tolerance=self.config.tf_time_tolerance, + ) + except RuntimeError as exc: + self._set_projection_block_reason( + f"missing_tf camera_optical->{lidar.frame_id!r}: {exc}" + ) + return None + if world_to_optical is None: + self._set_projection_block_reason( + f"missing_tf camera_optical->{lidar.frame_id!r}" + ) + return None + + lcm_camera_info = DimosLcmCameraInfo() + lcm_camera_info.K = camera_info.K + lcm_camera_info.width = camera_info.width + lcm_camera_info.height = camera_info.height + + detection_3d = Detection3DPC.from_2d( + det=det2d, + world_pointcloud=lidar, + camera_info=lcm_camera_info, + world_to_optical_transform=world_to_optical, + filters=[], + ) + if detection_3d is None: + self._set_projection_block_reason( + "no_3d_points_in_bbox " + f"bbox=({x1:.1f},{y1:.1f},{x2:.1f},{y2:.1f}) " + f"lidar_frame={lidar.frame_id!r}" + ) + return None + + self._set_projection_block_reason(None) + center = detection_3d.center + return PoseStamped( + ts=float(lidar.ts or time.time()), + frame_id=lidar.frame_id or _WORLD_FRAME_ID, + position=Vector3(center.x, center.y, center.z), + orientation=(0.0, 0.0, 0.0, 1.0), + ) + + def _clear_lock(self) -> None: + with self._lock: + header = self._last_header + target_id = self._target_id + self._target_id = None + self._stable_target_id = None + self._target_class_id = None + self._last_pose = None + self._pose_source = "none" + self._last_seen_at = None + self._last_selected_detection = None + self._last_projection_block_reason = None + logger.info(f"SpatialTargetLockModule: lock cleared previous_target_id={target_id!r}") + self.locked_bbox.publish(self._empty_detection_array(header)) + self._set_state("unselected", pose_source="none") + + def _set_state(self, state: LockState, pose_source: PoseSource) -> None: + with self._lock: + old_state = self._state + old_source = self._pose_source + self._state = state + self._pose_source = pose_source + target_id = self._target_id + last_pose = self._last_pose + status_changed = old_state != state or old_source != pose_source + is_memory_refresh = state == "using_memory" and old_state in { + "locked", + "using_memory", + } + if status_changed and not is_memory_refresh: + logger.info( + "SpatialTargetLockModule: state changed " + f"{old_state}/{old_source} -> {state}/{pose_source} " + f"target_id={target_id!r} pose={self._pose_summary(last_pose)}" + ) + self._publish_status(force=status_changed) + + def _set_projection_block_reason(self, reason: str | None) -> None: + with self._lock: + previous = self._last_projection_block_reason + if previous == reason: + return + self._last_projection_block_reason = reason + if reason is None: + if previous is not None: + logger.info( + "SpatialTargetLockModule: projection recovered " + f"previous_reason={previous}" + ) + return + logger.info(f"SpatialTargetLockModule: projection blocked reason={reason}") + + def _publish_status(self, force: bool = False) -> None: + with self._lock: + payload = { + "state": self._state, + "pose_source": self._pose_source, + "target_id": self._target_id, + "target_class_id": self._target_class_id, + "last_pose": self._pose_to_dict(self._last_pose), + "last_seen_at": self._last_seen_at, + } + if force or self._state in ("unselected", "using_memory", "waiting_for_pose"): + self.lock_status.publish(String(json.dumps(payload, ensure_ascii=True))) + + @staticmethod + def _extract_single_detection(detections: Detection2DArray | None) -> Any | None: + if detections is None or not detections.detections: + return None + return detections.detections[0] + + @staticmethod + def _single_detection_array(detection: Any, header: Header | None) -> Detection2DArray: + safe_header = header if header is not None else Header(time.time(), _DEFAULT_FRAME_ID) + return Detection2DArray(detections_length=1, header=safe_header, detections=[detection]) + + @staticmethod + def _empty_detection_array(header: Header | None) -> Detection2DArray: + safe_header = header if header is not None else Header(time.time(), _DEFAULT_FRAME_ID) + return Detection2DArray(detections_length=0, header=safe_header, detections=[]) + + @classmethod + def _find_by_stable_id( + cls, detections: Detection2DArray, stable_target_id: str + ) -> Any | None: + for detection in detections.detections: + if cls._stable_detection_id(detection) == stable_target_id: + return detection + return None + + @staticmethod + def _stable_detection_id(detection: Any) -> str | None: + detection_id = str(getattr(detection, "id", "")).strip() + if not detection_id or detection_id == "-1": + return None + return detection_id + + @classmethod + def _detection_id(cls, detection: Any, fallback_index: int) -> str: + return cls._stable_detection_id(detection) or str(fallback_index) + + @staticmethod + def _detection_class_id(detection: Any) -> str | None: + results = getattr(detection, "results", []) + if not results: + return None + hypothesis = results[0].hypothesis + class_id = getattr(hypothesis, "class_id", None) + return str(class_id) if class_id is not None else None + + @staticmethod + def _detection_confidence(detection: Any) -> float: + results = getattr(detection, "results", []) + if not results: + return 1.0 + hypothesis = results[0].hypothesis + return float(getattr(hypothesis, "score", 1.0) or 1.0) + + @classmethod + def _detection_summary(cls, detection: Any) -> str: + x1, y1, x2, y2 = BBoxSelectionModule._bbox_corners(detection) + return ( + f"id={cls._stable_detection_id(detection)!r} " + f"class_id={cls._detection_class_id(detection)!r} " + f"score={cls._detection_confidence(detection):.3f} " + f"bbox=({x1:.1f},{y1:.1f},{x2:.1f},{y2:.1f})" + ) + + @staticmethod + def _xy_distance(a: Vector3, b: Vector3) -> float: + dx = float(a.x) - float(b.x) + dy = float(a.y) - float(b.y) + return math.sqrt(dx * dx + dy * dy) + + @staticmethod + def _xyz_distance(a: Vector3, b: Vector3) -> float: + dx = float(a.x) - float(b.x) + dy = float(a.y) - float(b.y) + dz = float(a.z) - float(b.z) + return math.sqrt(dx * dx + dy * dy + dz * dz) + + @staticmethod + def _pose_to_dict(pose: PoseStamped | None) -> dict[str, Any] | None: + if pose is None: + return None + return { + "frame_id": pose.frame_id, + "x": pose.position.x, + "y": pose.position.y, + "z": pose.position.z, + "ts": pose.ts, + } + + @classmethod + def _pose_summary(cls, pose: PoseStamped | None) -> str: + payload = cls._pose_to_dict(pose) + if payload is None: + return "None" + return ( + f"frame={payload['frame_id']!r} " + f"x={float(payload['x']):.3f} " + f"y={float(payload['y']):.3f} " + f"z={float(payload['z']):.3f} " + f"ts={float(payload['ts'] or 0.0):.3f}" + ) + + @staticmethod + def _header_timestamp(header: Header | None) -> float: + if header is None: + return 0.0 + try: + return float(header.timestamp) + except (AttributeError, TypeError, ValueError): + pass + try: + return float(header.ts) + except (AttributeError, TypeError, ValueError): + pass + stamp = getattr(header, "stamp", None) + if stamp is None: + return 0.0 + sec = float(getattr(stamp, "sec", 0.0) or 0.0) + nsec = float(getattr(stamp, "nsec", 0.0) or 0.0) + return sec + nsec / 1_000_000_000.0 + + +__all__ = [ + "SpatialTargetLockConfig", + "SpatialTargetLockModule", +] diff --git a/dimos/robot/custom/modules/target_lock_module.py b/dimos/robot/custom/modules/target_lock_module.py new file mode 100644 index 0000000000..4e16affd1d --- /dev/null +++ b/dimos/robot/custom/modules/target_lock_module.py @@ -0,0 +1,283 @@ +from __future__ import annotations + +import json +import threading +import time +from typing import Any, Literal + +from dimos_lcm.std_msgs import Bool, String +from reactivex.disposable import Disposable + +from dimos.core.core import rpc +from dimos.core.module import Module, ModuleConfig +from dimos.core.stream import In, Out +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray + +_DEFAULT_FRAME_ID = "camera_optical" + +LockState = Literal["unselected", "locked", "searching", "lost"] + + +class TargetLockConfig(ModuleConfig): + search_timeout_sec: float = 3.0 + reacquire_by_class: bool = True + + +class TargetLockModule(Module): + """Maintain a stable single-target lock from selected_bbox + detections.""" + + config: TargetLockConfig + + detections: In[Detection2DArray] + selected_bbox: In[Detection2DArray] + stop_movement: In[Bool] + clear_selection_request: In[Bool] + + locked_bbox: Out[Detection2DArray] + lock_status: Out[String] + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._lock = threading.RLock() + self._state: LockState = "unselected" + self._target_id: str | None = None + self._target_class_id: str | None = None + self._last_center: tuple[float, float] | None = None + self._last_seen_at: float | None = None + self._last_header: Header | None = None + + @rpc + def start(self) -> None: + super().start() + self.register_disposable(Disposable(self.selected_bbox.subscribe(self._on_selected_bbox))) + self.register_disposable(Disposable(self.detections.subscribe(self._on_detections))) + self.register_disposable(Disposable(self.stop_movement.subscribe(self._on_stop_movement))) + self.register_disposable( + Disposable(self.clear_selection_request.subscribe(self._on_clear_selection_request)) + ) + self._publish_status(force=True) + + @rpc + def clear_lock(self) -> str: + self._clear_lock() + return "target lock cleared" + + @rpc + def get_lock_state(self) -> dict[str, Any]: + with self._lock: + return { + "state": self._state, + "target_id": self._target_id, + "target_class_id": self._target_class_id, + "last_seen_at": self._last_seen_at, + } + + def _on_selected_bbox(self, selected_bbox: Detection2DArray) -> None: + with self._lock: + self._last_header = selected_bbox.header + + detection = self._extract_single_detection(selected_bbox) + if detection is None: + with self._lock: + self._reset_lock_state() + self.locked_bbox.publish(self._empty_detection_array(selected_bbox.header)) + self._set_state("unselected") + return + + now = time.monotonic() + center = self._bbox_center(detection) + class_id = self._detection_class_id(detection) + + with self._lock: + self._target_id = self._detection_id(detection, fallback_index=0) + self._target_class_id = class_id + self._last_center = center + self._last_seen_at = now + + self.locked_bbox.publish(self._single_detection_array(detection, selected_bbox.header)) + self._set_state("locked") + + def _on_stop_movement(self, msg: Bool) -> None: + if bool(getattr(msg, "data", False)): + self._clear_lock() + + def _on_clear_selection_request(self, msg: Bool) -> None: + if bool(getattr(msg, "data", False)): + self._clear_lock() + + def _on_detections(self, detections: Detection2DArray) -> None: + with self._lock: + self._last_header = detections.header + target_id = self._target_id + target_class_id = self._target_class_id + last_center = self._last_center + last_seen_at = self._last_seen_at + + if target_id is None: + self.locked_bbox.publish(self._empty_detection_array(detections.header)) + self._set_state("unselected") + return + + matched = self._find_by_id(detections, target_id) + if matched is not None: + with self._lock: + if self._target_id != target_id: + return + self._update_lock_from_detection(matched, time.monotonic()) + self.locked_bbox.publish(self._single_detection_array(matched, detections.header)) + self._set_state("locked") + return + + now = time.monotonic() + if last_seen_at is None: + with self._lock: + if self._target_id != target_id: + return + self.locked_bbox.publish(self._empty_detection_array(detections.header)) + self._set_state("searching") + return + + if now - last_seen_at > max(self.config.search_timeout_sec, 0.0): + with self._lock: + if self._target_id != target_id: + return + self.locked_bbox.publish(self._empty_detection_array(detections.header)) + self._set_state("lost") + return + + reacquired = self._reacquire_candidate(detections, target_class_id, last_center) + if reacquired is None: + with self._lock: + if self._target_id != target_id: + return + self.locked_bbox.publish(self._empty_detection_array(detections.header)) + self._set_state("searching") + return + + with self._lock: + if self._target_id != target_id: + return + self._update_lock_from_detection(reacquired, now) + self.locked_bbox.publish(self._single_detection_array(reacquired, detections.header)) + self._set_state("locked") + + def _set_state(self, state: LockState) -> None: + with self._lock: + old_state = self._state + self._state = state + self._publish_status(force=old_state != state) + + def _publish_status(self, force: bool = False) -> None: + with self._lock: + state = self._state + payload = { + "state": state, + "target_id": self._target_id, + "target_class_id": self._target_class_id, + } + if force or state in ("unselected", "lost"): + self.lock_status.publish(String(json.dumps(payload, ensure_ascii=True))) + + def _update_lock_from_detection(self, detection: Any, now: float) -> None: + with self._lock: + self._target_id = self._detection_id(detection, fallback_index=0) + self._target_class_id = self._detection_class_id(detection) + self._last_center = self._bbox_center(detection) + self._last_seen_at = now + + def _reset_lock_state(self) -> None: + self._target_id = None + self._target_class_id = None + self._last_center = None + self._last_seen_at = None + + def _clear_lock(self) -> None: + with self._lock: + self._reset_lock_state() + header = self._last_header + self.locked_bbox.publish(self._empty_detection_array(header)) + self._set_state("unselected") + + def _reacquire_candidate( + self, + detections: Detection2DArray, + target_class_id: str | None, + last_center: tuple[float, float] | None, + ) -> Any | None: + candidates: list[Any] = list(detections.detections) + if not candidates: + return None + + if self.config.reacquire_by_class and target_class_id is not None: + class_filtered = [ + d for d in candidates if self._detection_class_id(d) == target_class_id + ] + if class_filtered: + candidates = class_filtered + + if not candidates: + return None + + if last_center is None: + return candidates[0] + + return min( + candidates, + key=lambda detection: self._center_distance_sq( + last_center, self._bbox_center(detection) + ), + ) + + def _find_by_id(self, detections: Detection2DArray, target_id: str) -> Any | None: + for index, detection in enumerate(detections.detections): + if self._detection_id(detection, fallback_index=index) == target_id: + return detection + return None + + @staticmethod + def _extract_single_detection(detections: Detection2DArray | None) -> Any | None: + if detections is None or not detections.detections: + return None + return detections.detections[0] + + @staticmethod + def _single_detection_array(detection: Any, header: Header | None) -> Detection2DArray: + safe_header = header if header is not None else Header(time.time(), _DEFAULT_FRAME_ID) + return Detection2DArray(detections_length=1, header=safe_header, detections=[detection]) + + @staticmethod + def _empty_detection_array(header: Header | None) -> Detection2DArray: + safe_header = header if header is not None else Header(time.time(), _DEFAULT_FRAME_ID) + return Detection2DArray(detections_length=0, header=safe_header, detections=[]) + + @staticmethod + def _detection_id(detection: Any, fallback_index: int) -> str: + detection_id = getattr(detection, "id", "") + return str(detection_id) if detection_id else str(fallback_index) + + @staticmethod + def _detection_class_id(detection: Any) -> str | None: + results = getattr(detection, "results", []) + if not results: + return None + hypothesis = results[0].hypothesis + class_id = getattr(hypothesis, "class_id", None) + return str(class_id) if class_id is not None else None + + @staticmethod + def _bbox_center(detection: Any) -> tuple[float, float]: + center = detection.bbox.center.position + return float(center.x), float(center.y) + + @staticmethod + def _center_distance_sq(a: tuple[float, float], b: tuple[float, float]) -> float: + dx = a[0] - b[0] + dy = a[1] - b[1] + return dx * dx + dy * dy + + +__all__ = [ + "TargetLockConfig", + "TargetLockModule", +] diff --git a/dimos/robot/custom/modules/yoloe_tracking_module.py b/dimos/robot/custom/modules/yoloe_tracking_module.py new file mode 100644 index 0000000000..24de93afdf --- /dev/null +++ b/dimos/robot/custom/modules/yoloe_tracking_module.py @@ -0,0 +1,107 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +from pathlib import Path +from typing import Any + +from dimos.core.core import rpc # 导入 rpc 装饰器,让方法可通过 DimOS RPC 调用 +from dimos.core.module import Module, ModuleConfig # 导入模块基类和模块配置基类 +from dimos.core.stream import In, Out # 导入输入输出流类型 +from dimos.msgs.sensor_msgs.Image import Image, sharpness_barrier +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.perception.detection.detectors.yoloe import Yoloe2DDetector, YoloePromptMode +from dimos.utils.data import get_data_dir +from dimos.utils.logging_config import setup_logger +from dimos.utils.reactive import backpressure + +logger = setup_logger() + +_YOLOE_MODEL_DIR_NAME = "models_yoloe" +_YOLOE_LRPC_MODEL_NAME = "yoloe-11s-seg-pf.pt" + + +def _local_yoloe_model_path( + model_path: str = _YOLOE_MODEL_DIR_NAME, + model_name: str | None = None, +) -> Path: + return get_data_dir(model_path) / (model_name or _YOLOE_LRPC_MODEL_NAME) + + +def _format_missing_yoloe_model_error(model_path: Path) -> str: + return ( + f"Missing local YOLOE model: {model_path}. " + "This blueprint runs offline and will not pull model data at startup. " + "Prepare the model in an online environment first: " + "git lfs pull --include data/.lfs/models_yoloe.tar.gz && " + "uv run python -c 'from dimos.utils.data import get_data; print(get_data(\"models_yoloe\"))' && " + f"ls -lh {model_path}" + ) + + +def _require_yoloe_lrpc_model() -> str | None: + """Blueprint requirement check: verify the YOLOE model file is pre-downloaded.""" + model_path = _local_yoloe_model_path() + if model_path.exists(): + return None + return _format_missing_yoloe_model_error(model_path) + + +class YoloeTrackingConfig(ModuleConfig): + max_freq: float = 10.0 + model_path: str = _YOLOE_MODEL_DIR_NAME + model_name: str | None = None + device: str | None = None + max_area_ratio: float | None = 0.3 + + +class YoloeTrackingModule(Module): + """Run offline YOLOE tracking on color_image and publish Detection2DArray.""" + + config: YoloeTrackingConfig + + color_image: In[Image] + detections: Out[Detection2DArray] + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._detector: Yoloe2DDetector | None = None + + @rpc + def start(self) -> None: + super().start() + + model_path = _local_yoloe_model_path(self.config.model_path, self.config.model_name) + if not model_path.exists(): + raise RuntimeError(_format_missing_yoloe_model_error(model_path)) + + self._detector = Yoloe2DDetector( + model_path=self.config.model_path, + model_name=self.config.model_name or _YOLOE_LRPC_MODEL_NAME, + device=self.config.device, + prompt_mode=YoloePromptMode.LRPC, + max_area_ratio=self.config.max_area_ratio, + ) + + stream = backpressure( + self.color_image.pure_observable().pipe(sharpness_barrier(self.config.max_freq)) + ) + self.register_disposable(stream.subscribe(self._process_image)) + + @rpc + def stop(self) -> None: + if self._detector is not None: + self._detector.stop() + self._detector = None + super().stop() + + def _process_image(self, image: Image) -> None: + if self._detector is None: + return + detections = self._detector.process_image(image) + self.detections.publish(detections.to_ros_detection2d_array()) + + +__all__ = [ + "YoloeTrackingConfig", + "YoloeTrackingModule", + "_require_yoloe_lrpc_model", +] diff --git a/dimos/robot/custom/tasks/__init__.py b/dimos/robot/custom/tasks/__init__.py new file mode 100644 index 0000000000..bc1a2ce5cc --- /dev/null +++ b/dimos/robot/custom/tasks/__init__.py @@ -0,0 +1,13 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/dimos/robot/custom/tasks/bbox_distance_behavior_module.py b/dimos/robot/custom/tasks/bbox_distance_behavior_module.py new file mode 100644 index 0000000000..45fb9f6324 --- /dev/null +++ b/dimos/robot/custom/tasks/bbox_distance_behavior_module.py @@ -0,0 +1,448 @@ +from __future__ import annotations # 允许类型注解延迟解析,减少循环导入风险 + +import json # 导入 JSON 工具,用于发布结构化行为状态 +import math # 导入数学工具,用于检查有限数值 +import threading # 导入线程工具,用后台循环发布速度命令 +from typing import Any, Literal # 导入通用类型和状态字面量类型 + +from dimos_lcm.sensor_msgs import ( + CameraInfo as DimosLcmCameraInfo, # type: ignore[import-untyped] # 导入 LCM CameraInfo 类型,用于 Detection3DPC.from_2d() +) +from dimos_lcm.std_msgs import Bool, String # type: ignore[import-untyped] # 导入 LCM 消息类型 +from reactivex.disposable import Disposable # 导入 Disposable,用于注册输入流订阅 + +from dimos.constants import DEFAULT_THREAD_JOIN_TIMEOUT # 导入线程停止等待的默认超时时间 +from dimos.core.core import rpc # 导入 rpc 装饰器,让方法可通过 DimOS RPC 调用 +from dimos.core.module import Module, ModuleConfig # 导入模块基类和模块配置基类 +from dimos.core.stream import In, Out # 导入输入输出流类型 +from dimos.msgs.geometry_msgs.Twist import Twist # 导入速度命令消息类型 +from dimos.msgs.geometry_msgs.Vector3 import Vector3 # 导入三维向量类型,用于构造 Twist +from dimos.msgs.sensor_msgs.CameraInfo import CameraInfo # 导入相机内参消息类型 +from dimos.msgs.sensor_msgs.PointCloud2 import PointCloud2 # 导入 lidar 点云消息类型 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray # 导入 2D 检测数组消息类型 +from dimos.navigation.replanning_a_star.module_spec import ( + ReplanningAStarPlannerSpec, # 导入规划器 Spec,用于 bbox 任务启动时取消 A* 导航 +) +from dimos.perception.detection.type.detection2d.bbox import ( + Detection2DBBox, # 导入 2D bbox 类型,用于构造 Detection3DPC 输入 +) +from dimos.perception.detection.type.detection3d.pointcloud import ( + Detection3DPC, # 导入 3D 点云检测,用于精确距离估计 +) +from dimos.robot.custom.modules.bbox_selection_module import ( + BBoxSelectionModule, # 导入 bbox 选择模块,复用 _bbox_corners 工具 +) +from dimos.utils.logging_config import setup_logger # 导入日志初始化函数 + +logger = setup_logger() # 创建当前文件使用的日志对象 + +BehaviorState = Literal["idle", "approaching", "done"] # 定义行为状态机的合法状态 + +_LINEAR_GAIN = 0.8 # 定义距离误差到线速度的简单比例增益 +_ANGULAR_GAIN = 1.0 # 定义横向像素误差到角速度的简单比例增益 +_DISTANCE_TOLERANCE_M = 0.05 # 定义靠近完成时使用的距离容差 + + +def _safe_track_id(raw: Any) -> int: # 安全地把 detection.id 转换为 int,无法转换时返回 0 + try: + return int(raw) + except (TypeError, ValueError): + return 0 + + +class BBoxDistanceBehaviorConfig(ModuleConfig): # 定义 bbox 距离行为模块配置 + command_hz: float = 20.0 # 速度命令发布频率,单位 Hz + approach_distance: float = 0.2 # 靠近阶段目标距离,单位米 + max_linear_speed: float = 0.45 # 最大线速度,单位 m/s + max_angular_speed: float = 0.8 # 最大角速度,单位 rad/s + tf_time_tolerance: float = 0.5 # TF 查询时间容差,单位秒 + + +class BBoxDistanceBehaviorModule( + Module +): # 定义距离行为模块,只消费 selected bbox、lidar 和 camera_info + config: BBoxDistanceBehaviorConfig # 声明本模块使用的配置类型 + selected_bbox: In[Detection2DArray] # 输入选择模块发布的单 bbox Detection2DArray + lidar: In[PointCloud2] # 输入 Go2 lidar 点云 + camera_info: In[CameraInfo] # 输入 Go2 相机内参 + teleop_active: In[Bool] # 输入遥控信号;有值时中断当前任务,交回键盘控制 + cmd_vel: Out[Twist] # 输出 Go2 速度命令(有 MovementManager 时连接到 nav_cmd_vel) + clear_selection_request: Out[Bool] # 输出 task 完成/停止后的清除请求,防止旧 bbox 重启 task + behavior_status: Out[String] # 输出行为状态,便于日志和调试 + _planner: ( + ReplanningAStarPlannerSpec | None + ) # 可选:A* 规划器,存在时用于 bbox 任务启动时取消当前导航 + + def __init__(self, **kwargs: Any) -> None: # 定义构造函数,接收框架传入的配置参数 + super().__init__(**kwargs) # 调用父类构造函数,让 DimOS 初始化模块和流 + self._lock = threading.RLock() # 创建递归锁,保护传感器缓存和状态机 + self._stop_event = threading.Event() # 创建后台命令循环停止事件 + self._thread: threading.Thread | None = None # 保存后台命令循环线程 + self._state: BehaviorState = "idle" # 初始化行为状态为 idle + self._active_target_id: str | None = None # 保存当前自动靠近的目标 id + self._latest_selected_bbox: Detection2DArray | None = None # 保存最新 selected bbox + self._latest_lidar: PointCloud2 | None = None # 保存最新 lidar 点云 + self._latest_camera_info: CameraInfo | None = None # 保存最新 camera_info + self._active_approach_distance = self.config.approach_distance # 保存本次行为使用的靠近距离 + self._last_block_reason: str | None = None # 记录上一次阻塞原因,避免重复刷日志 + self._planner: ReplanningAStarPlannerSpec | None = ( + None # 可选规划器引用,由 blueprint 在 build 时注入 + ) + + @rpc # 标记 start() 是框架生命周期 RPC + def start(self) -> None: # 定义模块启动逻辑 + super().start() # 启动父类逻辑,包括 RPC 和自动绑定 + self.register_disposable( + Disposable(self.selected_bbox.subscribe(self._on_selected_bbox)) + ) # 订阅 selected bbox + self.register_disposable( + Disposable(self.lidar.subscribe(self._on_lidar)) + ) # 订阅 lidar 点云 + self.register_disposable( + Disposable(self.camera_info.subscribe(self._on_camera_info)) + ) # 订阅 camera_info + self.register_disposable( + Disposable(self.teleop_active.subscribe(self._on_teleop_active)) + ) # 订阅遥控打断信号 + self._stop_event.clear() # 清除停止事件,允许后台线程运行 + self._thread = threading.Thread( # 创建后台命令发布线程 + target=self._command_loop, # 指定线程执行固定频率控制循环 + name="BBoxDistanceBehaviorModule", # 给线程命名,方便日志和调试 + daemon=True, # 设置守护线程,进程退出时不会被它卡住 + ) # 结束线程参数 + self._thread.start() # 启动后台命令发布线程 + logger.info( + "BBoxDistanceBehaviorModule: task loop started " + f"command_hz={self.config.command_hz} approach_distance={self.config.approach_distance}" + ) + self._publish_status("idle") # 发布初始状态 + + @rpc # 标记 stop() 是框架生命周期 RPC + def stop(self) -> None: # 定义模块停止逻辑 + logger.info("BBoxDistanceBehaviorModule: stopping task loop") + self._stop_event.set() # 通知后台线程退出 + self.cmd_vel.publish(Twist.zero()) # 停止时立即发布零速度 + if self._thread is not None and self._thread.is_alive(): # 如果后台线程存在且仍在运行 + self._thread.join(DEFAULT_THREAD_JOIN_TIMEOUT) # 等待后台线程退出,但最多等默认超时 + logger.info("BBoxDistanceBehaviorModule: task loop stopped") + super().stop() # 调用父类停止逻辑,释放订阅和 transport + + @rpc # 标记 start_bbox_distance_behavior() 可通过 DimOS RPC 调用 + def start_bbox_distance_behavior( # 定义行为启动 RPC + self, # 传入模块实例 + approach_distance: float | None = None, # 可选覆盖靠近阶段目标距离 + ) -> str: # 返回可读启动结果 + with self._lock: # 加锁重置状态机参数 + self._active_approach_distance = ( + self.config.approach_distance + if approach_distance is None + else float(approach_distance) + ) # 设置本次靠近距离 + self._state = "approaching" # 进入靠近阶段 + + logger.info( + "BBoxDistanceBehaviorModule: task started by RPC " + f"approach_distance={self._active_approach_distance}" + ) + + self._publish_status("approaching") # 发布状态变化 + return "bbox distance behavior started" # 返回启动确认 + + @rpc # 标记 stop_bbox_distance_behavior() 可通过 DimOS RPC 调用 + def stop_bbox_distance_behavior(self) -> str: # 定义行为停止 RPC + with self._lock: # 加锁更新状态机 + previous_target_id = self._active_target_id + self._state = "idle" # 回到 idle 状态 + self._active_target_id = None # 清空当前目标 id + + self.cmd_vel.publish(Twist.zero()) # 行为停止时发布零速度 + self._request_selection_clear(reason="rpc_stop") # 清掉旧 bbox,避免下一帧自动重启 task + logger.info( + f"BBoxDistanceBehaviorModule: task stopped by RPC target_id={previous_target_id!r}" + ) + self._publish_status("idle") # 发布 idle 状态 + return "bbox distance behavior stopped" # 返回停止确认 + + def _on_selected_bbox(self, selected_bbox: Detection2DArray) -> None: # 处理新的 selected bbox + with self._lock: # 加锁更新缓存 + self._latest_selected_bbox = selected_bbox # 保存最新 selected bbox + detection = self._extract_single_detection(selected_bbox) # 读取当前选中目标 + current_target_id = self._active_target_id # 复制当前目标 id + current_state = self._state # 复制当前状态 + + if detection is None: # 如果当前没有选中目标 + with self._lock: # 加锁重置状态 + previous_target_id = self._active_target_id + self._active_target_id = None # 清空当前目标 id + self._state = "idle" # 回到 idle + + if previous_target_id is not None or current_state != "idle": + self.cmd_vel.publish(Twist.zero()) # 从活动任务退出时立即停止一次 + logger.info( + "BBoxDistanceBehaviorModule: task ended because selection cleared " + f"previous_target_id={previous_target_id!r}" + ) + self._publish_status("idle") # 发布 idle 状态 + return # 结束处理 + + target_id = self._detection_id(detection) # 读取当前目标 id + with self._lock: # 加锁更新当前目标 + self._active_target_id = target_id # 保存当前目标 id + + if ( + current_state == "done" and current_target_id == target_id + ): # 如果已经完成且仍是同一个目标 + self._publish_status("done") # 保持完成状态 + return # 不重新启动任务 + + if current_state == "approaching" and current_target_id == target_id: + return # 同一目标持续靠近中,不重复触发 task 启动日志和状态发布 + + with self._lock: # 加锁切换到靠近状态 + self._state = "approaching" # 新目标或尚未完成时自动进入靠近阶段 + + if self._planner is not None: # 如果存在 A* 规划器,取消当前导航,避免与 bbox 追踪指令竞争 + self._planner.cancel_goal() + + logger.info( + "BBoxDistanceBehaviorModule: task started from selected bbox " + f"target_id={target_id!r} state_before={current_state}" + ) + + self._publish_status("approaching") # 发布自动启动状态 + + def _on_lidar(self, lidar: PointCloud2) -> None: # 处理新的 lidar 点云 + with self._lock: # 加锁更新缓存 + self._latest_lidar = lidar # 保存最新 lidar 点云 + + def _on_camera_info(self, camera_info: CameraInfo) -> None: # 处理新的 camera_info + with self._lock: # 加锁更新缓存 + self._latest_camera_info = camera_info # 保存最新 camera_info + + def _on_teleop_active( + self, _msg: Any + ) -> None: # 收到 MovementManager 的 stop_movement 信号时打断任务 + with self._lock: # 加锁读取并重置状态 + prev_state = self._state # 记录打断前状态 + prev_target = self._active_target_id # 记录打断前目标 id + if prev_state == "idle": # 已经空闲,无需操作 + return + self._state = "idle" # 回到 idle + self._active_target_id = None # 清空目标 + self.cmd_vel.publish(Twist.zero()) # 立即停止 + self._request_selection_clear( + reason="interrupt" + ) # 清掉旧 bbox,避免中断后又被 locked bbox 拉起 + logger.info( + "BBoxDistanceBehaviorModule: task interrupted by teleop " + f"target_id={prev_target!r} state_before={prev_state}" + ) + self._publish_status("idle") # 发布 idle 状态,让外部知道任务已中断 + + def _command_loop(self) -> None: # 固定频率控制循环 + period_sec = 1.0 / max(self.config.command_hz, 1.0) # 根据配置计算循环周期,并防止除零 + while not self._stop_event.wait(period_sec): # 按周期运行,收到停止事件后退出 + twist = self._compute_next_twist() # 根据当前状态和传感器缓存计算下一条速度命令 + if twist is not None: # idle 状态会返回 None,表示无需重复发布 + self.cmd_vel.publish(twist) # 发布速度命令 + + self.cmd_vel.publish(Twist.zero()) # 线程退出前再发布一次零速度 + + def _compute_next_twist(self) -> Twist | None: # 计算当前周期应该发布的速度 + with self._lock: # 加锁读取状态和传感器缓存 + state = self._state # 复制当前状态 + selected_bbox = self._latest_selected_bbox # 复制 latest selected bbox + lidar = self._latest_lidar # 复制 latest lidar + camera_info = self._latest_camera_info # 复制 latest camera_info + approach_distance = self._active_approach_distance # 复制本次靠近距离 + + if state == "idle": # 如果行为未启动 + self._set_block_reason(None) + return None # 不重复发布速度命令 + + if state == "done": # 如果行为已经完成 + self._set_block_reason(None) + return Twist.zero() # 持续发布零速度,确保完成后保持停止 + + detection = self._extract_single_detection( + selected_bbox + ) # 从 selected_bbox 中取出单个 detection + if ( + detection is None or lidar is None or camera_info is None + ): # 如果 bbox、lidar 或 camera_info 任一缺失 + missing: list[str] = [] + if detection is None: + missing.append("selected_bbox") + if lidar is None: + missing.append("lidar") + if camera_info is None: + missing.append("camera_info") + self._set_block_reason(f"missing_inputs:{','.join(missing)}") + return Twist.zero() # 发布零速度并等待数据补齐 + + distance = self._estimate_3d_distance( + detection, lidar, camera_info + ) # 用 TF + pointcloud 估计 3D 距离 + if distance is None: # 如果 TF 或点云无法给出有效距离 + self._set_block_reason(f"no_3d_detection:{self._detection_id(detection)!r}") + return Twist.zero() # 发布零速度并等待下一帧 + + self._set_block_reason(None) + + bbox_center_x = float(detection.bbox.center.position.x) # 读取 bbox 中心 x 像素坐标 + target_distance = approach_distance # 始终靠近到目标距离 + twist = self._make_twist( + distance, target_distance, bbox_center_x, camera_info + ) # 生成线速度和角速度命令 + + if ( + state == "approaching" and distance <= approach_distance + _DISTANCE_TOLERANCE_M + ): # 如果已经到达靠近目标距离 + completed_target_id = self._detection_id(detection) + with self._lock: # 加锁切换完成状态 + self._state = "done" # 标记行为完成 + self._active_target_id = ( + completed_target_id # 记录完成的目标,避免同一目标被重复自动重启 + ) + logger.info( + "BBoxDistanceBehaviorModule: task completed " + f"target_id={completed_target_id!r} distance={distance:.3f}" + ) + self._publish_status("done", distance=distance) # 发布完成状态 + self._request_selection_clear( + reason="completed" + ) # 完成后回到等待用户输入,不继续消费旧 bbox + return Twist.zero() # 完成时发布零速度 + + return twist # 返回当前控制周期的速度命令 + + def _request_selection_clear(self, reason: str) -> None: + self.clear_selection_request.publish(Bool(data=True)) + logger.info(f"BBoxDistanceBehaviorModule: requested selection clear reason={reason}") + + def _make_twist( # 定义根据距离和 bbox 横向位置生成 Twist 的函数 + self, # 传入模块实例 + distance: float, # 当前估计距离 + target_distance: float, # 当前阶段目标距离 + bbox_center_x: float, # bbox 中心 x 像素坐标 + camera_info: CameraInfo, # 相机内参 + ) -> Twist: # 返回速度命令 + distance_error = distance - target_distance # 计算目标距离误差,正数表示目标太远 + linear_x = self._clamp( + distance_error * _LINEAR_GAIN, + -self.config.max_linear_speed, + self.config.max_linear_speed, + ) # 计算并限幅线速度 + fx = float(camera_info.K[0]) if len(camera_info.K) >= 9 else 0.0 # 读取相机 fx + cx = float(camera_info.K[2]) if len(camera_info.K) >= 9 else 0.0 # 读取相机 cx + angular_z = ( + 0.0 if fx <= 0.0 else -((bbox_center_x - cx) / fx) * _ANGULAR_GAIN + ) # 根据 bbox 横向误差计算转向速度 + angular_z = self._clamp( + angular_z, -self.config.max_angular_speed, self.config.max_angular_speed + ) # 对角速度限幅 + return Twist( # 构造 Twist 命令 + linear=Vector3(linear_x, 0.0, 0.0), # 设置 x 方向线速度 + angular=Vector3(0.0, 0.0, angular_z), # 设置 yaw 角速度 + ) # 结束 Twist 构造 + + def _estimate_3d_distance( # 定义基于 TF + 世界坐标系点云的 3D 距离估计函数 + self, # 传入模块实例 + detection: Any, # 当前选中的 LCM Detection2D + lidar: PointCloud2, # 最新 lidar 点云(Go2 原始点云 frame_id="world") + camera_info: CameraInfo, # 最新相机内参 + ) -> float | None: # 返回机器人与检测目标之间的 2D 欧氏距离(单位米),失败时返回 None + # 1. 构造 Detection2DBBox 供 Detection3DPC.from_2d() 使用 + x1, y1, x2, y2 = BBoxSelectionModule._bbox_corners(detection) # 读取 bbox 的 xyxy 像素范围 + det2d = Detection2DBBox( + bbox=(x1, y1, x2, y2), # 传入像素 bbox + track_id=_safe_track_id(getattr(detection, "id", None)), # 读取 track id,非整型时用 0 + class_id=0, # class_id 在此不重要 + confidence=float(getattr(detection, "score", 1.0) or 1.0), # 读取置信度,缺失时用 1.0 + name=str(getattr(detection, "class_label", "") or ""), # 读取类名,缺失时用空字符串 + ts=float(lidar.ts or 0.0), # 用 lidar 时间戳对齐 TF 查询 + image=None, # from_2d() 不需要 image 字段 + ) + # 2. 从 TF 获取 world→camera_optical 变换 + ts = float(lidar.ts or 0.0) # 用 lidar 时间戳对齐 TF + try: + world_to_optical = self.tf.get( + "camera_optical", lidar.frame_id, ts, time_tolerance=self.config.tf_time_tolerance + ) # 查询 world→camera_optical 变换 + except RuntimeError as exc: + logger.debug(f"BBoxDistanceBehaviorModule: TF unavailable: {exc}") + return None + if world_to_optical is None: # 如果 TF 暂时不可用 + return None # 等待下一帧 + # 3. 包装 LCM CameraInfo + lcm_ci = DimosLcmCameraInfo() # 构造 LCM CameraInfo 对象 + lcm_ci.K = camera_info.K # 复制内参矩阵 + lcm_ci.width = camera_info.width # 复制图像宽度 + lcm_ci.height = camera_info.height # 复制图像高度 + # 4. 投影到 3D(世界坐标系) + detection_3d = Detection3DPC.from_2d( + det=det2d, # 传入 2D bbox + world_pointcloud=lidar, # 传入世界坐标系点云 + camera_info=lcm_ci, # 传入 LCM CameraInfo + world_to_optical_transform=world_to_optical, # 传入外参变换 + filters=[], # 不做额外过滤,保证速度 + ) # 返回带世界坐标系点云的 Detection3DPC,或 None + if detection_3d is None: # 如果 bbox 内没有有效点 + return None # 等待下一帧 + # 5. 获取机器人当前位置(world 坐标系) + try: + robot_tf = self.tf.get( + "world", "base_link", time_tolerance=self.config.tf_time_tolerance + ) # 查询机器人位置 + except RuntimeError as exc: + logger.debug(f"BBoxDistanceBehaviorModule: robot TF unavailable: {exc}") + return None + if robot_tf is None: # 如果机器人 TF 暂时不可用 + return None # 等待下一帧 + # 6. 计算机器人与检测目标中心的 2D 欧氏距离 + center = detection_3d.center # 获取 bbox 点云质心(世界坐标系) + robot_pos = robot_tf.translation # 获取机器人在世界坐标系中的位置 + dx = float(center.x) - float(robot_pos.x) # x 轴方向距离分量 + dy = float(center.y) - float(robot_pos.y) # y 轴方向距离分量 + distance = math.sqrt(dx * dx + dy * dy) # 计算 XY 平面欧氏距离 + return ( + distance if math.isfinite(distance) and distance > 0.0 else None + ) # 只接受有限且正的距离 + + def _set_block_reason(self, reason: str | None) -> None: + if reason == self._last_block_reason: + return + self._last_block_reason = reason + if reason is not None: + logger.info(f"BBoxDistanceBehaviorModule: task blocked reason={reason}") + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _extract_single_detection( + selected_bbox: Detection2DArray | None, + ) -> Any | None: # 从 selected_bbox 中取单个 detection + if ( + selected_bbox is None + or selected_bbox.detections_length == 0 + or not selected_bbox.detections + ): # 如果没有选中 bbox + return None # 返回空 + return selected_bbox.detections[0] # 返回第一个 detection,选择模块保证最多一个 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _detection_id(detection: Any) -> str: # 读取 detection.id,并在缺失时回退为空字符串 + detection_id = getattr(detection, "id", "") # 读取 detection.id,缺失时使用空字符串 + return str(detection_id) if detection_id else "" # 返回真实 id 或空字符串 + + @staticmethod # 声明这是不依赖实例状态的工具函数 + def _clamp(value: float, lower: float, upper: float) -> float: # 把数值限制在上下界之间 + return max(lower, min(upper, value)) # 返回限幅后的数值 + + def _publish_status(self, state: BehaviorState, **fields: float) -> None: # 发布行为状态消息 + payload: dict[str, Any] = {"state": state, **fields} # 构造 JSON 状态载荷 + self.behavior_status.publish(String(json.dumps(payload))) # 发布 JSON 字符串状态 + + +__all__ = [ # 声明这个文件希望对外暴露的名字 + "BBoxDistanceBehaviorConfig", # 暴露行为模块配置 + "BBoxDistanceBehaviorModule", # 暴露行为模块 +] # 结束 __all__ 列表 diff --git a/dimos/robot/custom/tasks/target_standoff_behavior_module.py b/dimos/robot/custom/tasks/target_standoff_behavior_module.py new file mode 100644 index 0000000000..10ebfa2573 --- /dev/null +++ b/dimos/robot/custom/tasks/target_standoff_behavior_module.py @@ -0,0 +1,585 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +import json +import math +import threading +import time +from typing import Any, Literal + +from dimos_lcm.std_msgs import Bool, String # type: ignore[import-untyped] +from reactivex.disposable import Disposable + +from dimos.constants import DEFAULT_THREAD_JOIN_TIMEOUT +from dimos.core.core import rpc +from dimos.core.module import Module, ModuleConfig +from dimos.core.stream import In, Out +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Quaternion import Quaternion +from dimos.msgs.geometry_msgs.Transform import Transform +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.navigation.replanning_a_star.module_spec import ReplanningAStarPlannerSpec +from dimos.utils.logging_config import setup_logger + +logger = setup_logger() + +BehaviorState = Literal[ + "idle", + "navigating_near", + "dwelling_near", + "navigating_far", + "dwelling_far", + "returning_near", + "done", + "failed", +] +GoalName = Literal["near", "far", "return_near"] + +_WORLD_FRAME_ID = "world" + + +class TargetStandoffBehaviorConfig(ModuleConfig): + command_hz: float = 5.0 + near_distance: float = 0.5 + far_distance: float = 3.0 + near_dwell_duration_sec: float = 10.0 + far_dwell_duration_sec: float = 10.0 + countdown_log_interval_sec: float = 2.0 + goal_timeout_sec: float = 45.0 + tf_time_tolerance: float = 0.5 + restart_target_distance_m: float = 0.2 + + +class TargetStandoffBehaviorModule(Module): + """Compute near/far target waypoints and delegate motion to the path planner.""" + + config: TargetStandoffBehaviorConfig + + target_pose: In[PoseStamped] + teleop_active: In[Bool] + + clear_selection_request: Out[Bool] + behavior_status: Out[String] + + _planner: ReplanningAStarPlannerSpec | None + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._lock = threading.RLock() + self._stop_event = threading.Event() + self._thread: threading.Thread | None = None + self._state: BehaviorState = "idle" + self._state_started_at = time.monotonic() + self._target_pose: PoseStamped | None = None + self._near_pose: PoseStamped | None = None + self._far_pose: PoseStamped | None = None + self._current_goal_name: GoalName | None = None + self._current_goal_pose: PoseStamped | None = None + self._goal_started_at: float | None = None + self._last_countdown_log_at = 0.0 + self._last_block_reason: str | None = None + self._last_geometry_block_reason: str | None = None + self._planner = None + + @rpc + def start(self) -> None: + super().start() + logger.info( + "TargetStandoffBehaviorModule: started " + f"command_hz={self.config.command_hz:.1f} " + f"near_distance={self.config.near_distance:.3f} " + f"far_distance={self.config.far_distance:.3f} " + f"near_dwell_duration_sec={self.config.near_dwell_duration_sec:.3f} " + f"far_dwell_duration_sec={self.config.far_dwell_duration_sec:.3f} " + f"countdown_log_interval_sec={self.config.countdown_log_interval_sec:.3f} " + f"goal_timeout_sec={self.config.goal_timeout_sec:.3f} " + f"tf_time_tolerance={self.config.tf_time_tolerance:.3f}s" + ) + self.register_disposable(Disposable(self.target_pose.subscribe(self._on_target_pose))) + self.register_disposable(Disposable(self.teleop_active.subscribe(self._on_teleop_active))) + self._stop_event.clear() + self._thread = threading.Thread( + target=self._behavior_loop, + name="TargetStandoffBehaviorModule", + daemon=True, + ) + self._thread.start() + self._publish_status("idle") + + @rpc + def stop(self) -> None: + logger.info("TargetStandoffBehaviorModule: stopping; cancelling planner goal") + self._stop_event.set() + self._cancel_planner_goal() + if self._thread is not None and self._thread.is_alive(): + self._thread.join(DEFAULT_THREAD_JOIN_TIMEOUT) + super().stop() + + @rpc + def start_behavior(self) -> str: + with self._lock: + target_pose = self._target_pose + if target_pose is None: + return "target standoff behavior has no target pose yet" + if self._start_sequence(target_pose): + return "target standoff behavior started" + return "target standoff behavior could not start" + + @rpc + def stop_behavior(self) -> str: + with self._lock: + previous_state = self._state + self._reset_sequence_locked(state="idle") + self._cancel_planner_goal() + self.clear_selection_request.publish(Bool(data=True)) + logger.info( + "TargetStandoffBehaviorModule: stop_behavior requested " + f"state_before={previous_state}" + ) + self._publish_status("idle") + return "target standoff behavior stopped" + + def _on_target_pose(self, target_pose: PoseStamped) -> None: + with self._lock: + state = self._state + previous_target = self._target_pose + self._target_pose = target_pose + + if state == "idle": + self._start_sequence(target_pose) + return + + if state in ("done", "failed") and self._is_new_target(target_pose, previous_target): + logger.info( + "TargetStandoffBehaviorModule: received new target after terminal state; " + f"restarting pose={self._pose_summary(target_pose)}" + ) + self._start_sequence(target_pose) + return + + logger.debug( + "TargetStandoffBehaviorModule: ignoring target_pose update during active sequence " + f"state={state} pose={self._pose_summary(target_pose)}" + ) + + def _on_teleop_active(self, msg: Any) -> None: + if not bool(getattr(msg, "data", False)): + return + with self._lock: + previous_state = self._state + if previous_state == "idle": + return + self._reset_sequence_locked(state="idle") + self._cancel_planner_goal() + self.clear_selection_request.publish(Bool(data=True)) + logger.info( + "TargetStandoffBehaviorModule: interrupted by teleop " + f"state_before={previous_state}" + ) + self._publish_status("idle") + + def _start_sequence(self, target_pose: PoseStamped) -> bool: + waypoints = self._compute_waypoints(target_pose) + if waypoints is None: + self._publish_status("idle", target_source="blocked") + return False + + target, near, far = waypoints + with self._lock: + self._target_pose = target + self._near_pose = near + self._far_pose = far + self._current_goal_name = None + self._current_goal_pose = None + self._goal_started_at = None + + logger.info( + "TargetStandoffBehaviorModule: waypoints computed " + f"target={self._pose_summary(target)} " + f"near={self._pose_summary(near)} " + f"far={self._pose_summary(far)}" + ) + return self._dispatch_goal("near", near, "navigating_near") + + def _behavior_loop(self) -> None: + period_sec = 1.0 / max(self.config.command_hz, 1.0) + while not self._stop_event.wait(period_sec): + self._tick() + + def _tick(self) -> None: + with self._lock: + state = self._state + state_started_at = self._state_started_at + near_pose = self._near_pose + far_pose = self._far_pose + current_goal_name = self._current_goal_name + goal_started_at = self._goal_started_at + + if state in ("idle", "done", "failed"): + self._set_block_reason(None) + return + + planner = self._planner + if planner is None: + self._fail_sequence("missing_planner_spec") + return + + now = time.monotonic() + if state in ("navigating_near", "navigating_far", "returning_near"): + if self._planner_goal_reached(planner): + if state == "navigating_near": + self._transition("dwelling_near", now, reason="near_goal_reached") + return + if state == "navigating_far": + self._transition("dwelling_far", now, reason="far_goal_reached") + return + self._complete_sequence() + return + if ( + goal_started_at is not None + and now - goal_started_at > self.config.goal_timeout_sec + ): + self._fail_sequence(f"{current_goal_name or 'unknown'}_goal_timeout") + return + self._publish_status(state, elapsed_sec=now - state_started_at) + return + + if state == "dwelling_near": + elapsed = now - state_started_at + if elapsed < self.config.near_dwell_duration_sec: + self._log_countdown( + state, + self.config.near_dwell_duration_sec, + elapsed, + now, + ) + self._publish_status(state, elapsed_sec=elapsed) + return + if far_pose is None: + self._fail_sequence("missing_far_pose") + return + self._dispatch_goal("far", far_pose, "navigating_far") + return + + if state == "dwelling_far": + elapsed = now - state_started_at + if elapsed < self.config.far_dwell_duration_sec: + self._log_countdown( + state, + self.config.far_dwell_duration_sec, + elapsed, + now, + ) + self._publish_status(state, elapsed_sec=elapsed) + return + if near_pose is None: + self._fail_sequence("missing_near_pose") + return + self._dispatch_goal("return_near", near_pose, "returning_near") + + def _dispatch_goal( + self, + goal_name: GoalName, + goal_pose: PoseStamped, + next_state: BehaviorState, + ) -> bool: + planner = self._planner + if planner is None: + self._fail_sequence("missing_planner_spec") + return False + + accepted = planner.set_goal(goal_pose) + if not accepted: + self._fail_sequence(f"{goal_name}_goal_rejected") + return False + + now = time.monotonic() + with self._lock: + old_state = self._state + self._state = next_state + self._state_started_at = now + self._current_goal_name = goal_name + self._current_goal_pose = goal_pose + self._goal_started_at = now + self._last_countdown_log_at = 0.0 + logger.info( + "TargetStandoffBehaviorModule: navigation goal set " + f"name={goal_name} state={old_state}->{next_state} " + f"goal={self._pose_summary(goal_pose)}" + ) + self._publish_status(next_state, goal_name=goal_name) + return True + + def _compute_waypoints( + self, + target_pose: PoseStamped, + ) -> tuple[PoseStamped, PoseStamped, PoseStamped] | None: + if target_pose.frame_id and target_pose.frame_id != _WORLD_FRAME_ID: + self._set_geometry_block_reason(f"target_frame_is_{target_pose.frame_id!r}") + return None + robot_tf = self._get_robot_transform() + if robot_tf is None: + self._set_geometry_block_reason("missing_world_to_base_link_tf") + return None + + target_x = float(target_pose.position.x) + target_y = float(target_pose.position.y) + robot_x = float(robot_tf.translation.x) + robot_y = float(robot_tf.translation.y) + ux, uy = self._unit_from_target_to_robot(target_x, target_y, robot_x, robot_y) + self._set_geometry_block_reason(None) + + target = self._make_pose(target_x, target_y, 0.0, yaw=robot_tf.rotation.to_euler().yaw) + near = self._make_nav_pose_facing_target( + target_x + ux * self.config.near_distance, + target_y + uy * self.config.near_distance, + target_x, + target_y, + ) + far = self._make_nav_pose_facing_target( + target_x + ux * self.config.far_distance, + target_y + uy * self.config.far_distance, + target_x, + target_y, + ) + return target, near, far + + def _make_nav_pose_facing_target( + self, + x: float, + y: float, + target_x: float, + target_y: float, + ) -> PoseStamped: + yaw = math.atan2(target_y - y, target_x - x) + return self._make_pose(x, y, 0.0, yaw=yaw) + + @staticmethod + def _make_pose(x: float, y: float, z: float, *, yaw: float) -> PoseStamped: + return PoseStamped( + ts=time.time(), + frame_id=_WORLD_FRAME_ID, + position=Vector3(x, y, z), + orientation=Quaternion.from_euler(Vector3(0.0, 0.0, yaw)), + ) + + def _get_robot_transform(self) -> Transform | None: + try: + return self.tf.get( + _WORLD_FRAME_ID, + "base_link", + time_tolerance=self.config.tf_time_tolerance, + ) + except RuntimeError as exc: + logger.debug(f"TargetStandoffBehaviorModule: robot TF lookup failed: {exc}") + return None + + def _complete_sequence(self) -> None: + with self._lock: + target_pose = self._target_pose + near_pose = self._near_pose + far_pose = self._far_pose + self._state = "done" + self._state_started_at = time.monotonic() + self._current_goal_name = None + self._current_goal_pose = None + self._goal_started_at = None + self._last_countdown_log_at = 0.0 + logger.info( + "TargetStandoffBehaviorModule: behavior complete " + f"target={self._pose_summary(target_pose)} " + f"near={self._pose_summary(near_pose)} " + f"far={self._pose_summary(far_pose)}" + ) + self.clear_selection_request.publish(Bool(data=True)) + self._publish_status("done") + + def _fail_sequence(self, reason: str) -> None: + with self._lock: + previous_state = self._state + self._state = "failed" + self._state_started_at = time.monotonic() + self._current_goal_name = None + self._current_goal_pose = None + self._goal_started_at = None + self._cancel_planner_goal() + self.clear_selection_request.publish(Bool(data=True)) + self._set_block_reason(reason) + logger.info( + "TargetStandoffBehaviorModule: behavior failed " + f"state_before={previous_state} reason={reason}" + ) + self._publish_status("failed", target_source=reason) + + def _reset_sequence_locked(self, *, state: BehaviorState) -> None: + self._state = state + self._state_started_at = time.monotonic() + self._target_pose = None + self._near_pose = None + self._far_pose = None + self._current_goal_name = None + self._current_goal_pose = None + self._goal_started_at = None + self._last_countdown_log_at = 0.0 + + def _cancel_planner_goal(self) -> None: + if self._planner is not None: + self._planner.cancel_goal() + + def _planner_goal_reached(self, planner: ReplanningAStarPlannerSpec) -> bool: + try: + return planner.is_goal_reached() + except RuntimeError as exc: + self._set_block_reason(f"planner_goal_check_failed:{exc}") + return False + + def _set_block_reason(self, reason: str | None) -> None: + if reason == self._last_block_reason: + return + self._last_block_reason = reason + if reason is not None: + logger.info(f"TargetStandoffBehaviorModule: blocked reason={reason}") + + def _set_geometry_block_reason(self, reason: str | None) -> None: + if reason == self._last_geometry_block_reason: + return + previous = self._last_geometry_block_reason + self._last_geometry_block_reason = reason + if reason is None: + if previous is not None: + logger.info( + "TargetStandoffBehaviorModule: target geometry recovered " + f"previous_reason={previous}" + ) + return + logger.info(f"TargetStandoffBehaviorModule: target geometry blocked reason={reason}") + + def _transition( + self, + state: BehaviorState, + started_at: float, + *, + reason: str, + ) -> None: + with self._lock: + old_state = self._state + self._state = state + self._state_started_at = started_at + self._current_goal_name = None + self._current_goal_pose = None + self._goal_started_at = None + self._last_countdown_log_at = 0.0 + logger.info( + "TargetStandoffBehaviorModule: action transition " + f"completed={old_state} entered={state} reason={reason}" + ) + self._publish_status(state) + + def _log_countdown( + self, + state: BehaviorState, + duration_sec: float, + elapsed_sec: float, + now: float, + ) -> None: + with self._lock: + last_logged_at = self._last_countdown_log_at + if ( + last_logged_at > 0.0 + and now - last_logged_at < self.config.countdown_log_interval_sec + ): + return + self._last_countdown_log_at = now + remaining_sec = max(duration_sec - elapsed_sec, 0.0) + logger.info( + "TargetStandoffBehaviorModule: dwell countdown " + f"state={state} elapsed_sec={elapsed_sec:.1f} remaining_sec={remaining_sec:.1f}" + ) + + def _publish_status( + self, + state: BehaviorState, + *, + elapsed_sec: float | None = None, + goal_name: str | None = None, + target_source: str = "target_pose", + ) -> None: + with self._lock: + payload: dict[str, Any] = { + "state": state, + "target_source": target_source, + "elapsed_sec": 0.0 if elapsed_sec is None else elapsed_sec, + "goal_name": goal_name or self._current_goal_name, + "target_pose": self._pose_to_dict(self._target_pose), + "near_pose": self._pose_to_dict(self._near_pose), + "far_pose": self._pose_to_dict(self._far_pose), + "current_goal_pose": self._pose_to_dict(self._current_goal_pose), + } + self.behavior_status.publish(String(json.dumps(payload, ensure_ascii=True))) + + @staticmethod + def _unit_from_target_to_robot( + target_x: float, + target_y: float, + robot_x: float, + robot_y: float, + ) -> tuple[float, float]: + dx = robot_x - target_x + dy = robot_y - target_y + norm = math.sqrt(dx * dx + dy * dy) + if norm < 1e-6: + return (-1.0, 0.0) + return dx / norm, dy / norm + + def _is_new_target( + self, + target_pose: PoseStamped, + previous_target: PoseStamped | None, + ) -> bool: + if previous_target is None: + return True + dx = float(target_pose.position.x) - float(previous_target.position.x) + dy = float(target_pose.position.y) - float(previous_target.position.y) + return math.sqrt(dx * dx + dy * dy) > self.config.restart_target_distance_m + + @staticmethod + def _pose_to_dict(pose: PoseStamped | None) -> dict[str, Any] | None: + if pose is None: + return None + return { + "frame_id": pose.frame_id, + "x": float(pose.position.x), + "y": float(pose.position.y), + "z": float(pose.position.z), + "ts": float(pose.ts or 0.0), + } + + @staticmethod + def _pose_summary(pose: PoseStamped | None) -> str: + if pose is None: + return "None" + return ( + f"frame={pose.frame_id!r} " + f"x={float(pose.position.x):.3f} " + f"y={float(pose.position.y):.3f} " + f"z={float(pose.position.z):.3f} " + f"ts={float(pose.ts or 0.0):.3f}" + ) + +__all__ = [ + "TargetStandoffBehaviorConfig", + "TargetStandoffBehaviorModule", +] diff --git a/dimos/robot/custom/tests/__init__.py b/dimos/robot/custom/tests/__init__.py new file mode 100644 index 0000000000..bc1a2ce5cc --- /dev/null +++ b/dimos/robot/custom/tests/__init__.py @@ -0,0 +1,13 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/dimos/robot/custom/tests/test_bbox_distance_behavior_module.py b/dimos/robot/custom/tests/test_bbox_distance_behavior_module.py new file mode 100644 index 0000000000..1ffb1e8463 --- /dev/null +++ b/dimos/robot/custom/tests/test_bbox_distance_behavior_module.py @@ -0,0 +1,225 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from collections.abc import Callable +from typing import Any + +from dimos_lcm.std_msgs import Bool # type: ignore[import-untyped] +from dimos_lcm.vision_msgs import ( + BoundingBox2D, + Detection2D, + ObjectHypothesis, + ObjectHypothesisWithPose, + Point2D, + Pose2D, +) +import numpy as np +import pytest + +from dimos.msgs.sensor_msgs.CameraInfo import CameraInfo +from dimos.msgs.sensor_msgs.PointCloud2 import PointCloud2 +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.protocol.rpc.spec import RPCSpec +from dimos.robot.custom.tasks.bbox_distance_behavior_module import ( + BBoxDistanceBehaviorModule, +) + + +class _NoopRPC(RPCSpec): + def __init__( + self, + *, + rpc_timeouts: dict[str, float] | None = None, + default_rpc_timeout: float = 120.0, + ) -> None: + self.rpc_timeouts = {} if rpc_timeouts is None else dict(rpc_timeouts) + self.default_rpc_timeout = default_rpc_timeout + + def serve_module_rpc(self, module: Any, name: str | None = None) -> None: + pass + + def serve_rpc(self, f: Callable[..., Any], name: str) -> Callable[[], None]: + return lambda: None + + def call( + self, + name: str, + arguments: tuple[list[Any], dict[str, Any]], + cb: Callable[[Any], None] | None, + ) -> Callable[[], None] | None: + return (lambda: None) if cb is not None else None + + def call_nowait(self, name: str, arguments: tuple[list[Any], dict[str, Any]]) -> None: + pass + + def start(self) -> None: + pass + + def stop(self) -> None: + pass + + +@pytest.fixture() +def module() -> BBoxDistanceBehaviorModule: + instance = BBoxDistanceBehaviorModule(rpc_transport=_NoopRPC) + try: + yield instance + finally: + instance.stop() + + +def _make_detection(detection_id: str, x1: float, y1: float, x2: float, y2: float) -> Detection2D: + center_x = (x1 + x2) / 2.0 + center_y = (y1 + y2) / 2.0 + return Detection2D( + id=detection_id, + results_length=1, + header=Header(123.0, "camera_optical"), + bbox=BoundingBox2D( + center=Pose2D(position=Point2D(x=center_x, y=center_y), theta=0.0), + size_x=x2 - x1, + size_y=y2 - y1, + ), + results=[ + ObjectHypothesisWithPose(hypothesis=ObjectHypothesis(class_id="person", score=0.9)) + ], + ) + + +def _make_array(*detections: Detection2D) -> Detection2DArray: + return Detection2DArray( + detections_length=len(detections), + header=Header(123.0, "camera_optical"), + detections=list(detections), + ) + + +def _subscribe_status(module: BBoxDistanceBehaviorModule) -> list[Any]: + received: list[Any] = [] + module.behavior_status.subscribe(received.append) + return received + + +def _subscribe_clear_requests(module: BBoxDistanceBehaviorModule) -> list[Any]: + received: list[Any] = [] + module.clear_selection_request.subscribe(received.append) + return received + + +def test_selected_bbox_auto_starts_approach(module: BBoxDistanceBehaviorModule) -> None: + status = _subscribe_status(module) + module._on_lidar(PointCloud2()) + module._on_camera_info(CameraInfo.from_intrinsics(100.0, 100.0, 50.0, 50.0, 100, 100)) + + module._on_selected_bbox(_make_array(_make_detection("target", 40.0, 40.0, 60.0, 60.0))) + + assert module._state == "approaching" + assert status + + +def test_selected_bbox_reaches_point_two_and_finishes( + module: BBoxDistanceBehaviorModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + clear_requests = _subscribe_clear_requests(module) + module._on_lidar(PointCloud2()) + module._on_camera_info(CameraInfo.from_intrinsics(100.0, 100.0, 50.0, 50.0, 100, 100)) + module._on_selected_bbox(_make_array(_make_detection("target", 40.0, 40.0, 60.0, 60.0))) + + distances = [0.30, 0.24] + + monkeypatch.setattr( + module, + "_estimate_3d_distance", + lambda *args, **kwargs: distances.pop(0), + ) + + twist = module._compute_next_twist() + + assert twist is not None + assert twist.linear.x > 0.0 + done_twist = module._compute_next_twist() + assert done_twist is not None + assert done_twist.linear.x == 0.0 + assert done_twist.angular.z == 0.0 + assert module._state == "done" + assert clear_requests and clear_requests[-1].data + + +def test_empty_selected_bbox_resets_to_idle(module: BBoxDistanceBehaviorModule) -> None: + cmd_published: list = [] + module.cmd_vel.subscribe(cmd_published.append) + + module._on_selected_bbox(_make_array(_make_detection("target", 40.0, 40.0, 60.0, 60.0))) + module._on_selected_bbox(_make_array()) + + assert module._state == "idle" + assert cmd_published and cmd_published[-1].linear.x == 0.0 + + +def test_empty_selected_bbox_while_idle_stays_silent( + module: BBoxDistanceBehaviorModule, +) -> None: + cmd_published: list = [] + module.cmd_vel.subscribe(cmd_published.append) + + module._on_selected_bbox(_make_array()) + + assert module._state == "idle" + assert cmd_published == [] + + +def test_estimate_3d_distance_returns_none_without_tf(module: BBoxDistanceBehaviorModule) -> None: + """_estimate_3d_distance returns None when TF transform is not available.""" + detection = _make_detection("target", 100.0, 100.0, 150.0, 150.0) + lidar = PointCloud2.from_numpy(np.array([[1.0, 2.0, 0.5]], dtype=np.float32), frame_id="world") + camera_info = CameraInfo.from_intrinsics(100.0, 100.0, 50.0, 50.0, 640, 480) + + # tf.get returns None by default since no TF data is published in tests + distance = module._estimate_3d_distance(detection, lidar, camera_info) + + assert distance is None + + +def test_teleop_active_interrupts_approaching_task(module: BBoxDistanceBehaviorModule) -> None: + """_on_teleop_active resets an approaching task to idle.""" + module._on_selected_bbox(_make_array(_make_detection("target", 40.0, 40.0, 60.0, 60.0))) + assert module._state == "approaching" + + cmd_published: list = [] + clear_requests = _subscribe_clear_requests(module) + module.cmd_vel.subscribe(cmd_published.append) + + module._on_teleop_active(Bool(data=True)) + + assert module._state == "idle" + assert module._active_target_id is None + # A zero Twist must be published to stop the robot + assert cmd_published and cmd_published[-1].linear.x == 0.0 + assert clear_requests and clear_requests[-1].data + + +def test_teleop_active_noop_when_already_idle(module: BBoxDistanceBehaviorModule) -> None: + """_on_teleop_active does nothing when task is already idle.""" + assert module._state == "idle" + cmd_published: list = [] + module.cmd_vel.subscribe(cmd_published.append) + + module._on_teleop_active(Bool(data=True)) + + assert module._state == "idle" + assert not cmd_published # no spurious zero published diff --git a/dimos/robot/custom/tests/test_bbox_selection_module.py b/dimos/robot/custom/tests/test_bbox_selection_module.py new file mode 100644 index 0000000000..cebeaafe5f --- /dev/null +++ b/dimos/robot/custom/tests/test_bbox_selection_module.py @@ -0,0 +1,256 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from collections.abc import Callable +from typing import Any + +from dimos_lcm.std_msgs import Bool +from dimos_lcm.vision_msgs import ( + BoundingBox2D, + Detection2D, + ObjectHypothesis, + ObjectHypothesisWithPose, + Point2D, + Pose2D, +) +import pytest + +from dimos.msgs.geometry_msgs.PointStamped import PointStamped +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.protocol.rpc.spec import RPCSpec +from dimos.robot.custom.modules.bbox_selection_module import BBoxSelectionModule + + +class _NoopRPC(RPCSpec): + def __init__( + self, + *, + rpc_timeouts: dict[str, float] | None = None, + default_rpc_timeout: float = 120.0, + ) -> None: + self.rpc_timeouts = {} if rpc_timeouts is None else dict(rpc_timeouts) + self.default_rpc_timeout = default_rpc_timeout + + def serve_module_rpc(self, module: Any, name: str | None = None) -> None: + pass + + def serve_rpc(self, f: Callable[..., Any], name: str) -> Callable[[], None]: + return lambda: None + + def call( + self, + name: str, + arguments: tuple[list[Any], dict[str, Any]], + cb: Callable[[Any], None] | None, + ) -> Callable[[], None] | None: + return (lambda: None) if cb is not None else None + + def call_nowait(self, name: str, arguments: tuple[list[Any], dict[str, Any]]) -> None: + pass + + def start(self) -> None: + pass + + def stop(self) -> None: + pass + + +@pytest.fixture() +def module() -> BBoxSelectionModule: + instance = BBoxSelectionModule(rpc_transport=_NoopRPC) + try: + yield instance + finally: + instance.stop() + + +def _make_detection( + detection_id: str, + x1: float, + y1: float, + x2: float, + y2: float, +) -> Detection2D: + center_x = (x1 + x2) / 2.0 + center_y = (y1 + y2) / 2.0 + return Detection2D( + id=detection_id, + results_length=1, + header=Header(123.0, "camera_optical"), + bbox=BoundingBox2D( + center=Pose2D(position=Point2D(x=center_x, y=center_y), theta=0.0), + size_x=x2 - x1, + size_y=y2 - y1, + ), + results=[ + ObjectHypothesisWithPose(hypothesis=ObjectHypothesis(class_id="person", score=0.9)) + ], + ) + + +def _make_array(*detections: Detection2D) -> Detection2DArray: + return Detection2DArray( + detections_length=len(detections), + header=Header(123.0, "camera_optical"), + detections=list(detections), + ) + + +def _subscribe_selected(module: BBoxSelectionModule) -> list[Any]: + received: list[Any] = [] + module.selected_bbox.subscribe(received.append) + return received + + +def test_camera_click_selects_matching_bbox(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array( + _make_detection("left", 0.0, 0.0, 100.0, 100.0), + _make_detection("target", 200.0, 100.0, 260.0, 180.0), + ) + + module._on_detections(detections) + received.clear() + module._on_clicked_point( + PointStamped(x=220.0, y=120.0, z=0.0, frame_id="/world/color_image/detections") + ) + + assert len(received) == 1 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "target" + + +def test_camera_click_miss_clears_selection(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array(_make_detection("target", 20.0, 20.0, 120.0, 120.0)) + + module._on_detections(detections) + module.select_bbox(index=0) + received.clear() + module._on_clicked_point(PointStamped(x=300.0, y=300.0, z=0.0, frame_id="/world/color_image")) + + assert len(received) == 1 + assert received[0].detections_length == 0 + assert received[0].detections == [] + + +def test_non_camera_click_does_not_change_selection(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array(_make_detection("target", 20.0, 20.0, 120.0, 120.0)) + + module._on_detections(detections) + module.select_bbox(index=0) + received.clear() + module._on_clicked_point(PointStamped(x=300.0, y=300.0, z=0.0, frame_id="/world")) + module._on_detections(detections) + + assert len(received) == 1 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "target" + + +def test_stop_movement_clears_selection(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array(_make_detection("target", 20.0, 20.0, 120.0, 120.0)) + + module._on_detections(detections) + module.select_bbox(index=0) + received.clear() + + module._on_stop_movement(Bool(data=True)) + module._on_detections(detections) + + assert len(received) == 2 + assert all(msg.detections_length == 0 for msg in received) + + +def test_overlapping_camera_click_prefers_smaller_bbox(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array( + _make_detection("large", 0.0, 0.0, 200.0, 200.0), + _make_detection("small", 50.0, 50.0, 80.0, 80.0), + ) + + module._on_detections(detections) + received.clear() + module._on_clicked_point(PointStamped(x=60.0, y=60.0, z=0.0, frame_id="world/color_image")) + + assert len(received) == 1 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "small" + + +def test_camera_click_tracks_by_detection_id_across_reordered_frames( + module: BBoxSelectionModule, +) -> None: + received = _subscribe_selected(module) + first_frame = _make_array( + _make_detection("left", 0.0, 0.0, 100.0, 100.0), + _make_detection("target", 200.0, 100.0, 260.0, 180.0), + _make_detection("right", 320.0, 120.0, 380.0, 210.0), + ) + second_frame = _make_array( + _make_detection("right", 325.0, 125.0, 385.0, 215.0), + _make_detection("left", 5.0, 5.0, 105.0, 105.0), + _make_detection("target", 205.0, 105.0, 265.0, 185.0), + ) + + module._on_detections(first_frame) + received.clear() + module._on_clicked_point( + PointStamped(x=220.0, y=120.0, z=0.0, frame_id="/world/color_image/yoloe_detections") + ) + module._on_detections(second_frame) + + assert len(received) == 2 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "target" + assert received[1].detections_length == 1 + assert received[1].detections[0].id == "target" + + +def test_camera_click_near_bbox_edge_uses_padding(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array(_make_detection("target", 100.0, 100.0, 160.0, 180.0)) + + module._on_detections(detections) + received.clear() + module._on_clicked_point( + PointStamped(x=95.0, y=130.0, z=0.0, frame_id="/world/color_image/yoloe_detections") + ) + + assert len(received) == 1 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "target" + + +def test_camera_click_near_bbox_snaps_to_nearest(module: BBoxSelectionModule) -> None: + received = _subscribe_selected(module) + detections = _make_array( + _make_detection("left", 20.0, 20.0, 70.0, 70.0), + _make_detection("target", 200.0, 100.0, 260.0, 180.0), + ) + + module._on_detections(detections) + received.clear() + module._on_clicked_point( + PointStamped(x=266.0, y=140.0, z=0.0, frame_id="/world/color_image/yoloe_detections") + ) + + assert len(received) == 1 + assert received[0].detections_length == 1 + assert received[0].detections[0].id == "target" diff --git a/dimos/robot/custom/tests/test_spatial_target_lock_module.py b/dimos/robot/custom/tests/test_spatial_target_lock_module.py new file mode 100644 index 0000000000..233850468c --- /dev/null +++ b/dimos/robot/custom/tests/test_spatial_target_lock_module.py @@ -0,0 +1,239 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from collections.abc import Callable +from typing import Any + +from dimos_lcm.std_msgs import Bool +from dimos_lcm.vision_msgs import ( + BoundingBox2D, + Detection2D, + ObjectHypothesis, + ObjectHypothesisWithPose, + Point2D, + Pose2D, +) +import pytest + +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.protocol.rpc.spec import RPCSpec +from dimos.robot.custom.modules.spatial_target_lock_module import SpatialTargetLockModule + + +class _NoopRPC(RPCSpec): + def __init__( + self, + *, + rpc_timeouts: dict[str, float] | None = None, + default_rpc_timeout: float = 120.0, + ) -> None: + self.rpc_timeouts = {} if rpc_timeouts is None else dict(rpc_timeouts) + self.default_rpc_timeout = default_rpc_timeout + + def serve_module_rpc(self, module: Any, name: str | None = None) -> None: + pass + + def serve_rpc(self, f: Callable[..., Any], name: str) -> Callable[[], None]: + return lambda: None + + def call( + self, + name: str, + arguments: tuple[list[Any], dict[str, Any]], + cb: Callable[[Any], None] | None, + ) -> Callable[[], None] | None: + return (lambda: None) if cb is not None else None + + def call_nowait(self, name: str, arguments: tuple[list[Any], dict[str, Any]]) -> None: + pass + + def start(self) -> None: + pass + + def stop(self) -> None: + pass + + +@pytest.fixture() +def module() -> SpatialTargetLockModule: + instance = SpatialTargetLockModule(rpc_transport=_NoopRPC, reacquire_distance_m=0.5) + try: + yield instance + finally: + instance.stop() + + +def _make_detection( + detection_id: str, + class_id: str, + x1: float, + y1: float, + x2: float, + y2: float, +) -> Detection2D: + center_x = (x1 + x2) / 2.0 + center_y = (y1 + y2) / 2.0 + return Detection2D( + id=detection_id, + results_length=1, + header=Header(123.0, "camera_optical"), + bbox=BoundingBox2D( + center=Pose2D(position=Point2D(x=center_x, y=center_y), theta=0.0), + size_x=x2 - x1, + size_y=y2 - y1, + ), + results=[ + ObjectHypothesisWithPose(hypothesis=ObjectHypothesis(class_id=class_id, score=0.9)) + ], + ) + + +def _make_array(*detections: Detection2D) -> Detection2DArray: + return Detection2DArray( + detections_length=len(detections), + header=Header(123.0, "camera_optical"), + detections=list(detections), + ) + + +def _pose(x: float, y: float = 0.0, z: float = 0.0) -> PoseStamped: + return PoseStamped( + ts=123.0, + frame_id="world", + position=Vector3(x, y, z), + orientation=(0.0, 0.0, 0.0, 1.0), + ) + + +def test_selected_bbox_projects_to_locked_target_pose( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + target_messages: list[PoseStamped] = [] + locked_messages: list[Any] = [] + module.target_pose.subscribe(target_messages.append) + module.locked_bbox.subscribe(locked_messages.append) + monkeypatch.setattr(module, "_project_detection_pose", lambda _detection: _pose(1.0)) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + + assert module.get_lock_state()["state"] == "locked" + assert target_messages[-1].position.x == pytest.approx(1.0) + assert locked_messages[-1].detections_length == 1 + + +def test_empty_selected_bbox_after_lock_keeps_spatial_memory( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setattr(module, "_project_detection_pose", lambda _detection: _pose(1.0)) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + module._on_selected_bbox(_make_array()) + + state = module.get_lock_state() + assert state["state"] == "locked" + assert state["last_pose"]["x"] == pytest.approx(1.0) + + +def test_empty_detections_publish_memory_pose( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + target_messages: list[PoseStamped] = [] + locked_messages: list[Any] = [] + module.target_pose.subscribe(target_messages.append) + module.locked_bbox.subscribe(locked_messages.append) + monkeypatch.setattr(module, "_project_detection_pose", lambda _detection: _pose(1.0)) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + module._on_detections(_make_array()) + + assert module.get_lock_state()["state"] == "using_memory" + assert target_messages[-1].position.x == pytest.approx(1.0) + assert locked_messages[-1].detections_length == 0 + + +def test_detections_after_lock_keep_initial_memory_pose( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + poses = { + "target": _pose(1.0), + "new-id": _pose(1.2), + "far": _pose(4.0), + } + + def _project(detection: Detection2D) -> PoseStamped: + return poses[detection.id] + + monkeypatch.setattr(module, "_project_detection_pose", _project) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + module._on_detections( + _make_array( + _make_detection("far", "person", 100, 100, 130, 130), + _make_detection("new-id", "person", 12, 12, 42, 42), + ) + ) + + state = module.get_lock_state() + assert state["state"] == "using_memory" + assert state["target_id"] == "target" + assert state["last_pose"]["x"] == pytest.approx(1.0) + + +def test_spatial_reacquire_rejects_candidate_with_large_z_jump( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + poses = { + "target": _pose(1.0, z=0.6), + "ground": _pose(1.1, z=0.05), + } + + def _project(detection: Detection2D) -> PoseStamped: + return poses[detection.id] + + monkeypatch.setattr(module, "_project_detection_pose", _project) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + module._on_detections(_make_array(_make_detection("ground", "person", 12, 12, 42, 42))) + + state = module.get_lock_state() + assert state["state"] == "using_memory" + assert state["target_id"] == "target" + assert state["last_pose"]["x"] == pytest.approx(1.0) + assert state["last_pose"]["z"] == pytest.approx(0.6) + + +def test_clear_request_clears_lock( + module: SpatialTargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + locked_messages: list[Any] = [] + module.locked_bbox.subscribe(locked_messages.append) + monkeypatch.setattr(module, "_project_detection_pose", lambda _detection: _pose(1.0)) + + module._on_selected_bbox(_make_array(_make_detection("target", "person", 10, 10, 40, 40))) + module._on_clear_selection_request(Bool(data=True)) + + assert module.get_lock_state()["state"] == "unselected" + assert module.get_lock_state()["last_pose"] is None + assert locked_messages[-1].detections_length == 0 diff --git a/dimos/robot/custom/tests/test_target_lock_module.py b/dimos/robot/custom/tests/test_target_lock_module.py new file mode 100644 index 0000000000..e12df3fcfc --- /dev/null +++ b/dimos/robot/custom/tests/test_target_lock_module.py @@ -0,0 +1,180 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from collections.abc import Callable +from typing import Any + +from dimos_lcm.std_msgs import Bool +from dimos_lcm.vision_msgs import ( + BoundingBox2D, + Detection2D, + ObjectHypothesis, + ObjectHypothesisWithPose, + Point2D, + Pose2D, +) +import pytest + +from dimos.msgs.std_msgs.Header import Header +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.protocol.rpc.spec import RPCSpec +from dimos.robot.custom.modules.target_lock_module import TargetLockModule + + +class _NoopRPC(RPCSpec): + def __init__( + self, + *, + rpc_timeouts: dict[str, float] | None = None, + default_rpc_timeout: float = 120.0, + ) -> None: + self.rpc_timeouts = {} if rpc_timeouts is None else dict(rpc_timeouts) + self.default_rpc_timeout = default_rpc_timeout + + def serve_module_rpc(self, module: Any, name: str | None = None) -> None: + pass + + def serve_rpc(self, f: Callable[..., Any], name: str) -> Callable[[], None]: + return lambda: None + + def call( + self, + name: str, + arguments: tuple[list[Any], dict[str, Any]], + cb: Callable[[Any], None] | None, + ) -> Callable[[], None] | None: + return (lambda: None) if cb is not None else None + + def call_nowait(self, name: str, arguments: tuple[list[Any], dict[str, Any]]) -> None: + pass + + def start(self) -> None: + pass + + def stop(self) -> None: + pass + + +@pytest.fixture() +def module() -> TargetLockModule: + instance = TargetLockModule(rpc_transport=_NoopRPC, search_timeout_sec=0.5) + try: + yield instance + finally: + instance.stop() + + +def _make_detection( + detection_id: str, + class_id: str, + x1: float, + y1: float, + x2: float, + y2: float, +) -> Detection2D: + center_x = (x1 + x2) / 2.0 + center_y = (y1 + y2) / 2.0 + return Detection2D( + id=detection_id, + results_length=1, + header=Header(123.0, "camera_optical"), + bbox=BoundingBox2D( + center=Pose2D(position=Point2D(x=center_x, y=center_y), theta=0.0), + size_x=x2 - x1, + size_y=y2 - y1, + ), + results=[ + ObjectHypothesisWithPose(hypothesis=ObjectHypothesis(class_id=class_id, score=0.9)) + ], + ) + + +def _make_array(*detections: Detection2D) -> Detection2DArray: + return Detection2DArray( + detections_length=len(detections), + header=Header(123.0, "camera_optical"), + detections=list(detections), + ) + + +def test_transition_unselected_to_locked(module: TargetLockModule) -> None: + selected = _make_array(_make_detection("target", "person", 10.0, 10.0, 40.0, 40.0)) + + module._on_selected_bbox(selected) + + assert module.get_lock_state()["state"] == "locked" + + +def test_stop_movement_clears_locked_bbox(module: TargetLockModule) -> None: + selected = _make_array(_make_detection("target", "person", 10.0, 10.0, 40.0, 40.0)) + locked_messages: list[Any] = [] + module.locked_bbox.subscribe(locked_messages.append) + + module._on_selected_bbox(selected) + module._on_stop_movement(Bool(data=True)) + + assert module.get_lock_state()["state"] == "unselected" + assert locked_messages[-1].detections_length == 0 + + +def test_transition_locked_to_searching(module: TargetLockModule) -> None: + selected = _make_array(_make_detection("target", "person", 10.0, 10.0, 40.0, 40.0)) + module._on_selected_bbox(selected) + + # No candidates and still inside timeout => searching + module._on_detections(_make_array()) + + assert module.get_lock_state()["state"] == "searching" + + +def test_transition_searching_to_locked_by_reacquire(module: TargetLockModule) -> None: + selected = _make_array(_make_detection("target", "person", 10.0, 10.0, 40.0, 40.0)) + module._on_selected_bbox(selected) + module._on_detections(_make_array()) + assert module.get_lock_state()["state"] == "searching" + + # Different id but same class and near last center => reacquire and lock + module._on_detections(_make_array(_make_detection("new-id", "person", 12.0, 12.0, 42.0, 42.0))) + + state = module.get_lock_state() + assert state["state"] == "locked" + assert state["target_id"] == "new-id" + + +def test_transition_searching_to_lost_after_timeout( + module: TargetLockModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + # Control monotonic time deterministically. + now = {"value": 100.0} + + def _monotonic() -> float: + return now["value"] + + monkeypatch.setattr("dimos.robot.custom.modules.target_lock_module.time.monotonic", _monotonic) + + selected = _make_array(_make_detection("target", "person", 10.0, 10.0, 40.0, 40.0)) + module._on_selected_bbox(selected) + + # No match and still inside timeout => searching + now["value"] = 100.1 + module._on_detections(_make_array()) + assert module.get_lock_state()["state"] == "searching" + + # Beyond timeout => lost + now["value"] = 101.0 + module._on_detections(_make_array()) + assert module.get_lock_state()["state"] == "lost" diff --git a/dimos/robot/custom/tests/test_target_standoff_behavior_module.py b/dimos/robot/custom/tests/test_target_standoff_behavior_module.py new file mode 100644 index 0000000000..48ef00fe3f --- /dev/null +++ b/dimos/robot/custom/tests/test_target_standoff_behavior_module.py @@ -0,0 +1,278 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from collections.abc import Callable +from typing import Any + +from dimos_lcm.std_msgs import Bool +import pytest + +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Transform import Transform +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.navigation.base import NavigationState +from dimos.protocol.rpc.spec import RPCSpec +from dimos.robot.custom.tasks import target_standoff_behavior_module as behavior_module +from dimos.robot.custom.tasks.target_standoff_behavior_module import ( + TargetStandoffBehaviorModule, +) + + +class _NoopRPC(RPCSpec): + def __init__( + self, + *, + rpc_timeouts: dict[str, float] | None = None, + default_rpc_timeout: float = 120.0, + ) -> None: + self.rpc_timeouts = {} if rpc_timeouts is None else dict(rpc_timeouts) + self.default_rpc_timeout = default_rpc_timeout + + def serve_module_rpc(self, module: Any, name: str | None = None) -> None: + pass + + def serve_rpc(self, f: Callable[..., Any], name: str) -> Callable[[], None]: + return lambda: None + + def call( + self, + name: str, + arguments: tuple[list[Any], dict[str, Any]], + cb: Callable[[Any], None] | None, + ) -> Callable[[], None] | None: + return (lambda: None) if cb is not None else None + + def call_nowait(self, name: str, arguments: tuple[list[Any], dict[str, Any]]) -> None: + pass + + def start(self) -> None: + pass + + def stop(self) -> None: + pass + + +class _FakePlanner: + def __init__(self) -> None: + self.goals: list[PoseStamped] = [] + self.cancel_count = 0 + self.goal_reached = False + self.state = NavigationState.IDLE + + def set_goal(self, goal: PoseStamped) -> bool: + self.goals.append(goal) + self.goal_reached = False + self.state = NavigationState.FOLLOWING_PATH + return True + + def get_state(self) -> NavigationState: + return self.state + + def is_goal_reached(self) -> bool: + return self.goal_reached + + def cancel_goal(self) -> bool: + self.cancel_count += 1 + self.state = NavigationState.IDLE + return True + + def set_replanning_enabled(self, enabled: bool) -> None: + pass + + def set_safe_goal_clearance(self, clearance: float) -> None: + pass + + def reset_safe_goal_clearance(self) -> None: + pass + + +@pytest.fixture() +def module(monkeypatch: pytest.MonkeyPatch) -> TargetStandoffBehaviorModule: + instance = TargetStandoffBehaviorModule(rpc_transport=_NoopRPC) + instance._planner = _FakePlanner() + monkeypatch.setattr( + instance, + "_get_robot_transform", + lambda: Transform( + translation=Vector3(0.0, 0.0, 0.0), + frame_id="world", + child_frame_id="base_link", + ), + ) + try: + yield instance + finally: + instance.stop() + + +def _pose(x: float, y: float = 0.0, z: float = 0.2) -> PoseStamped: + return PoseStamped( + ts=123.0, + frame_id="world", + position=Vector3(x, y, z), + orientation=(0.0, 0.0, 0.0, 1.0), + ) + + +def _planner(module: TargetStandoffBehaviorModule) -> _FakePlanner: + assert isinstance(module._planner, _FakePlanner) + return module._planner + + +def test_selected_target_computes_target_near_far_and_dispatches_near_goal( + module: TargetStandoffBehaviorModule, +) -> None: + module._on_target_pose(_pose(2.0)) + + planner = _planner(module) + assert module._state == "navigating_near" + assert len(planner.goals) == 1 + assert module._target_pose is not None + assert module._near_pose is not None + assert module._far_pose is not None + assert module._target_pose.position.x == pytest.approx(2.0) + assert module._near_pose.position.x == pytest.approx(1.5) + assert module._far_pose.position.x == pytest.approx(0.5) + assert planner.goals[0].position.x == pytest.approx(1.5) + + +def test_full_waypoint_sequence_uses_planner_arrival_and_dwell_timers( + module: TargetStandoffBehaviorModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + now = {"value": 100.0} + monkeypatch.setattr(behavior_module.time, "monotonic", lambda: now["value"]) + clear_requests: list[Any] = [] + module.clear_selection_request.subscribe(clear_requests.append) + + module._on_target_pose(_pose(2.0)) + planner = _planner(module) + assert module._state == "navigating_near" + + planner.goal_reached = True + module._tick() + assert module._state == "dwelling_near" + assert len(planner.goals) == 1 + + now["value"] = 109.0 + module._tick() + assert module._state == "dwelling_near" + assert len(planner.goals) == 1 + + now["value"] = 110.1 + module._tick() + assert module._state == "navigating_far" + assert len(planner.goals) == 2 + assert planner.goals[1].position.x == pytest.approx(0.5) + + planner.goal_reached = True + module._tick() + assert module._state == "dwelling_far" + + now["value"] = 120.2 + module._tick() + assert module._state == "returning_near" + assert len(planner.goals) == 3 + assert planner.goals[2].position.x == pytest.approx(1.5) + + planner.goal_reached = True + module._tick() + assert module._state == "done" + assert len(clear_requests) == 1 + + +def test_dwell_countdown_logs_when_started_and_every_two_seconds( + module: TargetStandoffBehaviorModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + now = {"value": 100.0} + log_messages: list[str] = [] + monkeypatch.setattr(behavior_module.time, "monotonic", lambda: now["value"]) + monkeypatch.setattr(behavior_module.logger, "info", lambda message: log_messages.append(message)) + + module._on_target_pose(_pose(2.0)) + planner = _planner(module) + planner.goal_reached = True + module._tick() + assert module._state == "dwelling_near" + + module._tick() + assert sum("dwell countdown" in message for message in log_messages) == 1 + + now["value"] = 101.0 + module._tick() + assert sum("dwell countdown" in message for message in log_messages) == 1 + + now["value"] = 102.1 + module._tick() + assert sum("dwell countdown" in message for message in log_messages) == 2 + + +def test_missing_robot_tf_blocks_sequence( + module: TargetStandoffBehaviorModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + monkeypatch.setattr(module, "_get_robot_transform", lambda: None) + + module._on_target_pose(_pose(2.0)) + + assert module._state == "idle" + assert _planner(module).goals == [] + + +def test_teleop_interrupts_sequence_and_clears_selection( + module: TargetStandoffBehaviorModule, +) -> None: + clear_requests: list[Any] = [] + module.clear_selection_request.subscribe(clear_requests.append) + module._on_target_pose(_pose(2.0)) + + module._on_teleop_active(Bool(data=True)) + + assert module._state == "idle" + assert _planner(module).cancel_count == 1 + assert len(clear_requests) == 1 + + +def test_stop_behavior_cancels_planner_and_clears_selection( + module: TargetStandoffBehaviorModule, +) -> None: + clear_requests: list[Any] = [] + module.clear_selection_request.subscribe(clear_requests.append) + module._on_target_pose(_pose(2.0)) + + result = module.stop_behavior() + + assert result == "target standoff behavior stopped" + assert module._state == "idle" + assert _planner(module).cancel_count == 1 + assert len(clear_requests) == 1 + + +def test_goal_timeout_fails_sequence( + module: TargetStandoffBehaviorModule, + monkeypatch: pytest.MonkeyPatch, +) -> None: + now = {"value": 100.0} + module.config.goal_timeout_sec = 2.0 + monkeypatch.setattr(behavior_module.time, "monotonic", lambda: now["value"]) + module._on_target_pose(_pose(2.0)) + + now["value"] = 103.0 + module._tick() + + assert module._state == "failed" + assert _planner(module).cancel_count == 1 diff --git a/dimos/robot/custom/visualization/__init__.py b/dimos/robot/custom/visualization/__init__.py new file mode 100644 index 0000000000..bc1a2ce5cc --- /dev/null +++ b/dimos/robot/custom/visualization/__init__.py @@ -0,0 +1,13 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. diff --git a/dimos/robot/custom/visualization/detection2d_overlay.py b/dimos/robot/custom/visualization/detection2d_overlay.py new file mode 100644 index 0000000000..89e79e828b --- /dev/null +++ b/dimos/robot/custom/visualization/detection2d_overlay.py @@ -0,0 +1,81 @@ +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +from __future__ import annotations + +from typing import Any + +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray + + +def detection_array_to_rerun( + detections: Detection2DArray, + color: tuple[int, int, int, int], + draw_order: float, +) -> Any: + """Convert a Detection2DArray to a Rerun Boxes2D 2D overlay. + + Args: + detections: The Detection2DArray message to visualize. + color: RGBA color for all boxes. + draw_order: Rerun draw order; higher values render on top. + """ + import rerun as rr # 延迟导入 Rerun,避免非 viewer 路径承担额外导入成本 + + boxes: list[list[float]] = [] + labels: list[str] = [] + colors: list[list[int]] = [] + + for index, detection in enumerate(detections.detections): + bbox = detection.bbox + center = bbox.center.position + x = float(center.x - bbox.size_x / 2.0) + y = float(center.y - bbox.size_y / 2.0) + width = float(bbox.size_x) + height = float(bbox.size_y) + boxes.append([x, y, width, height]) + labels.append(str(detection.id or index)) # 优先显示 detection.id,没有 id 时显示当前序号 + colors.append(list(color)) + + return rr.Boxes2D( # 空 boxes 会清除 viewer 里的旧框 + array=boxes, + array_format=rr.Box2DFormat.XYWH, + colors=colors, + labels=labels, + show_labels=True, + draw_order=draw_order, + ) + + +def detections_overlay(detections: Detection2DArray) -> Any: + """Yellow semi-transparent overlay for YOLO candidate bboxes (draw_order=90).""" + return detection_array_to_rerun(detections, (255, 190, 0, 180), 90.0) + + +def selected_bbox_overlay(detections: Detection2DArray) -> Any: + """Solid green overlay for the RPC-selected bbox (draw_order=100).""" + return detection_array_to_rerun(detections, (0, 255, 0, 255), 100.0) + + +def yoloe_overlay(detections: Detection2DArray) -> Any: + """Cyan overlay for YOLOE tracking detections (draw_order=95).""" + return detection_array_to_rerun(detections, (0, 220, 255, 220), 95.0) + + +__all__ = [ + "detection_array_to_rerun", + "detections_overlay", + "selected_bbox_overlay", + "yoloe_overlay", +] diff --git a/dimos/robot/unitree/connection.py b/dimos/robot/unitree/connection.py index 44101cc19d..1625813f9d 100644 --- a/dimos/robot/unitree/connection.py +++ b/dimos/robot/unitree/connection.py @@ -93,12 +93,28 @@ def to_ndarray(self, format=None): # type: ignore[no-untyped-def] class UnitreeWebRTCConnection(Resource): _SPORT_API_ID_RAGEMODE: int = 2059 - def __init__(self, ip: str, mode: str = "ai") -> None: + # How far to raise the head (body pitch, radians) when the robot stops. + # The Go2 Euler command accepts roughly [-0.75, 0.75] rad per axis. + _LOOK_UP_PITCH_RAD: float = 0.6 + + def __init__( + self, + ip: str, + mode: str = "ai", + connection_method: WebRTCConnectionMethod = WebRTCConnectionMethod.LocalSTA, + ) -> None: self.ip = ip self.mode = mode self.stop_timer: threading.Timer | None = None self.cmd_vel_timeout = 0.2 - self.conn = LegionConnection(WebRTCConnectionMethod.LocalSTA, ip=self.ip) + # When True, raise the head once whenever the robot transitions from + # moving to stopped, and level it again when it starts moving. + self.look_up_on_stop = True + self.look_up_pitch = self._LOOK_UP_PITCH_RAD + self._was_moving = False + # LocalSTA: robot joined your router (sends id="STA_localNetwork"). + # LocalAP: you're on the robot's own hotspot at 192.168.12.1 (sends id=""). + self.conn = LegionConnection(connection_method, ip=self.ip) self.connect() def connect(self) -> None: @@ -175,6 +191,7 @@ def move(self, twist: Twist, duration: float = 0.0) -> bool: bool: True if command was sent successfully """ x, y, yaw = twist.linear.x, twist.linear.y, twist.angular.z + moving = bool(x or y or yaw) # WebRTC coordinate mapping: # x - Positive right, negative left @@ -209,12 +226,17 @@ async def async_move_duration() -> None: # Send continuous move commands for the duration future = asyncio.run_coroutine_threadsafe(async_move_duration(), self.loop) future.result() + # The robot stops at the end of a timed move; mark it moving so + # stop_movement() raises the head on the resulting stop. + if moving: + self._was_moving = True # Stop after duration self.stop_movement() else: # Single command for continuous movement future = asyncio.run_coroutine_threadsafe(async_move(), self.loop) future.result() + self._update_head_posture(moving) return True except Exception as e: print(f"Failed to send movement command: {e}") @@ -321,8 +343,33 @@ def set_obstacle_avoidance(self, enabled: bool = True) -> None: ) def free_walk(self) -> bool: - """Activate FreeWalk locomotion mode — enables walking and velocity commands.""" - return bool(self.publish_request(RTC_TOPIC["SPORT_MOD"], {"api_id": SPORT_CMD["FreeWalk"]})) + """Activate FreeWalk locomotion AND enable joystick velocity input. + + Two-step protocol (mirrors enable_rage_mode, which is the only path + in this codebase that demonstrably gets the dog walking from + joystick input): + + 1. ``FreeWalk`` (1045) — switches the locomotion gait. + 2. ``SwitchJoystick`` (1027, data=True) — explicitly enables joystick + velocity interpretation. WITHOUT this, lx/ly are interpreted as + body-pose lean (BalanceStand semantics) and the dog only sways + in place even though the gait is technically FreeWalk. + + Returns True if both publishes succeed. + """ + gait_ok = bool( + self.publish_request(RTC_TOPIC["SPORT_MOD"], {"api_id": SPORT_CMD["FreeWalk"]}) + ) + # Small settle so the gait switch lands before we flip the joystick + # channel — matches the 2 s delay rage_mode uses for the same reason. + time.sleep(0.3) + joystick_ok = bool( + self.publish_request( + RTC_TOPIC["SPORT_MOD"], + {"api_id": SPORT_CMD["SwitchJoystick"], "parameter": {"data": True}}, + ) + ) + return gait_ok and joystick_ok def enable_rage_mode(self) -> bool: """Enable Rage Mode on the Go2 via WebRTC. @@ -347,6 +394,50 @@ def enable_rage_mode(self) -> bool: ) return rage_ok and joystick_ok + def set_body_euler(self, roll: float = 0.0, pitch: float = 0.0, yaw: float = 0.0) -> bool: + """Set the body orientation (radians) via the Euler sport command. + + Requires the robot to be standing (BalanceStand). Each axis is clamped + to the Go2's safe range of roughly [-0.75, 0.75] rad. + """ + + def _clamp(v: float) -> float: + return max(-0.75, min(0.75, v)) + + return bool( + self.publish_request( + RTC_TOPIC["SPORT_MOD"], + { + "api_id": SPORT_CMD["Euler"], + "parameter": {"x": _clamp(roll), "y": _clamp(pitch), "z": _clamp(yaw)}, + }, + ) + ) + + def look_up(self, pitch: float | None = None) -> bool: + """Raise the robot's head by pitching the body nose-up. + + `pitch` is the look-up angle in radians (positive = look up); defaults + to ``look_up_pitch``. On the Go2 a positive body pitch raises the front; + flip the sign here if your unit tilts the other way. + """ + angle = self.look_up_pitch if pitch is None else pitch + return self.set_body_euler(pitch=angle) + + def reset_posture(self) -> bool: + """Level the body again (head neutral).""" + return self.set_body_euler() + + def _update_head_posture(self, moving: bool) -> None: + """Raise the head on the moving->stopped edge, level it on stopped->moving.""" + if not self.look_up_on_stop: + return + if moving and not self._was_moving: + self.reset_posture() + elif not moving and self._was_moving: + self.look_up() + self._was_moving = moving + def liedown(self) -> bool: return bool( self.publish_request(RTC_TOPIC["SPORT_MOD"], {"api_id": SPORT_CMD["StandDown"]}) @@ -424,6 +515,11 @@ def stop_movement(self) -> None: if self.stop_timer: self.stop_timer.cancel() self.stop_timer = None + # Reaching the timeout means commands stopped arriving while moving: + # treat that as a stop and raise the head once. + if self.look_up_on_stop and self._was_moving: + self._was_moving = False + self.look_up() def disconnect(self) -> None: """Disconnect from the robot and clean up resources.""" diff --git a/dimos/robot/unitree/go2/blueprints/agentic/unitree_go2_agentic_ollama.py b/dimos/robot/unitree/go2/blueprints/agentic/unitree_go2_agentic_ollama.py index 20576ffe14..8719244304 100644 --- a/dimos/robot/unitree/go2/blueprints/agentic/unitree_go2_agentic_ollama.py +++ b/dimos/robot/unitree/go2/blueprints/agentic/unitree_go2_agentic_ollama.py @@ -23,7 +23,7 @@ unitree_go2_agentic_ollama = autoconnect( unitree_go2_spatial, McpServer.blueprint(), - McpClient.blueprint(model="ollama:qwen3:8b"), + McpClient.blueprint(model="ollama:qwen3:4b"), _common_agentic, ).requirements( ollama_installed, diff --git a/dimos/robot/unitree/go2/blueprints/smart/unitree_go2_patrol.py b/dimos/robot/unitree/go2/blueprints/smart/unitree_go2_patrol.py new file mode 100644 index 0000000000..3bfd080a17 --- /dev/null +++ b/dimos/robot/unitree/go2/blueprints/smart/unitree_go2_patrol.py @@ -0,0 +1,295 @@ +#!/usr/bin/env python3 +# Copyright 2025-2026 Dimensional Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""Go2 safe-patrol: scan for nearby person, then step back/forward/left/right. + +On start, after the robot stands up and detections stabilize, checks whether +any person (COCO class 0) is detected with at least one LIDAR return within +`safety_radius` of the base link. If clear, walks `step_distance` in each of +four directions using odometry as the stop criterion. + +Run: + python -m dimos.robot.unitree.go2.blueprints.smart.unitree_go2_patrol --ip +""" + +from __future__ import annotations + +import math +import threading +import time +from typing import Any + +import numpy as np +from pydantic import Field + +from dimos.core.coordination.blueprints import autoconnect +from dimos.core.core import rpc +from dimos.core.module import Module, ModuleConfig +from dimos.core.stream import In, Out +from dimos.msgs.geometry_msgs.PoseStamped import PoseStamped +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.msgs.sensor_msgs.PointCloud2 import PointCloud2 +from dimos.msgs.vision_msgs.Detection2DArray import Detection2DArray +from dimos.perception.detection.module3D import Detection3DModule +from dimos.robot.unitree.go2.blueprints.basic.unitree_go2_basic import unitree_go2_basic +from dimos.robot.unitree.go2.connection import GO2Connection +from dimos.utils.logging_config import setup_logger + +logger = setup_logger() + +PERSON_CLASS_ID = 0 # COCO + + +class PatrolConfig(ModuleConfig): + safety_radius: float = Field(default=1.0) # meters + step_distance: float = Field(default=0.5) # meters per move + move_speed: float = Field(default=0.3) # m/s + startup_delay: float = Field(default=4.0) # let standup settle + detection_warmup: float = Field(default=2.0) # collect detections before deciding + move_timeout_factor: float = Field(default=4.0) # safety vs. expected duration + z_floor: float = Field(default=-0.1) # ignore points below (ground) + z_ceiling: float = Field(default=1.8) # ignore points above (ceiling) + publish_hz: float = Field(default=20.0) + + +class PatrolModule(Module): + """Single-shot safety-checked patrol sequence.""" + + config: PatrolConfig + + detections: In[Detection2DArray] + lidar: In[PointCloud2] + odom: In[PoseStamped] + cmd_vel: Out[Twist] + + _latest_detections: Detection2DArray | None = None + _latest_lidar: PointCloud2 | None = None + _latest_pose: PoseStamped | None = None + _patrol_thread: threading.Thread | None = None + _stop_event: threading.Event + + def __init__(self, **kwargs: Any) -> None: + super().__init__(**kwargs) + self._stop_event = threading.Event() + + @rpc + def start(self) -> None: + super().start() + self.register_disposable( + self.detections.observable().subscribe(self._on_detections) + ) + self.register_disposable(self.lidar.observable().subscribe(self._on_lidar)) + self.register_disposable(self.odom.observable().subscribe(self._on_odom)) + + self._patrol_thread = threading.Thread( + target=self._run_patrol, name="go2-patrol", daemon=True + ) + self._patrol_thread.start() + + @rpc + def stop(self) -> None: + self._stop_event.set() + self._send_twist(0.0, 0.0) + if self._patrol_thread and self._patrol_thread.is_alive(): + self._patrol_thread.join(timeout=2.0) + super().stop() + + def _on_detections(self, msg: Detection2DArray) -> None: + self._latest_detections = msg + + def _on_lidar(self, msg: PointCloud2) -> None: + self._latest_lidar = msg + + def _on_odom(self, msg: PoseStamped) -> None: + self._latest_pose = msg + + def _person_detected(self) -> bool: + msg = self._latest_detections + if msg is None or not msg.detections: + return False + for det in msg.detections: + if not det.results: + continue + cls = det.results[0].hypothesis.class_id + # LCM declares class_id as string, but YOLO converter may pass int. + if cls == PERSON_CLASS_ID or str(cls) == str(PERSON_CLASS_ID): + return True + return False + + def _lidar_point_within(self, radius: float) -> bool: + pc = self._latest_lidar + if pc is None or len(pc) == 0: + return False + try: + pts = np.asarray(pc.pointcloud.points, dtype=np.float32) + except Exception as e: + logger.warning("Could not read pointcloud: %s", e) + return False + if pts.size == 0: + return False + z = pts[:, 2] + mask = (z > self.config.z_floor) & (z < self.config.z_ceiling) + xy = pts[mask, :2] + if xy.size == 0: + return False + d2 = xy[:, 0] ** 2 + xy[:, 1] ** 2 + return bool(np.any(d2 < radius * radius)) + + def _person_within_radius(self) -> bool: + return self._person_detected() and self._lidar_point_within( + self.config.safety_radius + ) + + def _send_twist(self, vx: float, vy: float) -> None: + self.cmd_vel.publish( + Twist( + linear=Vector3(vx, vy, 0.0), + angular=Vector3(0.0, 0.0, 0.0), + ) + ) + + def _move_by(self, vx: float, vy: float, distance: float, label: str) -> None: + if self._latest_pose is None: + logger.warning("Patrol: no odometry yet, skipping %s", label) + return + + start_x = self._latest_pose.x + start_y = self._latest_pose.y + nominal_speed = max(abs(vx), abs(vy)) or self.config.move_speed + timeout = (distance / nominal_speed) * self.config.move_timeout_factor + period = 1.0 / max(self.config.publish_hz, 1.0) + + logger.info("Patrol: moving %s (~%.2fm)", label, distance) + t0 = time.time() + while not self._stop_event.is_set(): + if time.time() - t0 > timeout: + logger.warning("Patrol: %s timed out after %.1fs", label, timeout) + break + pose = self._latest_pose + if pose is None: + break + traveled = math.hypot(pose.x - start_x, pose.y - start_y) + if traveled >= distance: + logger.info("Patrol: %s reached %.2fm", label, traveled) + break + # Safety: abort if a person walks into range mid-motion. + if self._person_within_radius(): + logger.warning("Patrol: person entered safety zone, aborting %s", label) + break + self._send_twist(vx, vy) + time.sleep(period) + + self._send_twist(0.0, 0.0) + time.sleep(0.3) # let the robot settle before the next leg + + def _run_patrol(self) -> None: + try: + logger.info("Patrol: waiting %.1fs for standup", self.config.startup_delay) + if self._stop_event.wait(self.config.startup_delay): + return + + logger.info( + "Patrol: warming up detections for %.1fs", self.config.detection_warmup + ) + if self._stop_event.wait(self.config.detection_warmup): + return + + if self._person_within_radius(): + logger.warning( + "Patrol: person within %.1fm — aborting", self.config.safety_radius + ) + return + + logger.info("Patrol: clear, starting 4-direction sequence") + speed = self.config.move_speed + dist = self.config.step_distance + # ROS REP-103: +x forward, +y left + sequence = [ + ("backward", -speed, 0.0), + ("forward", speed, 0.0), + ("left", 0.0, speed), + ("right", 0.0, -speed), + ] + for label, vx, vy in sequence: + if self._stop_event.is_set(): + break + self._move_by(vx, vy, dist, label) + + logger.info("Patrol: complete") + except Exception: + logger.exception("Patrol: unexpected error, stopping robot") + self._send_twist(0.0, 0.0) + + +# Compose: basic (GO2Connection + vis) + Detection3DModule + PatrolModule. +# Detection3DModule's `pointcloud` input is remapped to the raw `lidar` stream +# (basic blueprint has no global_map / VoxelGridMapper). +unitree_go2_patrol = ( + autoconnect( + unitree_go2_basic, + Detection3DModule.blueprint(camera_info=GO2Connection.camera_info_static), + PatrolModule.blueprint(), + ) + .remappings( + [ + (Detection3DModule, "pointcloud", "lidar"), + ] + ) + .global_config(n_workers=4, robot_model="unitree_go2") +) + + +__all__ = ["PatrolModule", "PatrolConfig", "unitree_go2_patrol"] + + +if __name__ == "__main__": + import argparse + import os + import sys + from pathlib import Path + + from dimos.core.coordination.module_coordinator import ModuleCoordinator + + parser = argparse.ArgumentParser(description="Go2 safe-patrol blueprint runner") + parser.add_argument("--ip", required=True, help="Robot IP (or 'mujoco' / 'replay')") + parser.add_argument("--safety-radius", type=float, default=1.0) + parser.add_argument("--step-distance", type=float, default=0.5) + parser.add_argument("--speed", type=float, default=0.3) + args = parser.parse_args() + + # MuJoCo on macOS spawns a subprocess that needs `mjpython` on PATH. + if args.ip in ("mujoco", "replay", "fake", "mock"): + venv_bin = str(Path(sys.executable).parent) + if venv_bin not in os.environ.get("PATH", "").split(os.pathsep): + os.environ["PATH"] = venv_bin + os.pathsep + os.environ.get("PATH", "") + + kwargs: dict[str, Any] = { + "g": {"robot_ip": args.ip}, + "patrolmodule": { + "safety_radius": args.safety_radius, + "step_distance": args.step_distance, + "move_speed": args.speed, + }, + } + + coordinator = ModuleCoordinator.build(unitree_go2_patrol, kwargs) + coordinator.start_rpc_service() + try: + coordinator.loop() + except KeyboardInterrupt: + pass + finally: + coordinator.stop() diff --git a/dimos/robot/unitree/go2/connection.py b/dimos/robot/unitree/go2/connection.py index 5568a473ef..9097af74c0 100644 --- a/dimos/robot/unitree/go2/connection.py +++ b/dimos/robot/unitree/go2/connection.py @@ -211,6 +211,7 @@ class GO2Connection(Module, Camera, Pointcloud): connection: Go2ConnectionProtocol camera_info_static: CameraInfo = _camera_info_static() _camera_info_thread: Thread | None = None + _freewalk_heartbeat_thread: Thread | None = None _latest_video_frame: Image | None = None @classmethod @@ -255,6 +256,39 @@ def onimage(image: Image) -> None: self.standup() time.sleep(3) self.connection.balance_stand() + time.sleep(2) + + # FreeWalk / SwitchJoystick are WebRTC-specific (real Go2 hardware). + # MuJoCo / DimSim / Replay connections don't have a gait state machine + # to switch — joystick input goes straight to the physics ctrl, so + # skipping this entire block in simulation is correct. + if hasattr(self.connection, "free_walk"): + # Switch to FreeWalk locomotion gait so joystick velocity actually + # walks the dog. Without this, BalanceStand interprets cmd_vel as + # body-pose lean (the dog sways in place instead of moving) — + # every teleop/nav blueprint on real hardware needs this. + # + # FreeWalk transitions are flaky: a single publish_request gets + # dropped if the dog is still mid balance_stand animation or if + # MOTION_SWITCHER is still settling. We send it twice with a wait, + # then a low-rate heartbeat in the background — this is the only + # reliable way we've found to recover from the "dog sways in + # place" symptom. + self.connection.free_walk() + time.sleep(1.5) + self.connection.free_walk() + time.sleep(0.5) + + # Background heartbeat: re-assert FreeWalk every 15 s. Cheap (one + # MOTION_SWITCHER publish; no-op when already in FreeWalk) and + # saves the user from restarting the blueprint when the gait + # silently reverts to BalanceStand (happens occasionally on hot + # reconnect). + self._freewalk_heartbeat_thread = Thread( + target=self._freewalk_heartbeat, + daemon=True, + ) + self._freewalk_heartbeat_thread.start() if self.config.mode == Go2Mode.RAGE: self.connection.enable_rage_mode() @@ -310,6 +344,23 @@ def publish_camera_info(self) -> None: self.camera_info.publish(self.camera_info_static) time.sleep(1.0) + def _freewalk_heartbeat(self) -> None: + """Re-assert FreeWalk locomotion every 15 s as a soft watchdog. + + On real hardware we sometimes see the dog silently fall back to + BalanceStand (joystick lx/ly -> body lean, no walking). Re-issuing + FreeWalk is idempotent when already in that mode, so this is a + cheap insurance policy that fixes the most common "dog only sways" + report without forcing the user to restart the blueprint. + """ + while True: + time.sleep(15.0) + try: + self.connection.free_walk() + except Exception: + # Connection may be torn down during shutdown; ignore. + pass + @rpc def move(self, twist: Twist, duration: float = 0.0) -> bool: """Send movement command to robot.""" diff --git a/dimos/robot/unitree/keyboard_teleop.py b/dimos/robot/unitree/keyboard_teleop.py index 07af844c60..6261cfb66f 100644 --- a/dimos/robot/unitree/keyboard_teleop.py +++ b/dimos/robot/unitree/keyboard_teleop.py @@ -29,8 +29,14 @@ logger = setup_logger() -# Force X11 driver to avoid OpenGL threading issues -os.environ["SDL_VIDEODRIVER"] = "x11" +# Force X11 driver on Linux to avoid OpenGL threading issues. macOS doesn't +# have X11 by default; use the native Cocoa driver instead. +import sys as _sys + +if _sys.platform == "darwin": + os.environ.setdefault("SDL_VIDEODRIVER", "cocoa") +else: + os.environ["SDL_VIDEODRIVER"] = "x11" DEFAULT_LINEAR_SPEED: float = 0.5 # m/s DEFAULT_ANGULAR_SPEED: float = 0.8 # rad/s diff --git a/dimos/visualization/rerun/test_websocket_server.py b/dimos/visualization/rerun/test_websocket_server.py index e62d3536cd..1c8bc10a15 100644 --- a/dimos/visualization/rerun/test_websocket_server.py +++ b/dimos/visualization/rerun/test_websocket_server.py @@ -142,6 +142,37 @@ def test_click_publishes_point_stamped( assert point.ts == pytest.approx(5.0) +def test_click_with_null_z_defaults_to_zero( + server: RerunWebSocketServer, publisher: MockViewerPublisher +) -> None: + """Null depth in a click payload should not break bbox selection.""" + received: list[Any] = [] + done = threading.Event() + + unsub = server.clicked_point.subscribe(lambda point: (received.append(point), done.set())) + + publisher._send( + { + "type": "click", + "x": 1.5, + "y": 2.5, + "z": None, + "entity_path": "/bbox/person", + "timestamp_ms": 5000, + } + ) + publisher.flush() + done.wait(timeout=2.0) + unsub() + + assert len(received) == 1 + point = received[0] + assert point.x == pytest.approx(1.5) + assert point.y == pytest.approx(2.5) + assert point.z == pytest.approx(0.0) + assert point.frame_id == "/bbox/person" + + def test_twist_publishes_on_tele_cmd_vel( server: RerunWebSocketServer, publisher: MockViewerPublisher ) -> None: diff --git a/dimos/visualization/rerun/websocket_server.py b/dimos/visualization/rerun/websocket_server.py index 3b29666fb0..01df186f41 100644 --- a/dimos/visualization/rerun/websocket_server.py +++ b/dimos/visualization/rerun/websocket_server.py @@ -65,6 +65,16 @@ class HeartbeatMsg(TypedDict): ViewerMsg = Union[ClickMsg, TwistMsg, StopMsg, HeartbeatMsg] +def _coerce_float(value: Any, default: float = 0.0, *, field_name: str | None = None) -> float: + if value is None: + if field_name is not None: + logger.debug( + f"RerunWebSocketServer: field '{field_name}' is null; falling back to {default}" + ) + return default + return float(value) + + def _handshake_noise_filter(record: logging.LogRecord) -> bool: """Drop noisy "opening handshake failed" records from port scanners etc.""" msg = record.getMessage() @@ -152,28 +162,39 @@ def _dispatch(self, raw: str | bytes) -> None: msg_type = msg.get("type") if msg_type == "click": + logger.info( + "RerunWebSocketServer: received click " + f"entity_path={msg.get('entity_path', '')!r} " + f"x={msg.get('x')!r} y={msg.get('y')!r} z={msg.get('z')!r} " + f"timestamp_ms={msg.get('timestamp_ms')!r}" + ) + point = PointStamped( + x=_coerce_float(msg.get("x"), field_name="x"), + y=_coerce_float(msg.get("y"), field_name="y"), + z=_coerce_float(msg.get("z"), field_name="z"), + ts=_coerce_float(msg.get("timestamp_ms"), field_name="timestamp_ms") / 1000.0, + frame_id=str(msg.get("entity_path", "")), + ) + logger.debug( + "RerunWebSocketServer: publishing clicked point " + f"frame_id={point.frame_id!r} x={point.x} y={point.y} z={point.z} ts={point.ts}" + ) self.clicked_point.publish( - PointStamped( - x=float(msg.get("x", 0)), - y=float(msg.get("y", 0)), - z=float(msg.get("z", 0)), - ts=float(msg.get("timestamp_ms", 0)) / 1000.0, - frame_id=str(msg.get("entity_path", "")), - ) + point ) elif msg_type == "twist": self.tele_cmd_vel.publish( Twist( linear=Vector3( - float(msg.get("linear_x", 0)), - float(msg.get("linear_y", 0)), - float(msg.get("linear_z", 0)), + _coerce_float(msg.get("linear_x")), + _coerce_float(msg.get("linear_y")), + _coerce_float(msg.get("linear_z")), ), angular=Vector3( - float(msg.get("angular_x", 0)), - float(msg.get("angular_y", 0)), - float(msg.get("angular_z", 0)), + _coerce_float(msg.get("angular_x")), + _coerce_float(msg.get("angular_y")), + _coerce_float(msg.get("angular_z")), ), ) ) diff --git a/dimos/web/templates/rerun_dashboard.html b/dimos/web/templates/rerun_dashboard.html index f0792079e3..2cea9fd6be 100644 --- a/dimos/web/templates/rerun_dashboard.html +++ b/dimos/web/templates/rerun_dashboard.html @@ -31,7 +31,7 @@
- +
+ + diff --git a/examples/go2_phone_control/people.html b/examples/go2_phone_control/people.html new file mode 100644 index 0000000000..d75d0eb5e2 --- /dev/null +++ b/examples/go2_phone_control/people.html @@ -0,0 +1,312 @@ + + + + + + Go2 BOY · 活点地图 + + + +
+
+ 🗺️ 活点地图 · Marauder's Map + ◀ REMOTE +
+ + + + + +
+ + +
+ + +
+
+

① ALL0

+
+
+
+

② VIEW · 7s0

+
+
+
+ + +
+ + + + diff --git a/examples/go2_phone_control/server.py b/examples/go2_phone_control/server.py new file mode 100644 index 0000000000..cd64d44749 --- /dev/null +++ b/examples/go2_phone_control/server.py @@ -0,0 +1,1091 @@ +#!/usr/bin/env python3 +# Copyright 2025-2026 Dimensional Inc. +# SPDX-License-Identifier: Apache-2.0 + +"""Go2 phone-controller server (Scheme B: phone = remote, Mac = brain). + +The Mac holds a single persistent WebRTC connection to the dog. A mobile-friendly +web page (served at /) sends button taps to this server over HTTP; the server +translates each tap into a robot command and returns the result. The phone needs +nothing but a browser on the same Wi-Fi. + +v1 actions: + 1. Stand up + 2. Sit down + 3. March in place (N reps) + 4. D-pad: forward / back / left / right — one tap = one step + 5. Wave hello + 6. Observe people nearby (YOLO on one camera frame) + +Run (on the Mac, while on the dog's Wi-Fi): + .venv/bin/python examples/go2_phone_control/server.py --ip 192.168.12.1 + # then open http://:8800/ on your phone + +Options: + --port 8800 + --step-distance 0.3 # meters per D-pad tap + --step-speed 0.3 # m/s for D-pad / march pulses + --close-ratio 0.5 # person bbox-height/frame-height to count as "near" +""" + +from __future__ import annotations + +import argparse +import base64 +import math +from pathlib import Path +import random +import threading +import time +from typing import Any + +import uvicorn +from fastapi import FastAPI +from fastapi.responses import FileResponse, JSONResponse +from pydantic import BaseModel + +from unitree_webrtc_connect.constants import WebRTCConnectionMethod + +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.robot.unitree.connection import RTC_TOPIC, SPORT_CMD, UnitreeWebRTCConnection +from dimos.utils.data import get_data + +# Real people tracker imports are lazy (inside __init__) — they pull heavy deps +# (torch, ultralytics, torchreid). MockPeopleTracker stays import-free so --mock +# starts instantly. + +SPORT_TOPIC = RTC_TOPIC["SPORT_MOD"] +HERE = Path(__file__).parent + + +# --- Three-level identity model (MOCK for now) ----------------------------- +# track_id : short-term, from tracker; changes when a person re-appears. +# long_term_id : stable unique person ID (real version = YOLO-E + OSNet Re-ID). +# name : human-readable label (Harry Potter chars) mapped to long_term_id. +# +# MockPeopleTracker fakes the perception+reid pipeline so the two lists and the +# UI can be built/validated WITHOUT the camera, models, or dog. To go real, +# replace _loop() with a stream of detections (YOLO-E) fed through +# EmbeddingIDSystem; keep get_state()'s output shape identical. +HP_NAMES = [ + "Harry", "Hermione", "Ron", "Dumbledore", "Hagrid", "Snape", "Draco", + "Luna", "Neville", "McGonagall", "Sirius", "Dobby", "Ginny", "Fred", +] + + +class MockPeopleTracker: + def __init__(self, tick: float = 1.5, recent_window: float = 7.0) -> None: + self._lock = threading.Lock() + self.tick = tick + self.recent_window = recent_window + self._people: dict[int, dict[str, Any]] = {} # long_term_id -> record + self._next_long_id = 1 + self._next_track_id = 100 + self._name_pool = list(HP_NAMES) + random.shuffle(self._name_pool) + self._stop = threading.Event() + threading.Thread(target=self._loop, daemon=True).start() + print("[people] MOCK people tracker running (fake re-id + HP names)") + + @staticmethod + def _mock_thumbnail(name: str, lid: int) -> str: + """Generate a placeholder 'photo' the dog supposedly captured. + + Real version: replace with base64 JPEG of the YOLO bbox crop from the + latest frame. Same `data:` URL form, so the UI doesn't change. + """ + hue = (lid * 47) % 360 + initial = (name[:1] or "?").upper() + svg = ( + f'' + f'' + f'' + f'{initial}' + f'' + ) + return "data:image/svg+xml;base64," + base64.b64encode(svg.encode()).decode() + + # World: 6m x 6m centered on dog. Dog faces +x. Front cone: +x with |angle|<45°. + _WORLD_HALF = 3.0 + _CONE_HALF_ANGLE = math.radians(45) + _SIGHT_RANGE = 4.0 + + def _new_person(self) -> int: + lid = self._next_long_id + self._next_long_id += 1 + name = self._name_pool.pop(0) if self._name_pool else f"Wizard{lid}" + # Spawn at a random spot in the world (some in cone, some not). + x = random.uniform(-self._WORLD_HALF, self._WORLD_HALF) + y = random.uniform(-self._WORLD_HALF, self._WORLD_HALF) + self._people[lid] = { + "long_term_id": lid, + "name": name, + "count": 0, + "first_seen": time.time(), + "last_seen": 0.0, + "visible": False, + "track_id": None, + "thumbnail": self._mock_thumbnail(name, lid), + "pos_x": x, + "pos_y": y, + } + return lid + + def _in_cone(self, x: float, y: float) -> bool: + """In the dog's front sight cone (forward = +x).""" + if x <= 0: + return False + if math.hypot(x, y) > self._SIGHT_RANGE: + return False + return abs(math.atan2(y, x)) <= self._CONE_HALF_ANGLE + + def _enter(self, lid: int) -> None: + """A person becomes visible: new short-term track_id, appearance count++.""" + p = self._people[lid] + p["visible"] = True + p["track_id"] = self._next_track_id + self._next_track_id += 1 + p["count"] += 1 + p["last_seen"] = time.time() + + def _loop(self) -> None: + # Seed a few wizards; their visibility falls out of position vs. cone. + for _ in range(3): + self._new_person() + while not self._stop.is_set(): + time.sleep(self.tick) + with self._lock: + now = time.time() + # Random walk: each person drifts a bit each tick. + for p in self._people.values(): + p["pos_x"] = max(-self._WORLD_HALF, min(self._WORLD_HALF, + p["pos_x"] + random.uniform(-0.35, 0.35))) + p["pos_y"] = max(-self._WORLD_HALF, min(self._WORLD_HALF, + p["pos_y"] + random.uniform(-0.35, 0.35))) + # Visibility is purely geometric: in front cone & in range. + for p in self._people.values(): + in_cone = self._in_cone(p["pos_x"], p["pos_y"]) + was_visible = p["visible"] + p["visible"] = in_cone + if in_cone: + # Re-entry into the cone bumps the count. + if not was_visible: + p["count"] += 1 + p["track_id"] = self._next_track_id + self._next_track_id += 1 + p["last_seen"] = now + # Occasionally spawn a brand-new wizard somewhere. + if random.random() < 0.18 and len(self._people) < 8: + self._new_person() + + def get_state(self) -> dict[str, Any]: + with self._lock: + now = time.time() + all_people = sorted( + ( + { + "long_term_id": p["long_term_id"], + "name": p["name"], + "count": p["count"], + "thumbnail": p["thumbnail"], + "pos_x": round(p["pos_x"], 2), + "pos_y": round(p["pos_y"], 2), + "visible": p["visible"], + "last_seen_ago": round(now - p["last_seen"], 1) if p["last_seen"] else None, + } + for p in self._people.values() + ), + key=lambda x: (-x["count"], x["long_term_id"]), + ) + in_view = [] + for p in self._people.values(): + if p["last_seen"] > 0 and (now - p["last_seen"]) <= self.recent_window: + in_view.append( + { + "track_id": p["track_id"], + "long_term_id": p["long_term_id"], + "name": p["name"], + "secs_ago": round(now - p["last_seen"], 1), + "visible": p["visible"], + "thumbnail": p["thumbnail"], + "pos_x": round(p["pos_x"], 2), + "pos_y": round(p["pos_y"], 2), + } + ) + in_view.sort(key=lambda x: x["secs_ago"]) + return {"all": all_people, "in_view": in_view, "total": len(self._people)} + + +class RealPeopleTracker: + """Real pipeline: YOLO person tracking + OSNet Re-ID for stable long-term IDs. + + Subscribes to the dog's WebRTC video stream, runs detection at ~3 Hz on a + dedicated worker thread (rx thread only stashes the latest frame; older + frames are dropped to avoid blocking the camera stream), assigns long-term + unique IDs via dimos's EmbeddingIDSystem (OSNet body embeddings), maps each + long-term ID to a Harry Potter name, and stores a JPEG thumbnail of the + last bbox crop. + + Output of `get_state()` is identical in shape to MockPeopleTracker so the + UI doesn't change. + """ + + def __init__( + self, + conn: UnitreeWebRTCConnection, + osnet_variant: str = "osnet_x0_5", + recent_window: float = 7.0, + process_hz: float = 3.0, + ) -> None: + # Heavy imports kept local so mock mode stays fast. + import cv2 # noqa: F401 -- used for JPEG encode + from dimos.models.embedding.treid import TorchReIDModel, TorchReIDModelConfig + from dimos.perception.detection.detectors.person.yolo import YoloPersonDetector + from dimos.perception.detection.reid.embedding_id_system import EmbeddingIDSystem + from dimos.robot.unitree.go2.connection import _camera_info_static + + self._cv2 = cv2 + # Camera intrinsics for pinhole pos estimation (shoulder-width assumption). + ci = _camera_info_static() + self._fx = float(ci.K[0]) + self._cx = float(ci.K[2]) + self._person_real_width = 0.45 # meters (shoulder), matches VisualServoing2D + self._lock = threading.Lock() + self.recent_window = recent_window + self._tick = 1.0 / max(0.5, process_hz) + + print("[people] loading YOLO person detector ...") + self._detector = YoloPersonDetector() + print(f"[people] loading OSNet ({osnet_variant}) ...") + self._embed = TorchReIDModel(TorchReIDModelConfig(model_name=osnet_variant)) + self._idsys = EmbeddingIDSystem(model=lambda: self._embed) + + self._people: dict[int, dict[str, Any]] = {} + self._name_pool = list(HP_NAMES) + random.shuffle(self._name_pool) + + # Latest-frame buffer (older frames dropped if worker is busy). + self._latest: Any = None + self._latest_ts: float = 0.0 + self._stop = threading.Event() + + # Subscribe to camera; rx thread only stashes (cheap). + conn.video_stream().subscribe(self._on_frame) + threading.Thread(target=self._worker, daemon=True).start() + print(f"[people] REAL tracker running (target {process_hz:.1f} Hz)") + + def _on_frame(self, image: Any) -> None: + # Replace prior frame; producer is faster than consumer by design. + with self._lock: + self._latest = image + self._latest_ts = time.time() + + def _assign_name(self, lid: int) -> str: + if self._name_pool: + return self._name_pool.pop(0) + return f"Wizard{lid}" + + def _make_thumb(self, detection: Any) -> str: + """JPEG-encode the bbox crop → base64 data URL.""" + try: + img = detection.cropped_image(padding=8) + bgr = img.to_opencv() + # Cap size to keep payload small in /api/people. + h, w = bgr.shape[:2] + if max(h, w) > 64: + scale = 64.0 / max(h, w) + bgr = self._cv2.resize(bgr, (int(w * scale), int(h * scale))) + ok, buf = self._cv2.imencode( + ".jpg", bgr, [int(self._cv2.IMWRITE_JPEG_QUALITY), 65] + ) + if not ok: + return "" + return "data:image/jpeg;base64," + base64.b64encode(buf.tobytes()).decode() + except Exception: + return "" + + def _worker(self) -> None: + while not self._stop.is_set(): + with self._lock: + image = self._latest + self._latest = None + if image is None: + time.sleep(0.05) + continue + try: + self._process(image) + except Exception as e: # noqa: BLE001 + print(f"[people] frame skipped: {e}") + time.sleep(self._tick) + + def _process(self, image: Any) -> None: + detections = self._detector.process_image(image) + seen_lids: set[int] = set() + now = time.time() + + for det in getattr(detections, "detections", []) or []: + # YoloPersonDetector returns persons only; track_id present after + # first BoT-SORT update (may be None on the very first detection). + track_id = getattr(det, "track_id", None) + if track_id is None: + continue + + lid = self._idsys.register_detection(det) + if lid is None or lid < 0: + # Still warming up embeddings for this track (need ≥10). + continue + + with self._lock: + rec = self._people.get(lid) + if rec is None: + rec = { + "long_term_id": lid, + "name": self._assign_name(lid), + "count": 0, + "first_seen": now, + "last_seen": 0.0, + "visible": False, + "track_id": None, + "thumbnail": "", + "bbox": None, # latest (x1,y1,x2,y2) for follow + "image_width": 0, # latest frame width + "pos_x": 0.0, # dog-frame meters, +x forward + "pos_y": 0.0, # dog-frame meters, +y left + } + self._people[lid] = rec + # Re-entry: visible=False → True bumps the appearance count. + if not rec["visible"]: + rec["count"] += 1 + rec["visible"] = True + rec["track_id"] = int(track_id) + rec["last_seen"] = now + # Store the latest bbox so FollowController can servo on it. + try: + x1, y1, x2, y2 = det.bbox + rec["bbox"] = (float(x1), float(y1), float(x2), float(y2)) + rec["image_width"] = int(image.width) + # Pinhole pose estimate (dog frame: +x forward, +y left). + cx_pix = (x1 + x2) / 2.0 + bbox_w = max(1.0, x2 - x1) + distance = (self._person_real_width * self._fx) / bbox_w + x_norm = (cx_pix - self._cx) / self._fx + angle = math.atan(x_norm) # right of optical center = positive + rec["pos_x"] = round(distance * math.cos(angle), 2) + rec["pos_y"] = round(-distance * math.sin(angle), 2) + except Exception: + pass + # Refresh thumbnail (latest crop is usually the cleanest). + thumb = self._make_thumb(det) + if thumb: + rec["thumbnail"] = thumb + seen_lids.add(lid) + + # Mark people not in this frame as no longer visible. + with self._lock: + for lid, rec in self._people.items(): + if lid not in seen_lids and rec["visible"]: + rec["visible"] = False + + def get_state(self) -> dict[str, Any]: + with self._lock: + now = time.time() + all_people = sorted( + ( + { + "long_term_id": p["long_term_id"], + "name": p["name"], + "count": p["count"], + "thumbnail": p["thumbnail"], + "pos_x": p["pos_x"], + "pos_y": p["pos_y"], + "visible": p["visible"], + "last_seen_ago": round(now - p["last_seen"], 1) if p["last_seen"] else None, + } + for p in self._people.values() + ), + key=lambda x: (-x["count"], x["long_term_id"]), + ) + in_view = [] + for p in self._people.values(): + if p["last_seen"] > 0 and (now - p["last_seen"]) <= self.recent_window: + in_view.append( + { + "track_id": p["track_id"], + "long_term_id": p["long_term_id"], + "name": p["name"], + "secs_ago": round(now - p["last_seen"], 1), + "visible": p["visible"], + "thumbnail": p["thumbnail"], + "pos_x": p["pos_x"], + "pos_y": p["pos_y"], + } + ) + in_view.sort(key=lambda x: x["secs_ago"]) + return {"all": all_people, "in_view": in_view, "total": len(self._people)} + + def get_bbox(self, lid: int) -> tuple[tuple[float, float, float, float], int] | None: + """Latest bbox for a long_term_id IF that person was visible recently. + + Returns (bbox xyxy, image_width) or None. + """ + with self._lock: + p = self._people.get(lid) + if p is None or not p["visible"] or p["bbox"] is None: + return None + return (p["bbox"], p["image_width"]) + + +# MockPeopleTracker doesn't have real bboxes; FollowController will see +# get_bbox()=None in mock and just no-op. +def _mock_get_bbox(self, lid: int) -> None: # type: ignore[no-untyped-def] + return None +MockPeopleTracker.get_bbox = _mock_get_bbox # type: ignore[assignment] + + +people: Any = None # set in main() based on --mock and --no-people + + +# --- "Follow this person" selection ---------------------------------------- +# The UI lets the user pick a long_term_id from the people list as the follow +# target. For now this state is read-only on the dog side; a future control +# loop will read get_target() and steer the dog toward that person's bbox. +_tracked_lid: int | None = None +_following: bool = False +_tracked_lock = threading.Lock() + + +def select_target(lid: int | None) -> None: + global _tracked_lid, _following + with _tracked_lock: + _tracked_lid = lid + if lid is None: + _following = False # clearing the target also stops follow + + +def get_target() -> int | None: + with _tracked_lock: + return _tracked_lid + + +def set_following(b: bool) -> None: + global _following + with _tracked_lock: + _following = bool(b) + + +def get_following() -> bool: + with _tracked_lock: + return _following + + +class FollowController: + """Visual-servo follow loop. Reads selected long_term_id + tracker's latest + bbox, drives the dog at ~10 Hz via VisualServoing2D. Safe by default: only + sends Twist commands when `following=True` AND target has a fresh bbox. + """ + + def __init__(self, brain: "RobotBrain", tracker: Any) -> None: + from dimos.msgs.geometry_msgs.Twist import Twist + from dimos.msgs.geometry_msgs.Vector3 import Vector3 + from dimos.navigation.visual_servoing.visual_servoing_2d import VisualServoing2D + from dimos.robot.unitree.go2.connection import _camera_info_static + + self._brain = brain + self._tracker = tracker + self._Twist = Twist + self._Vector3 = Vector3 + # Tuned defaults inside VisualServoing2D: max 0.5 m/s, 0.8 rad/s, + # target distance 1.5m, min distance 0.8m. See visual_servoing_2d.py. + self._vs = VisualServoing2D(_camera_info_static()) + self._lost_max = 20 # ~2s at 10Hz before sending stop + self._period = 0.1 + self._stop = threading.Event() + threading.Thread(target=self._loop, daemon=True).start() + print("[follow] controller running (10 Hz, idle until FOLLOW)") + + def shutdown(self) -> None: + self._stop.set() + + def _zero(self) -> Any: + return self._Twist( + linear=self._Vector3(0.0, 0.0, 0.0), + angular=self._Vector3(0.0, 0.0, 0.0), + ) + + def _loop(self) -> None: + lost = 0 + sent_stop = False # avoid spamming zero twists every tick + while not self._stop.is_set(): + try: + if get_following() and get_target() is not None: + bbox_info = self._tracker.get_bbox(get_target()) + if bbox_info is None: + lost += 1 + if lost >= self._lost_max and not sent_stop: + print("[follow] target lost — sending zero twist") + self._brain.conn.move(self._zero()) + sent_stop = True + else: + bbox, w = bbox_info + twist = self._vs.compute_twist(bbox, w) + self._brain.conn.move(twist) + lost = 0 + sent_stop = False + else: + # Not following: ensure dog is stopped once after each + # follow→idle transition, then go quiet. + if not sent_stop: + try: + self._brain.conn.move(self._zero()) + except Exception: + pass + sent_stop = True + lost = 0 + except Exception as e: # noqa: BLE001 + print(f"[follow] tick error: {e}") + time.sleep(self._period) + + +follow_controller: FollowController | None = None # set in main() if real dog + + +# --- E-STOP: panic stop the dog regardless of state ------------------------ +def panic_stop(brain: Any) -> dict[str, Any]: + """Hard stop: + 1. Cancel any follow / target selection. + 2. Spam zero-twist (3x) — joystick auto-timeout would also do this but + we want to be loud. + 3. Send liedown so the dog actively lowers itself. + Safe to call from any thread, any time. + """ + select_target(None) + set_following(False) + out = {"following": False, "tracked_lid": None, "moves_sent": 0, "liedown": False} + if brain is None or not hasattr(brain, "conn"): + out["mock"] = True + return out + from dimos.msgs.geometry_msgs.Twist import Twist + from dimos.msgs.geometry_msgs.Vector3 import Vector3 + zero = Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0)) + for _ in range(3): + try: + brain.conn.move(zero) + out["moves_sent"] += 1 + except Exception as e: # noqa: BLE001 + print(f"[halt] zero-twist failed: {e}") + time.sleep(0.05) + try: + brain.conn.liedown() + out["liedown"] = True + except Exception as e: # noqa: BLE001 + print(f"[halt] liedown failed: {e}") + print(f"[halt] PANIC STOP executed: {out}") + return out + + +class MockBrain: + """Stand-in for RobotBrain — no dog, no WebRTC. For previewing the UI. + + Same method surface as RobotBrain so the API routes work unchanged. + """ + + def __init__(self, step_distance: float = 0.3, **_: Any): + self.step_distance = step_distance + print("[brain] MOCK mode — no robot connection") + + def stand_up(self) -> str: + time.sleep(0.3) + return "standing (mock)" + + def sit_down(self) -> str: + time.sleep(0.3) + return "sitting (mock)" + + def lie_down(self) -> str: + time.sleep(0.3) + return "lying down (mock)" + + def wave(self) -> str: + time.sleep(0.3) + return "waved (mock)" + + def step(self, key: str, fast: bool = False) -> str: + valid = {"W", "S", "Q", "E", "A", "D", + "forward", "back", "left", "right"} + k = key.upper() if len(key) == 1 else key + if k not in valid: + raise ValueError(f"unknown step key {key!r}") + time.sleep(0.15) + return f"stepped {k}{' (fast)' if fast else ''} (mock)" + + def stop_move(self) -> str: + return "stopped (mock)" + + def march(self, reps: int) -> str: + time.sleep(0.5) + return f"marched {reps} reps in place (mock)" + + def observe(self) -> dict[str, Any]: + time.sleep(0.4) + # Fake two detections: one near, one far. + return { + "count": 2, + "near_count": 1, + "people": [ + {"h_ratio": 0.71, "conf": 0.92, "near": True}, + {"h_ratio": 0.19, "conf": 0.63, "near": False}, + ], + } + + +class RobotBrain: + """Owns the single WebRTC connection and serializes commands to the dog.""" + + def __init__( + self, + ip: str, + step_distance: float, + step_speed: float, + close_ratio: float, + mode: str = "normal", + ap: bool = False, + angular_speed: float = 0.5, # rad/s for A/D turning (~28°/tap at 1s) + ): + self.ip = ip + self.step_distance = step_distance + self.step_speed = step_speed + self.angular_speed = angular_speed + self.close_ratio = close_ratio + self._lock = threading.Lock() # one command to the dog at a time + self._yolo: Any = None + method = WebRTCConnectionMethod.LocalAP if ap else WebRTCConnectionMethod.LocalSTA + print(f"[brain] connecting to {ip} (motion mode: {mode}, link: {method.name}) ...") + # mode="normal" enables the classic sport commands (StandUp/BalanceStand/ + # joystick move). The library default "ai" mode ignores them. + # ap=True (LocalAP) is for when you're on the dog's own hotspot (192.168.12.1). + self.conn = UnitreeWebRTCConnection(ip=ip, mode=mode, connection_method=method) + time.sleep(2) # let the MOTION_SWITCHER take effect before commanding + # Bring the dog to a known, command-ready state. + print("[brain] standup ...") + self.conn.standup() + time.sleep(3) + print("[brain] balance_stand ...") + self.conn.balance_stand() + time.sleep(1) + # FreeWalk = locomotion gait; without it, joystick velocity often only + # makes the dog lean in place instead of actually walking. + print("[brain] free_walk (enable locomotion) ...") + self.conn.free_walk() + time.sleep(1) + print("[brain] connected and ready (standing, walk-enabled)") + + # --- posture --------------------------------------------------------- + def stand_up(self) -> str: + with self._lock: + self.conn.standup() + time.sleep(2) + self.conn.balance_stand() + print("[cmd] stand_up") + return "standing" + + def sit_down(self) -> str: + with self._lock: + self.conn.publish_request(SPORT_TOPIC, {"api_id": SPORT_CMD["Sit"]}) + return "sitting" + + def lie_down(self) -> str: + with self._lock: + self.conn.liedown() # StandDown — dog lowers all the way to the ground (prone) + print("[cmd] lie_down") + return "lying down" + + def wave(self) -> str: + with self._lock: + self.conn.publish_request(SPORT_TOPIC, {"api_id": SPORT_CMD["Hello"]}) + time.sleep(2) + self.conn.balance_stand() + print("[cmd] wave") + return "waved" + + def _ensure_walk(self) -> None: + """Re-assert FreeWalk locomotion gait so joystick velocity actually walks.""" + self.conn.free_walk() + time.sleep(0.15) + + # --- locomotion ------------------------------------------------------ + def _pulse(self, x: float, y: float, wz: float, duration: float) -> None: + twist = Twist(linear=Vector3(x, y, 0.0), angular=Vector3(0.0, 0.0, wz)) + self.conn.move(twist, duration=duration) + self.conn.move(Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0))) + + # Official mapping (mirrors dimos/robot/unitree/keyboard_teleop.py): + # W/S = forward/back (linear.x) + # Q/E = strafe L/R (linear.y) + # A/D = turn L/R (angular.z) + # Old direction names kept as aliases for backward compatibility. + _KEY_TO_VEC: dict[str, tuple[float, float, float]] = {} # filled in __init__ + + def _build_key_map(self) -> None: + s = self.step_speed + w = self.angular_speed + self._KEY_TO_VEC = { + "W": (s, 0.0, 0.0), + "S": (-s, 0.0, 0.0), + "Q": (0.0, s, 0.0), + "E": (0.0, -s, 0.0), + "A": (0.0, 0.0, w), + "D": (0.0, 0.0, -w), + # legacy aliases (older UI): + "forward": (s, 0.0, 0.0), + "back": (-s, 0.0, 0.0), + "left": (0.0, s, 0.0), + "right": (0.0, -s, 0.0), + } + + def step(self, key: str, fast: bool = False) -> str: + if not self._KEY_TO_VEC: + self._build_key_map() + k = key.upper() if len(key) == 1 else key + vec = self._KEY_TO_VEC.get(k) + if vec is None: + raise ValueError(f"unknown step key {key!r}; expected W/S/Q/E/A/D") + vx, vy, wz = vec + mult = 2.0 if fast else 1.0 + vx *= mult; vy *= mult; wz *= mult + dur = self.step_distance / max(self.step_speed, 0.05) + with self._lock: + self._ensure_walk() + self._pulse(vx, vy, wz, dur) + tag = "⚡" if fast else "" + print(f"[cmd] step {tag}{k} vx={vx:.2f} vy={vy:.2f} wz={wz:.2f}") + return f"stepped {k}{' (fast)' if fast else ''}" + + def stop_move(self) -> str: + """Single zero-twist — dog stops moving but stays standing (no liedown). + This is the equivalent of releasing the WASD keys / the official STOP.""" + with self._lock: + self._pulse(0.0, 0.0, 0.0, 0.0) + print("[cmd] stop_move (zero twist)") + return "stopped" + + def march(self, reps: int) -> str: + """Approximate marching in place: short forward+back pulse pairs (net ~0).""" + pulse_dur = 0.18 + with self._lock: + self._ensure_walk() + for _ in range(reps): + self._pulse(self.step_speed, 0.0, 0.0, pulse_dur) + self._pulse(-self.step_speed, 0.0, 0.0, pulse_dur) + self.conn.balance_stand() + print(f"[cmd] march {reps} reps") + return f"marched {reps} reps in place" + + # --- perception ------------------------------------------------------ + def _ensure_yolo(self) -> Any: + if self._yolo is None: + from ultralytics import YOLO # type: ignore[attr-defined] + + print("[brain] loading YOLO11n-pose ...") + self._yolo = YOLO(get_data("models_yolo") / "yolo11n-pose.pt") + return self._yolo + + def _grab_frame(self, timeout: float = 10.0): # type: ignore[no-untyped-def] + holder: list = [] + got = threading.Event() + + def on_next(frame) -> None: # type: ignore[no-untyped-def] + if not holder: + holder.append(frame) + got.set() + + sub = self.conn.video_stream().subscribe(on_next=on_next) + try: + if not got.wait(timeout=timeout): + raise TimeoutError(f"no camera frame within {timeout:.1f}s") + finally: + sub.dispose() + return holder[0] + + def observe(self) -> dict[str, Any]: + model = self._ensure_yolo() + with self._lock: + frame = self._grab_frame() + img = frame.to_opencv() + frame_h = img.shape[0] + results = model.predict(img, conf=0.5, verbose=False) + people = [] + if results and results[0].boxes is not None: + r = results[0] + for box, cls_id, c in zip(r.boxes.xyxy, r.boxes.cls, r.boxes.conf, strict=False): + if int(cls_id.item()) != 0: # COCO class 0 = person + continue + x1, y1, x2, y2 = (float(v) for v in box.tolist()) + h_ratio = (y2 - y1) / frame_h + people.append( + { + "h_ratio": round(h_ratio, 3), + "conf": round(float(c.item()), 3), + "near": h_ratio >= self.close_ratio, + } + ) + near = sum(1 for p in people if p["near"]) + return {"count": len(people), "near_count": near, "people": people} + + +# --------------------------------------------------------------------------- +brain: RobotBrain | None = None +app = FastAPI(title="Go2 Phone Control") + + +class MarchReq(BaseModel): + reps: int = 10 + + +class StepReq(BaseModel): + key: str | None = None # new: W/S/Q/E/A/D + direction: str | None = None # legacy: forward/back/left/right + fast: bool = False + + +@app.get("/") +def index() -> FileResponse: + return FileResponse(HERE / "index.html") + + +@app.get("/people") +def people_page() -> FileResponse: + return FileResponse(HERE / "people.html") + + +@app.get("/api/people") +def api_people() -> JSONResponse: + if people is None: + return JSONResponse( + { + "all": [], "in_view": [], "total": 0, "disabled": True, + "tracked_lid": None, "following": False, + } + ) + state = people.get_state() + state["tracked_lid"] = get_target() + state["following"] = get_following() + return JSONResponse(state) + + +class TrackSelectReq(BaseModel): + long_term_id: int + + +@app.post("/api/track/select") +def api_track_select(req: TrackSelectReq) -> JSONResponse: + select_target(req.long_term_id) + return JSONResponse({"ok": True, "tracked_lid": req.long_term_id, "following": False}) + + +@app.post("/api/track/clear") +def api_track_clear() -> JSONResponse: + select_target(None) + return JSONResponse({"ok": True, "tracked_lid": None, "following": False}) + + +@app.post("/api/follow/start") +def api_follow_start() -> JSONResponse: + if get_target() is None: + return JSONResponse( + {"ok": False, "error": "no target selected"}, status_code=400 + ) + set_following(True) + return JSONResponse({"ok": True, "following": True, "tracked_lid": get_target()}) + + +@app.post("/api/follow/stop") +def api_follow_stop() -> JSONResponse: + set_following(False) + return JSONResponse({"ok": True, "following": False, "tracked_lid": get_target()}) + + +@app.post("/api/halt") +def api_halt() -> JSONResponse: + """E-STOP: cancel target, stop following, zero-twist x3, liedown. + Reachable any time (no auth, no preconditions).""" + result = panic_stop(brain) + return JSONResponse({"ok": True, **result}) + + +def _ok(result: Any) -> JSONResponse: + return JSONResponse({"ok": True, "result": result}) + + +def _err(msg: str) -> JSONResponse: + return JSONResponse({"ok": False, "error": msg}, status_code=500) + + +@app.post("/api/stand_up") +def api_stand_up() -> JSONResponse: + try: + return _ok(brain.stand_up()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/sit_down") +def api_sit_down() -> JSONResponse: + try: + return _ok(brain.sit_down()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/lie_down") +def api_lie_down() -> JSONResponse: + try: + return _ok(brain.lie_down()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/wave") +def api_wave() -> JSONResponse: + try: + return _ok(brain.wave()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/march") +def api_march(req: MarchReq) -> JSONResponse: + try: + return _ok(brain.march(req.reps)) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/step") +def api_step(req: StepReq) -> JSONResponse: + k = req.key or req.direction + if not k: + return _err("missing 'key' (W/S/Q/E/A/D) or 'direction'") + try: + return _ok(brain.step(k, fast=req.fast)) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/stop_move") +def api_stop_move() -> JSONResponse: + """Soft stop — single zero twist. Dog stops moving but stays standing. + (Use /api/halt for hard panic: clear follow + zero x3 + liedown.)""" + try: + return _ok(brain.stop_move()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +@app.post("/api/observe") +def api_observe() -> JSONResponse: + try: + return _ok(brain.observe()) # type: ignore[union-attr] + except Exception as e: # noqa: BLE001 + return _err(str(e)) + + +def main() -> None: + parser = argparse.ArgumentParser(description="Go2 phone-control server") + parser.add_argument("--ip", help="Robot IP, e.g. 192.168.12.1 (omit with --mock)") + parser.add_argument("--port", type=int, default=8800, help="HTTP port (default 8800)") + parser.add_argument("--step-distance", type=float, default=0.3, dest="step_distance") + parser.add_argument("--step-speed", type=float, default=0.3, dest="step_speed") + parser.add_argument("--close-ratio", type=float, default=0.5, dest="close_ratio") + parser.add_argument( + "--mode", + default="normal", + help="Go2 motion mode: 'normal' (classic sport cmds) or 'ai' (default normal)", + ) + parser.add_argument( + "--ap", + action="store_true", + help="Use LocalAP link (you're on the dog's own hotspot at 192.168.12.1)", + ) + parser.add_argument( + "--mock", + action="store_true", + help="UI preview mode: no robot connection, fake responses", + ) + parser.add_argument( + "--no-people", + action="store_true", + dest="no_people", + help="Disable the people tracker (saves CPU; /api/people returns empty)", + ) + parser.add_argument( + "--mock-people", + action="store_true", + dest="mock_people", + help="Force the MOCK people tracker even when connected to the real dog", + ) + parser.add_argument( + "--osnet", + default="osnet_x0_5", + help="OSNet variant for Re-ID: x0_25/x0_5/x0_75/x1_0 (default x0_5)", + ) + parser.add_argument( + "--no-follow", + action="store_true", + dest="no_follow", + help="Disable the follow controller (no visual-servo loop runs)", + ) + args = parser.parse_args() + + global brain, people, follow_controller + if args.mock: + brain = MockBrain(step_distance=args.step_distance) # type: ignore[assignment] + else: + if not args.ip: + parser.error("--ip is required unless --mock is set") + brain = RobotBrain( + ip=args.ip, + step_distance=args.step_distance, + step_speed=args.step_speed, + close_ratio=args.close_ratio, + mode=args.mode, + ap=args.ap, + ) + + # People tracker: mock by default in --mock; real (YOLO+OSNet) when on a dog + # unless --mock-people overrides. --no-people disables the feature. + if args.no_people: + people = None + print("[people] disabled (--no-people)") + elif args.mock or args.mock_people: + people = MockPeopleTracker() # type: ignore[assignment] + else: + try: + people = RealPeopleTracker(brain.conn, osnet_variant=args.osnet) # type: ignore[assignment] + except Exception as e: # noqa: BLE001 + print(f"[people] real tracker failed to start: {e}\n[people] falling back to MOCK") + people = MockPeopleTracker() # type: ignore[assignment] + + # Follow controller (only meaningful on a real dog). Always available so the + # API exists for testing; in mock mode it just no-ops since get_bbox returns + # None and brain has no `.conn`. + if args.no_follow or args.mock: + follow_controller = None + if args.no_follow: + print("[follow] disabled (--no-follow)") + else: + try: + follow_controller = FollowController(brain, people) + except Exception as e: # noqa: BLE001 + print(f"[follow] failed to start: {e}") + follow_controller = None + + # 0.0.0.0 so the phone on the same Wi-Fi can reach it. + print(f"\n[server] open http://:{args.port}/ on your phone\n") + uvicorn.run(app, host="0.0.0.0", port=args.port, log_level="info") + + +if __name__ == "__main__": + main() diff --git a/examples/go2_walk_backward.py b/examples/go2_walk_backward.py new file mode 100644 index 0000000000..56c5c28791 --- /dev/null +++ b/examples/go2_walk_backward.py @@ -0,0 +1,85 @@ +#!/usr/bin/env python3 +# Copyright 2025-2026 Dimensional Inc. +# SPDX-License-Identifier: Apache-2.0 + +"""Minimal Go2 motion script: stand up, walk BACKWARD ~2 meters, lie down. + +No perception, no blueprint framework — just a direct WebRTC connection. + +⚠️ Walking backward is blind (no rear camera/LiDAR coverage). Make sure + the path behind the robot is clear. Keep the remote in hand for E-stop. + +Run: + .venv/bin/python examples/go2_walk_backward.py --ip 192.168.12.1 + # Options: --distance 2.0 --speed 0.3 +""" + +from __future__ import annotations + +import argparse +import time + +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.robot.unitree.connection import UnitreeWebRTCConnection + + +def main() -> None: + parser = argparse.ArgumentParser(description="Go2: stand up and walk backward") + parser.add_argument("--ip", required=True, help="Robot IP, e.g. 192.168.12.1") + parser.add_argument( + "--distance", + type=float, + default=2.0, + help="Meters to walk backward (default: 2.0)", + ) + parser.add_argument( + "--speed", + type=float, + default=0.3, + help="Linear speed magnitude in m/s (default: 0.3)", + ) + args = parser.parse_args() + + # Both expected positive; we send -speed on x for backward motion. + if args.distance <= 0 or args.speed <= 0: + parser.error("--distance and --speed must be positive (direction is hard-coded backward)") + + duration = args.distance / args.speed + + print(f"Connecting to {args.ip}...") + conn = UnitreeWebRTCConnection(ip=args.ip) + + try: + print("Standing up...") + conn.standup() + time.sleep(3) + + print("Entering balance stand mode...") + conn.balance_stand() + time.sleep(1) + + print( + f"Walking BACKWARD {args.distance:.2f}m at {args.speed:.2f} m/s ({duration:.1f}s)..." + ) + twist = Twist(linear=Vector3(-args.speed, 0.0, 0.0), angular=Vector3(0.0, 0.0, 0.0)) + conn.move(twist, duration=duration) + + # Belt-and-suspenders: explicitly send a zero twist. + conn.move(Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0))) + time.sleep(0.5) + + print("Lying down...") + conn.liedown() + time.sleep(2) + + print("Done.") + except KeyboardInterrupt: + print("\nInterrupted — stopping robot.") + conn.move(Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0))) + finally: + conn.stop() + + +if __name__ == "__main__": + main() diff --git a/examples/go2_walk_forward.py b/examples/go2_walk_forward.py new file mode 100644 index 0000000000..cb20b0ac32 --- /dev/null +++ b/examples/go2_walk_forward.py @@ -0,0 +1,100 @@ +#!/usr/bin/env python3 +# Copyright 2025-2026 Dimensional Inc. +# SPDX-License-Identifier: Apache-2.0 + +"""Minimal Go2 motion script: stand up, walk forward ~2 steps, lie down. + +No perception, no blueprint framework. Works against real robot (WebRTC), +MuJoCo simulator, or replay dataset, via the same code path. + +Run: + # Real robot + .venv/bin/python examples/go2_walk_forward.py --ip 192.168.12.1 + + # MuJoCo simulator (macOS: requires .venv/bin on PATH for mjpython; + # script auto-adds it) + .venv/bin/python examples/go2_walk_forward.py --ip mujoco + + # Replay (no-op moves, just for code-path sanity) + .venv/bin/python examples/go2_walk_forward.py --ip replay + + # Options: --distance 1.0 --speed 0.3 +""" + +from __future__ import annotations + +import argparse +import os +import sys +import time +from pathlib import Path + +from dimos.core.global_config import global_config +from dimos.msgs.geometry_msgs.Twist import Twist +from dimos.msgs.geometry_msgs.Vector3 import Vector3 +from dimos.robot.unitree.go2.connection import make_connection + + +def _ensure_venv_on_path() -> None: + """MuJoCo spawns a subprocess that needs `mjpython` on PATH. When the user + invokes us as `.venv/bin/python …` instead of activating the venv, the venv + bin dir isn't on PATH. Add it so the subprocess can find mjpython.""" + venv_bin = Path(sys.executable).parent + path_parts = os.environ.get("PATH", "").split(os.pathsep) + if str(venv_bin) not in path_parts: + os.environ["PATH"] = str(venv_bin) + os.pathsep + os.environ.get("PATH", "") + + +def main() -> None: + parser = argparse.ArgumentParser(description="Go2: stand up and walk forward") + parser.add_argument("--ip", required=True, help="Robot IP, e.g. 192.168.12.1") + parser.add_argument( + "--distance", type=float, default=1.0, help="Meters to walk (default: 1.0 = ~2 steps)" + ) + parser.add_argument( + "--speed", type=float, default=0.3, help="Linear speed in m/s (default: 0.3)" + ) + args = parser.parse_args() + + duration = args.distance / args.speed + + if args.ip in ("mujoco", "replay", "fake", "mock"): + _ensure_venv_on_path() + + print(f"Connecting to {args.ip}...") + # make_connection routes: 'mujoco' -> MujocoConnection, 'replay'/'fake'/'mock' -> + # ReplayConnection, anything else -> UnitreeWebRTCConnection(ip) + conn = make_connection(args.ip, global_config) + conn.start() + + try: + print("Standing up...") + conn.standup() + time.sleep(3) + + print("Entering balance stand mode...") + conn.balance_stand() + time.sleep(1) + + print(f"Walking forward {args.distance:.2f}m at {args.speed:.2f} m/s ({duration:.1f}s)...") + twist = Twist(linear=Vector3(args.speed, 0.0, 0.0), angular=Vector3(0.0, 0.0, 0.0)) + conn.move(twist, duration=duration) + + # Belt-and-suspenders: explicitly send a zero twist. + conn.move(Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0))) + time.sleep(0.5) + + print("Lying down...") + conn.liedown() + time.sleep(2) + + print("Done.") + except KeyboardInterrupt: + print("\nInterrupted — stopping robot.") + conn.move(Twist(linear=Vector3(0, 0, 0), angular=Vector3(0, 0, 0))) + finally: + conn.stop() + + +if __name__ == "__main__": + main() diff --git a/scripts/run-blueprint.sh b/scripts/run-blueprint.sh new file mode 100755 index 0000000000..069eff043b --- /dev/null +++ b/scripts/run-blueprint.sh @@ -0,0 +1,118 @@ +#!/usr/bin/env bash +# Run a dimos blueprint and ALWAYS save the full terminal output to a +# timestamped log under /tmp/dimos-logs/. A `latest.log` symlink always +# points at the most recent run so you don't have to remember the +# timestamp when something goes wrong. +# +# Usage: +# scripts/run-blueprint.sh # defaults to go2-marauders-map +# scripts/run-blueprint.sh go2-marauders-map # explicit blueprint +# scripts/run-blueprint.sh yoloe-target-lock-distance-follow # any other one +# +# Extra args after the blueprint name are forwarded to `dimos`, e.g.: +# scripts/run-blueprint.sh go2-marauders-map --replay +# +# Defaults assume the real Go2 at 192.168.12.1 on dimair10. Override with: +# SIM=mujoco scripts/run-blueprint.sh go2-marauders-map # MuJoCo physics +# SIM=dimsim scripts/run-blueprint.sh go2-marauders-map # DimSim +# REPLAY=1 scripts/run-blueprint.sh go2-marauders-map # dataset replay (no physics) +# ROBOT_IP=192.168.123.18 scripts/run-blueprint.sh go2-marauders-map +# +# Note: MuJoCo on macOS needs `mjpython` (not plain python) for the GUI to +# render. The script auto-uses .venv/bin/mjpython when SIM=mujoco AND that +# binary exists. If you see a black/frozen MuJoCo window, that's why. + +set -uo pipefail + +# Repo root = parent of this script's directory. +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +cd "$REPO_ROOT" + +BLUEPRINT="${1:-go2-marauders-map}" +shift || true # consume the blueprint arg if it was passed +EXTRA_ARGS=("$@") + +# Mode resolution priority: SIM > REPLAY > ROBOT_IP > default real-dog IP. +# We surface this as separate env vars rather than overloading ROBOT_IP so the +# log header is unambiguous about WHY this run isn't talking to real hardware. +SIM="${SIM:-}" +REPLAY="${REPLAY:-}" +ROBOT_IP="${ROBOT_IP:-192.168.12.1}" + +MODE_ARGS=() +MODE_LABEL="" +if [ -n "$SIM" ]; then + MODE_ARGS=(--simulation "$SIM") + MODE_LABEL="simulation=$SIM" +elif [ -n "$REPLAY" ]; then + MODE_ARGS=(--replay) + MODE_LABEL="replay" +else + MODE_ARGS=(--robot-ip "$ROBOT_IP") + MODE_LABEL="robot_ip=$ROBOT_IP" +fi + +LOG_DIR="/tmp/dimos-logs" +mkdir -p "$LOG_DIR" + +# Timestamp like 20260528-013742 — sortable, no spaces, no colon. +TS="$(date +%Y%m%d-%H%M%S)" +LOG_FILE="$LOG_DIR/${BLUEPRINT}-${TS}.log" + +# Auto-rotate: keep newest 20, drop the rest. Quiet on first run. +ls -1t "$LOG_DIR"/*.log 2>/dev/null | tail -n +21 | xargs -I{} rm -f -- {} 2>/dev/null || true + +# Pick interpreter: MuJoCo on macOS requires the special `mjpython` launcher +# (it owns the main thread for the Cocoa GL context). If we're in SIM=mujoco +# AND mjpython exists in the venv, dispatch via it; otherwise fall back to +# the normal `dimos` CLI script. +DIMOS_BIN=".venv/bin/dimos" +if [ "$SIM" = "mujoco" ] && [ -x ".venv/bin/mjpython" ]; then + # Invoke the CLI's entry function under mjpython so the MuJoCo GL window + # runs on the macOS main thread. `dimos` console-script targets + # dimos.robot.cli.dimos:cli_main. + DIMOS_CMD=(.venv/bin/mjpython -c + "from dimos.robot.cli.dimos import cli_main; import sys; sys.exit(cli_main())" + "${MODE_ARGS[@]}" --rerun-open native run "$BLUEPRINT") +else + DIMOS_CMD=("$DIMOS_BIN" "${MODE_ARGS[@]}" --rerun-open native run "$BLUEPRINT") +fi +if [ ${#EXTRA_ARGS[@]} -gt 0 ]; then + DIMOS_CMD+=("${EXTRA_ARGS[@]}") +fi + +# Header in the log so post-hoc reading is easier. +{ + echo "==========================================" + echo " dimos blueprint runner" + echo " blueprint : $BLUEPRINT" + echo " mode : $MODE_LABEL" + echo " started : $(date)" + echo " cwd : $REPO_ROOT" + echo " git HEAD : $(git rev-parse --short HEAD 2>/dev/null || echo n/a)" + echo " git status : $(git status --short 2>/dev/null | wc -l | tr -d ' ') uncommitted files" + echo " cmd : ${DIMOS_CMD[*]}" + echo "==========================================" + echo "" +} | tee "$LOG_FILE" + +echo "[runner] full log -> $LOG_FILE" +echo "[runner] follow live: tail -f $LOG_DIR/latest.log" +echo "" + +# Use PATH so any subprocess that calls `dimos` / `python` finds the venv first. +export PATH="$REPO_ROOT/.venv/bin:$PATH" + +# Run and tee. `script` would give us a PTY (preserves ANSI colors, progress +# bars), but tee is enough for plain-text postmortem. +"${DIMOS_CMD[@]}" 2>&1 | tee -a "$LOG_FILE" +RC=${PIPESTATUS[0]} + +# Refresh the latest.log symlink only AFTER a successful tee so a crashed run +# still leaves its file on disk but `latest.log` doesn't point at empty data. +ln -sfn "$LOG_FILE" "$LOG_DIR/latest.log" + +echo "" +echo "[runner] exit $RC — log: $LOG_FILE" +exit $RC