Add Team Pivot's Fetch hackathon submission #2297

Open
Wenjix wants to merge 8 commits into dimensionalOS:main from Wenjix:hackathon/fetch

Conversation

@Wenjix

@Wenjix Wenjix commented May 28, 2026

One sentence

Fetch is a Unitree Go2 robot dog that trades ice-cold Cokes for instant photos — a vision LLM "reads the room" on every camera frame and decides where to move, what to say, and when to snap the shot, running as a single FastAPI + WebSocket server you can try from a phone browser before any robot is involved.

Hackathon submission from Team Pivot — Philip Seifi (@seifip), Wenjie Fu (@Wenjix), and GuoZi (@GuoZhuoRan).

90-second reviewer path

  1. Demo video: https://www.youtube.com/watch?v=8hHYE1239wg
  2. Full source: https://github.com/seifip/robodog-fetch
  3. Run it (zero hardware): https://github.com/seifip/robodog-fetch#quickstart-run-it-yourself

The opportunity

Fetch is an autonomous brand ambassador and mobile vendor — here for Coca-Cola — that hands out product, creates a memorable branded moment, and walks away with the guest's photo. The longer-term vision: fleets of autonomous robot-dog vendors that roam the beach and self-resupply at beachside bars and vendors, or at dedicated autonomous resupply stations.

What matters

  • DimOS is the runtime — Fetch reuses the DimOS teleop web pattern and drives a real Unitree Go2 over DimOS's WebRTC stack, with selectable connection modes (auto / local_ap / local_sta) so it reaches the dog on its local-AP network at 192.168.12.1 as well as standard Wi-Fi.
  • Fetch is the behavior layer — vision-LLM decision loop, persona, approach/trade/photo state machine, and runtime-switchable voice (one-way TTS across Cartesia/Gemini/OpenAI, plus opt-in two-way Gemini Live with tool calls).
  • Real-time by design — we benchmarked round-trip latency across vision and speech models (scripts/latency_bench.py) and run the fastest combo (Gemini 2.5 Flash-Lite vision + Cartesia Sonic speech); frames are downscaled before analysis and the loop lands around one second.
  • Instant photo → the guest's hands — captured from the Go2's camera + LiDAR and composited with the Fetch logo, the shot (plus our demo recordings) syncs to iCloud / Google Drive via mirror folders (FETCH_PHOTO_MIRROR_DIRS), and a synced phone sends it to a Xiaomi mini-printer via the printer's app for an instant physical print.
  • Three camera sources, one loop — phone browser camera, Record3D USB RGBD (real iPhone LiDAR depth), and a live Go2.
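
The mirror-folder step in the list above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes `FETCH_PHOTO_MIRROR_DIRS` holds directories joined by `os.pathsep`, and the helper names `parse_mirror_dirs` and `mirror_photo` are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def parse_mirror_dirs(raw: str) -> list[Path]:
    """Split an os.pathsep-separated string into directories (assumed format)."""
    return [Path(p.strip()).expanduser() for p in raw.split(os.pathsep) if p.strip()]


def mirror_photo(photo: Path, mirror_dirs: list[Path]) -> list[Path]:
    """Copy one saved photo into every mirror folder and return the copies."""
    copies = []
    for folder in mirror_dirs:
        folder.mkdir(parents=True, exist_ok=True)
        # copy2 also preserves timestamps, which sync clients tend to rely on.
        copies.append(Path(shutil.copy2(photo, folder / photo.name)))
    return copies


# Demo with throwaway folders standing in for iCloud / Google Drive sync dirs.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    photo = root / "shot.jpg"
    photo.write_bytes(b"demo-bytes")
    raw = os.pathsep.join([str(root / "icloud"), str(root / "gdrive")])
    copies = mirror_photo(photo, parse_mirror_dirs(raw))
    print([c.parent.name for c in copies])
```

In a real run the string would presumably come from `os.environ.get("FETCH_PHOTO_MIRROR_DIRS", "")`; an empty value yields no mirror folders, so the photo simply stays where it was saved.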

Why a beach?

Quadrupeds earn their keep on terrain wheels can't handle, so we built Fetch around that. We chose sand for a form-factor reason: the Go2's camera sits low and looks up at standing people, but on a beach people sit or lie on the sand — dropping into the dog's natural eye-line and making the interaction feel natural. And it's feasible today: quadrupeds already run on sand (RaiBo at 3 m/s) and sand-walking foot adaptations cut foot sinkage ~46%.

What's next

  • Sense the trade — the Go2 EDU's foot-force sensors could detect a Coke lifted from the back via the change in total load, closing the loop without the camera's framing check.
  • Real sand — fit sand-walking foot adaptations for an outdoor beach deployment.
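
One way the "sense the trade" idea might work is a baseline-versus-current comparison on the summed foot forces. The sketch below is an assumption-heavy illustration: the class name `LoadDropDetector`, the window size, and the threshold are hypothetical, the readings are treated as unitless raw values, and it only makes sense while the dog stands still (walking would swamp the signal).

```python
from collections import deque
from statistics import fmean


class LoadDropDetector:
    """Flags a sustained drop in total foot load against a rolling baseline."""

    def __init__(self, window: int = 20, drop_threshold: float = 3.0):
        self.window = window
        self.drop_threshold = drop_threshold  # hypothetical, needs calibration
        self.history = deque(maxlen=window)

    def update(self, foot_forces) -> bool:
        """Feed one reading per foot; returns True when the load has dropped."""
        total = float(sum(foot_forces))
        if len(self.history) < self.window:
            # Still collecting the baseline: never trigger yet.
            self.history.append(total)
            return False
        baseline = fmean(self.history)
        if baseline - total >= self.drop_threshold:
            # Keep the lighter reading out of the baseline so the drop persists.
            return True
        self.history.append(total)
        return False


detector = LoadDropDetector()
for _ in range(20):
    detector.update([40.0, 40.0, 40.0, 40.0])  # dog standing with the can on its back
print(detector.update([39.0, 39.0, 39.0, 39.0]))  # can lifted: total load falls by 4
```

The threshold would have to be tuned against the real weight of a can and the sensor noise on the actual robot.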

Scope boundary

This PR is a submission pointer: full source, demo, and assets are hosted externally. It adds a single file (hackathon/fetch/README.md) and does not vendor Fetch into DimOS or modify any DimOS runtime code.

Validation

  • pytest -q in the project repo: 76 passed, all providers mocked (no real API calls) — covers policy normalization, middleware routes, TTS, conversation tools, and photo saving.
  • This PR changes markdown only — one added file.
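
For readers unfamiliar with the "all providers mocked" approach, a test in that style might look roughly like this. The `speak` wrapper and the `synthesize` method are hypothetical stand-ins, not names from the Fetch repo; only `unittest.mock.Mock` is a real API here.

```python
from unittest.mock import Mock


def speak(text: str, tts_client) -> bytes:
    """Hypothetical wrapper: trims the line and asks the provider for audio."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("nothing to say")
    return tts_client.synthesize(cleaned)


def test_speak_uses_mocked_provider():
    # The Mock replaces the real TTS client, so no network call is made.
    fake_tts = Mock()
    fake_tts.synthesize.return_value = b"fake-audio"

    audio = speak("  Trade you a Coke for a photo?  ", fake_tts)

    assert audio == b"fake-audio"
    fake_tts.synthesize.assert_called_once_with("Trade you a Coke for a photo?")
```

Because the provider is injected as an argument, the same function can be exercised with any fake, which is what keeps a suite like this free of real API calls.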

Copilot AI review requested due to automatic review settings May 28, 2026 16:18

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a hackathon submission README for "Fetch", a Unitree Go2 project by Team Pivot, as a pointer to an external repository.

Changes:

  • New markdown file under hackathon/fetch/ summarizing the project.
  • Links to external source repo and (pending) demo video.
  • No DimOS runtime code modified.

Comment thread hackathon/fetch/README.md Outdated
Comment thread hackathon/fetch/README.md Outdated
@greptile-apps
Contributor

greptile-apps Bot commented May 28, 2026

Greptile Summary

This PR adds a single markdown file (hackathon/fetch/README.md) as a hackathon submission pointer for Team Pivot's "Fetch" project — a Unitree Go2 robot dog that autonomously trades Cokes for branded instant photos using a vision-LLM decision loop on top of DimOS's WebRTC stack. No DimOS runtime code is modified.

  • Adds hackathon/fetch/README.md describing the Fetch system architecture, demo links, and integration with DimOS primitives (WebRTC, teleop pattern).
  • All external URLs in the file resolve; the demo video link now points to a live YouTube URL.
  • The "How to Run" command (python -m dimos.experimental.fetch.iphone_middleware) references a module not vendored in this repo — flagged in a prior review thread and acknowledged in the Scope Boundary section.

Confidence Score: 5/5

This PR adds only a markdown README with no runtime code changes — it is safe to merge.

The change is a single documentation file under hackathon/. No DimOS source is touched, no dependencies are added, and no logic is changed. All external links in the README resolve to real resources.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| hackathon/fetch/README.md | Adds a hackathon submission pointer (README only). No DimOS runtime code is touched. All external links resolve and the demo video URL is present. Previously flagged issues (broken run command, missing video link) are noted in existing threads. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Camera Frame<br/>phone / Record3D / Go2] --> B[Vision LLM<br/>Gemini 2.5 Flash-Lite]
    B --> C{Decision<br/>state machine}
    C -->|scan| D[Scan crowd<br/>cmd_vel]
    C -->|approach| E[Obstacle-aware<br/>approach]
    C -->|trade| F[Wave + one-liner<br/>offer Coke]
    C -->|photo| G[Snap photo<br/>dance]
    D --> H[Go2 WebRTC<br/>DimOS runtime]
    E --> H
    F --> I[TTS / Gemini Live<br/>Cartesia Sonic]
    G --> J[Branded Polaroid<br/>logo composite]
    J --> K[Mirror to iCloud /<br/>Google Drive]
    K --> L[Xiaomi mini-printer<br/>physical print]
    H --> A
    I --> A
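
The decision node in the flowchart above implies that a free-form model reply gets reduced to one of four branches. A minimal sketch of that normalization step, assuming the vision model returns JSON with an `"action"` key (the reply format, the `Action` enum, and `normalize_decision` are all hypothetical), with unknown output falling back to scanning:

```python
import json
from enum import Enum


class Action(str, Enum):
    SCAN = "scan"
    APPROACH = "approach"
    TRADE = "trade"
    PHOTO = "photo"


def normalize_decision(raw_reply: str) -> Action:
    """Map a model reply to a known Action; anything unparseable becomes SCAN."""
    try:
        payload = json.loads(raw_reply)
    except json.JSONDecodeError:
        return Action.SCAN
    if not isinstance(payload, dict):
        return Action.SCAN
    label = str(payload.get("action", "")).strip().lower()
    try:
        return Action(label)  # lookup by enum value
    except ValueError:
        return Action.SCAN


print(normalize_decision('{"action": "Photo"}').value)
print(normalize_decision("not json at all").value)
```

Defaulting to the scan branch is a design guess: it is the least committal behavior when the model's output cannot be trusted.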

Reviews (6): Last reviewed commit: "Document camera/LiDAR capture, iCloud/Dr..."

Comment thread hackathon/fetch/README.md
Comment thread hackathon/fetch/README.md Outdated
@Wenjix Wenjix requested a review from a team May 28, 2026 16:48
@leshy leshy added the hackaton label May 29, 2026
3 participants