Echo Logo

Echo

An AI-powered workflow automation platform — create, record, and run desktop & browser workflows using voice, chat, or visual recording, powered by the EchoPrism vision-language agent.
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgments

About The Project

Echo Dashboard Screenshot

Echo is an AI-powered workflow automation platform. Create and edit desktop and browser workflows (from recordings, voice, or chat), then run them via the EchoPrism vision-language agent — which executes steps (navigate, click, type, scroll) on your desktop. Use the web dashboard to manage workflows and runs, and the Electron desktop app for voice-driven control and running your workflows locally.

(back to top)

Architecture

Architecture Diagram

(back to top)

Agent Diagram

Agent Diagram

(back to top)

Built With

Next React Python FastAPI Firebase Google Cloud Electron Docker

(back to top)

EchoPrism chat: Composio v3 and workflows

  • Chat (Gemini + WebSocket) uses the Composio Tool Router pattern: composio.create(user_id=…) with the LangGraph provider, session.tools() meta tools (COMPOSIO_SEARCH_TOOLS, COMPOSIO_MULTI_EXECUTE_TOOL, COMPOSIO_MANAGE_CONNECTIONS, …), and execution through those tools — not bulk get_raw_composio_tools in the agent loop. Toolkits allowed in chat follow COMPOSIO_CHAT_TOOLKITS (see scripts/doppler-env-reference.md).
  • Workflow api_call steps continue to use Composio tools.execute (with optional dangerously_skip_version_check) for deterministic execution and HITL in the Run HUD; they do not require the chat Tool Router session path.
```mermaid
flowchart LR
  subgraph chat [Chat agent]
    WS[WebSocket /ws/chat]
    G[Gemini generate_content]
    TR[Composio Tool Router session + meta tools]
    WS --> G
    G --> TR
  end
  subgraph wf [Workflows]
    API[api_call → tools.execute]
  end
  TR --> CA[Composio API]
  API --> CA
```
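As a hedged sketch of the two execution paths above: the Composio SDK calls themselves are not shown, only the request shapes. The meta-tool names and the slug-plus-arguments form for api_call steps come from this README; everything else (field names, the example slug) is an assumption.

```python
# Illustrative data shapes for the two Composio paths described above.
# These are NOT real SDK calls; field names and the example slug are
# hypothetical — only the meta-tool names and the slug + arguments
# form are taken from this README.

def chat_meta_tool_request(user_id: str, query: str) -> dict:
    """Chat path: the agent routes through the Tool Router session's
    meta tools (e.g. COMPOSIO_SEARCH_TOOLS) rather than loading raw
    tools in bulk."""
    return {
        "user_id": user_id,
        "meta_tool": "COMPOSIO_SEARCH_TOOLS",
        "arguments": {"query": query},
    }

def workflow_execute_request(slug: str, arguments: dict,
                             skip_version_check: bool = False) -> dict:
    """Workflow path: a deterministic api_call step identified by tool
    slug, executed via tools.execute."""
    return {
        "slug": slug,
        "arguments": arguments,
        "dangerously_skip_version_check": skip_version_check,
    }

print(workflow_execute_request("SLACK_SEND_MESSAGE",
                               {"channel": "#general"})["slug"])
# SLACK_SEND_MESSAGE
```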

Langfuse (optional): chat turns, Composio execution, scores, and prompt labels — see scripts/doppler-env-reference.md ("Langfuse billing hygiene"). Tests: run pytest from agent/ (the fast PR gate); optional slow tests run with -m slow once added.

(back to top)

Getting Started

To run the full stack locally or deploy from scratch, follow the phases below.

Prerequisites

Install the following tools before proceeding:

  • Node.js 18+ — nodejs.org or nvm install 18
  • pnpm — npm install -g pnpm
  • Python 3.11+ — python.org or pyenv install 3.11
  • Docker — for building and deploying images
  • gcloud CLI — see the install guide, then:
    gcloud auth login
    gcloud auth application-default login
  • Firebase CLI
    npm install -g firebase-tools
    firebase login
  • Doppler (optional but recommended) — for secrets management
    brew install dopplerhq/cli/doppler
    doppler login

Composio (integrations)

Third-party OAuth for Slack, GitHub, and Google runs through Composio managed auth. Echo still uses Firebase for product login; Composio stores provider tokens keyed by Firebase uid.

  1. Create a Composio project and add the toolkits you need (Composio manages default auth configs; no per-toolkit env IDs required for connect links).
  2. On the API server, set COMPOSIO_API_KEY. Optional: COMPOSIO_OAUTH_CALLBACK_URL or FRONTEND_ORIGIN so OAuth can return to your app after session.authorize.
  3. On the Echo Prism agent, set COMPOSIO_API_KEY (and optional COMPOSIO_CHAT_TOOLKITS, COMPOSIO_CHAT_TOOL_LIMIT) so workflow api_call steps execute tools.
  4. Users open Dashboard → Integrations and connect each app; the UI calls GET /api/composio/link?toolkit=….
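The connect link in step 4 can be sketched as a plain GET. The endpoint path and toolkit query parameter come from this README; the local base URL and the slack toolkit slug are assumptions for illustration.

```python
from urllib.parse import urlencode

# Base URL is an assumption for local development (matches the
# NEXT_PUBLIC_API_URL example later in this README).
API_BASE = "http://localhost:8000"

def composio_link_url(toolkit: str) -> str:
    """Build the GET /api/composio/link URL the Integrations UI calls
    for a given toolkit slug."""
    return f"{API_BASE}/api/composio/link?{urlencode({'toolkit': toolkit})}"

print(composio_link_url("slack"))
# http://localhost:8000/api/composio/link?toolkit=slack
```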

Workflow api_call steps and chat tools use Composio tool slugs (slug + arguments); execution is Composio-only.

Google APIs: Convenience methods and rest / google_rest are documented in echo_prism_agent/integrations/google.py and scope groups in google_scopes.py. Enable matching APIs and OAuth scopes in Google Cloud and your Composio Google auth config.

Local backend: pnpm run dev:backend sets PYTHONPATH=../agent so integration modules load from echo_prism_agent/integrations/.

(back to top)

Phase 1: GCP Setup

  1. Go to Google Cloud Console and create or select a project with billing enabled.

  2. In APIs & Services → Enable APIs, enable:

    • Cloud Run API
    • Cloud Scheduler API
    • Firestore API
    • Cloud Storage API
    • Gemini API
  3. Go to Cloud Storage → Buckets, create a bucket with Uniform bucket-level access, and note the name (e.g. echo-assets-prod).

(back to top)

Phase 2: Firebase Setup

  1. Go to Firebase Console and create a new project or link your existing GCP project.

  2. Enable authentication: Authentication → Sign-in method → enable Email/Password and Google.

  3. Create Firestore: Firestore Database → Create database → choose Native mode.

  4. Register your web app: Project Settings → Your apps → Add web app (</>) and copy the config object.

  5. Deploy Firestore rules from the project root:

    cd firebase && firebase deploy --only firestore:rules

(back to top)

Phase 3: Service Accounts & IAM

Use the default compute service account for Cloud Run and ensure it has:

  • Firestore: Cloud Datastore User (or Firestore roles)
  • Storage: Storage Object Admin
  • Cloud Run Jobs: Run Jobs Executor

(back to top)

Phase 4: Gemini API Key

  1. Go to Google AI Studio
  2. Sign in, select your GCP project, and create an API key
  3. Copy the key — you'll need it for GEMINI_API_KEY

(back to top)

Phase 5: Local Development

Clone and install:

git clone https://github.com/JasonMun7/echo.git
cd echo
pnpm install
pnpm run install:backend

Option A: Doppler (recommended)

doppler setup   # select project and dev config

Then run each service in a separate terminal:

# Terminal 1 – backend
pnpm run dev:backend

# Terminal 2 – frontend
pnpm run dev

# Terminal 3 – desktop app
pnpm run dev:desktop

# Terminal 4 – Echo Prism agent (LangGraph + OpenRouter + Gemini; `agent/`)
pnpm run dev:agent

# Terminal 5 – LiveKit voice worker (optional; run from repo root so `agent.*` imports resolve)
pnpm run dev:livekit-agent

  • Set NEXT_PUBLIC_ECHO_AGENT_URL (web) and VITE_ECHO_AGENT_URL (desktop) to http://localhost:8083 in Doppler for local agent access.
  • Set OPENROUTER_API_KEY for GUI inference (Kimi + muscle-mem); override with ECHOPRISM_MUSCLE_MODEL if needed.
  • Install the sibling package: from agent/, run pip install -e ../muscle-mem-agent (required for the Worker, semantic verification, and the tool registry).
  • See agent/echo_prism_agent/muscle/MUSCLE_MIGRATION.md for the migration map.

Option B: .env files

# Web app
cd apps/web && cp .env.local.example .env.local
# Edit .env.local with Firebase config and NEXT_PUBLIC_API_URL=http://localhost:8000

# Backend
cd backend && cp .env.example .env
# Edit .env with ECHO_GCP_PROJECT_ID, ECHO_GCS_BUCKET, GEMINI_API_KEY

Local URLs: the backend runs at http://localhost:8000 and the Echo Prism agent at http://localhost:8083.

Environment Variables Reference:

| Variable | Required | Description |
| --- | --- | --- |
| ECHO_GCP_PROJECT_ID | Yes | GCP project ID |
| ECHO_GCS_BUCKET | Yes | GCS bucket name |
| GEMINI_API_KEY | Yes | Gemini API key |
| NEXT_PUBLIC_API_URL | Yes | Backend URL (web) |
| NEXT_PUBLIC_ECHO_AGENT_URL | Yes | Echo Prism agent URL (web) |
| NEXT_PUBLIC_FIREBASE_* | Yes | Firebase config (web) |
| VITE_API_URL | Yes | Backend URL (desktop) |
| VITE_ECHO_AGENT_URL | Yes | Echo Prism agent URL (desktop) |
| OPENROUTER_API_KEY | Recommended | OpenRouter key for LangGraph/UI-Tars inference |
| LIVEKIT_URL | Voice only | LiveKit server URL |
| LIVEKIT_API_KEY | Voice only | LiveKit API key |
| LIVEKIT_API_SECRET | Voice only | LiveKit API secret |
| ECHO_CLOUD_RUN_REGION | No | Defaults to us-central1 |

See scripts/doppler-env-reference.md for the full reference.
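A minimal preflight check for the required backend variables in the table can look like the sketch below. The variable names come from this README; the helper itself is not part of the repo, just a local illustration.

```python
import os

# Required backend variables named in this README; extend the list
# per scripts/doppler-env-reference.md for your deployment.
REQUIRED = ["ECHO_GCP_PROJECT_ID", "ECHO_GCS_BUCKET", "GEMINI_API_KEY"]

def missing_vars(env=os.environ) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# Example with only the project ID set:
missing = missing_vars({"ECHO_GCP_PROJECT_ID": "my-proj"})
print(missing)  # ['ECHO_GCS_BUCKET', 'GEMINI_API_KEY']
```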

(back to top)

Phase 6: Deploy to Cloud Run

pnpm run deploy
# or with explicit env:
GEMINI_API_KEY=your-key ECHO_GCS_BUCKET=your-bucket \
  ./scripts/deploy.sh YOUR_GCP_PROJECT_ID us-central1

The script builds and pushes Docker images, deploys frontend and backend as Cloud Run services, and deploys the Echo Prism agent (LangGraph) to Cloud Run (pnpm run deploy:agent).

To deploy the LiveKit voice worker (optional):

pnpm run deploy:livekit-agent

Requires LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET, LIVEKIT_AGENT_SECRET, ECHOPRISM_AGENT_URL, and GEMINI_API_KEY.

Forks: use your GCP project and ensure Doppler prd includes NEXT_PUBLIC_FIREBASE_* (and optional desktop download URLs) so the full deploy builds a complete web image. See scripts/deploy/README.md and scripts/doppler-env-reference.md (section Fork / alternate GitHub).

(back to top)

Usage

Visit the live demo to check out the web app, and follow the instructions on our releases page so the desktop app can run.

  1. Create a workflow — record a screen capture, describe steps via chat, or use voice on the desktop app
  2. Edit steps — review and modify the auto-generated workflow steps in the dashboard
  3. Run — trigger a run from the desktop app; EchoPrism executes each step via vision-language grounding
  4. Monitor — watch the execution and press Ctrl + Shift + V to interrupt for user steering
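The steps above produce a workflow EchoPrism can replay. Purely as an illustration, a recorded workflow can be pictured as a list of steps; the field names below are hypothetical (not Echo's actual schema), while the step kinds (navigate, click, type, scroll) come from this README.

```python
# Hypothetical shape of a recorded workflow — field names are
# illustrative, only the step kinds are taken from this README.
workflow = [
    {"kind": "navigate", "url": "https://example.com"},
    {"kind": "click", "target": "Sign in button"},
    {"kind": "type", "target": "Search box", "text": "quarterly report"},
    {"kind": "scroll", "direction": "down"},
]

# A runner would ground each step visually, then execute it in order.
kinds = [step["kind"] for step in workflow]
print(kinds)  # ['navigate', 'click', 'type', 'scroll']
```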

(back to top)

Roadmap

  • Mobile app automation — Allow Echo to automate tasks on phones as well
  • Fine tuning — Improve model accuracy by training on user data with Vertex AI
  • Expanded integrations — Add third-party app connectors like Slack, Notion, and G-Suite
  • Workflow marketplace — Create a library of community-shared automations users can install and customize
  • Schedule workflows — Allow users to schedule workflows to run at specific times
  • Reduce costs — Optimize OpenRouter / Gemini calls for vision steps

See the open issues for a full list of proposed features and known issues.

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also open an issue with the tag "enhancement". Don't forget to give the project a star!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

Top Contributors

contrib.rocks image

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

Contact

Jason Mun — jason.mun484@gmail.com · LinkedIn

Andrew Cheung — andrewcheung360@gmail.com · LinkedIn

Project Link: https://github.com/JasonMun7/echo

(back to top)

Acknowledgments

  • OpenRouter — UI-Tars–compatible models for LangGraph inference
  • LiveKit — Real-time voice and video infrastructure
  • Gemini — Vision-language model powering EchoPrism
  • UI-TARS — GUI agent model for automated UI interaction
  • Best-README-Template

(back to top)
