Qusto/crmcopilot

AI Sales Copilot

[Screenshot: AI Sales Copilot side panel running next to a SaluteJazz call.]
Side panel during a live call: focus points, ready objection answers, and pain-point highlights, generated from the uploaded knowledge base.

Real-time sales assistant for live calls. A Chrome extension captures both audio channels of an in-browser SIP call (rep mic + client tab audio) and streams them to a Python backend over WebSocket; the side panel then shows transcripts, talk-ratio coaching, and grounded LLM hints within seconds. It also prepares a pre-call briefing from uploaded knowledge-base files and, after the call, an evaluation with an auto-drafted follow-up email and CRM note.

Status: working MVP, demo-ready. Not a managed product — see Limitations.

Works on any browser-based call

The extension attaches to whatever tab hosts the call — no provider-side integration. Same side panel, three different platforms:

[Screenshots: side panel running next to a Yandex Telemost call, and next to a Zoom join page.]

SaluteJazz is shown in the hero above. Capture is done via Chrome tabCapture + offscreen getUserMedia — any in-browser call qualifies.


How it works

┌──────────────────────────┐         ┌────────────────────────────────────┐
│ Chrome Extension (MV3)   │         │ Backend (FastAPI, Python 3.11)     │
│                          │         │                                    │
│  service worker          │         │  /ws  WebSocket                    │
│   ├─ tabCapture (client) │  PCM    │   └─► STT (Deepgram /              │
│   ├─ getUserMedia (rep)  │ ──────► │       SaluteSpeech / Yandex)       │
│   └─ offscreen + worklet │  16kHz  │   └─► Orchestrator                 │
│                          │         │       ├─ TalkRatioTracker          │
│  Side panel (Preact)     │ ◄────── │       └─ LLM (OpenRouter, SGR)     │
│   ├─ Brief panel         │  hints  │                                    │
│   ├─ Live-call panel     │  trans- │  /api/v1/briefing  (RAG over docs) │
│   └─ Evaluation report   │  cripts │  /api/v1/evaluation                │
└──────────────────────────┘         │  /api/v1/upload (PDF/XLSX/MD → KB) │
                                     │                                    │
                                     │  Redis  (session state)            │
                                     │  ChromaDB (briefing vectors)       │
                                     └────────────────────────────────────┘
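As an illustration of the TalkRatioTracker box in the diagram, talk-ratio tracking can be sketched as accumulating speaking time per channel. This is a minimal sketch under assumptions, not the repository's implementation: the class name `TalkRatioSketch`, the `rep`/`client` labels, and the utterance timestamps are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TalkRatioSketch:
    """Accumulates speaking time per channel (hypothetical stand-in, not the repo's class)."""

    seconds: dict = field(default_factory=lambda: {"rep": 0.0, "client": 0.0})

    def add_utterance(self, speaker: str, start_s: float, end_s: float) -> None:
        if speaker not in self.seconds:
            raise ValueError(f"unknown speaker: {speaker!r}")
        # Ignore malformed segments where the end precedes the start.
        self.seconds[speaker] += max(0.0, end_s - start_s)

    def rep_share(self) -> float:
        """Fraction of total talk time spoken by the rep; 0.0 before anyone has spoken."""
        total = sum(self.seconds.values())
        return self.seconds["rep"] / total if total else 0.0


tracker = TalkRatioSketch()
tracker.add_utterance("rep", 0.0, 6.0)      # rep speaks for 6 s
tracker.add_utterance("client", 6.0, 9.0)   # client speaks for 3 s
tracker.add_utterance("rep", 9.0, 12.0)     # rep speaks for 3 s
print(f"rep share: {tracker.rep_share():.0%}")  # prints "rep share: 75%"
```

A coaching rule could then compare `rep_share()` against a target threshold; the threshold itself would be a product decision and is not specified here.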

The Preact side panel uses @preact/signals for reactive state. The backend pins LLM output to a Pydantic schema (Schema-Guided Reasoning) so the frontend gets a stable contract for hints, talk ratio, briefing blocks, and evaluation.
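The idea of pinning LLM output to a schema can be sketched as follows. The backend uses Pydantic for this; the sketch below deliberately swaps in stdlib `dataclasses` and `json` so it runs without dependencies. The `HintSketch` fields and the `parse_hint` helper are hypothetical and are not taken from the repository.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HintSketch:
    # Hypothetical fields; the real contract lives in the backend's Pydantic models.
    kind: str          # e.g. "objection" or "focus"
    text: str          # the hint shown in the side panel
    confidence: float  # 0.0 to 1.0


def parse_hint(raw: str) -> HintSketch:
    """Parse raw LLM output and reject anything that does not match the schema."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(HintSketch)}
    if set(data) != expected:
        raise ValueError(f"expected keys {sorted(expected)}, got {sorted(data)}")
    if not isinstance(data["kind"], str) or not isinstance(data["text"], str):
        raise ValueError("kind and text must be strings")
    confidence = data["confidence"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    return HintSketch(data["kind"], data["text"], float(confidence))


hint = parse_hint('{"kind": "objection", "text": "Offer the pilot plan", "confidence": 0.8}')
print(hint.kind, hint.confidence)
```

Rejecting malformed output at the backend boundary is what gives the frontend a stable contract: the side panel never has to guess whether a field is present.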


Quick start

1. Backend

uv sync                                      # install deps into .venv
cp backend/.env.example backend/.env
$EDITOR backend/.env                          # fill in API keys
uv run uvicorn backend.main:app --reload --port 8000

In another terminal:

docker compose up redis                      # or any Redis on :6379

2. Chrome extension

cd extension
pnpm install
pnpm run build                               # outputs to extension/dist

In Chrome → chrome://extensions → enable Developer Mode → Load unpacked → select extension/dist.

Click the extension icon to open the side panel. Upload knowledge-base files, hit Prepare for Call, then start an audio capture.

3. Test mode (no SIP call required)

The extension has a built-in test toggle that opens the WS without capturing real audio — useful for iterating on UI/hints with synthetic transcripts. It's labeled in the side panel.


Configuration

All backend config is in backend/.env (loaded by pydantic-settings). Full reference in backend/.env.example. Key vars:

| Variable | Purpose |
|---|---|
| STT_PROVIDER | deepgram (default) / salutespeech / yandex |
| DEEPGRAM_API_KEY | Deepgram WebSocket STT |
| SBER_SPEECH_API_KEY, SBER_SPEECH_SCOPE | SaluteSpeech (gRPC). Requires the public Russian Trusted Root CA cert at backend/certs/russian_trusted_root_ca.pem (already in repo) |
| YANDEX_SPEECHKIT_API_KEY | Yandex SpeechKit |
| OPENROUTER_API_KEY | LLM gateway. Default models: google/gemini-2.5-flash (primary), openai/gpt-4.1-mini (fallback) |
| REDIS_URL | redis://localhost:6379 |
| LOG_LEVEL, VAD_THRESHOLD, SESSION_IDLE_TIMEOUT_S, HINT_CONTEXT_UTTERANCES | Runtime tuning |

You only need keys for the providers you want to use. The backend starts without keys; calls to unconfigured providers return clear errors.
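One way to get this "start without keys, fail clearly on use" behavior is to check credentials lazily, at the moment a provider is requested. The sketch below is an assumption about how that could look, not the backend's code: the exception name, the `REQUIRED_KEYS` mapping, and `require_provider_keys` are hypothetical, and only the environment variable names come from the configuration table.

```python
import os
from collections.abc import Mapping
from typing import Optional


class ProviderNotConfiguredError(RuntimeError):
    """Raised when a provider is requested but its credentials are missing (hypothetical)."""


# Variable names come from the configuration table; the mapping itself is illustrative.
REQUIRED_KEYS = {
    "deepgram": ["DEEPGRAM_API_KEY"],
    "salutespeech": ["SBER_SPEECH_API_KEY", "SBER_SPEECH_SCOPE"],
    "yandex": ["YANDEX_SPEECHKIT_API_KEY"],
}


def require_provider_keys(provider: str, env: Optional[Mapping] = None) -> dict:
    """Return the provider's credentials, or fail with a message naming what is missing."""
    env = os.environ if env is None else env
    if provider not in REQUIRED_KEYS:
        raise ProviderNotConfiguredError(f"unknown STT_PROVIDER: {provider!r}")
    # Treat empty strings the same as unset variables.
    missing = [name for name in REQUIRED_KEYS[provider] if not env.get(name)]
    if missing:
        raise ProviderNotConfiguredError(
            f"{provider} is not configured; set {', '.join(missing)} in backend/.env"
        )
    return {name: env[name] for name in REQUIRED_KEYS[provider]}


try:
    require_provider_keys("salutespeech", env={"SBER_SPEECH_API_KEY": "demo"})
except ProviderNotConfiguredError as exc:
    print(exc)
```

Because nothing is validated at import time, the application can boot with an empty `.env`, and the error surfaces only on the first call that actually needs the missing key.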


Tech stack

Backend: Python 3.11, FastAPI, Uvicorn, Pydantic v2 (with Field(description=...) SGR), Redis, ChromaDB + sentence-transformers, OpenRouter, Deepgram SDK, SaluteSpeech via gRPC, Yandex SpeechKit, loguru. Managed with uv.

Extension: Manifest V3, TypeScript (strict), Preact 10 + @preact/signals, Vite + vite-plugin-web-extension, Vitest + @testing-library/preact. Audio worklet for resample/VU.

Tooling: ruff, mypy, pytest (≈370 tests), pnpm, vitest, just, optional Docker Compose (backend + Redis).


Common commands

just --list                  # discover all recipes
just dev                     # backend + extension watch
just test                    # uv run pytest
just lint                    # ruff + mypy

uv run pytest backend/tests
cd extension && pnpm test
cd extension && pnpm run build   # bumps manifest version, builds worklet + extension

Project layout

See docs/STRUCTURE.md for a full annotated tree. Top level:

backend/      FastAPI app — REST + /ws WebSocket, STT, LLM, briefing, evaluation
extension/    Chrome MV3 extension — service worker, offscreen audio, side panel
specs/        Living feature specs (FEAT-NNN) + status registry
docs/         Design docs, plans, architecture notes
tests/        Cross-component E2E scenarios (browser)
scripts/      health_check.py, tooling installers
.claude/      Agent / skill / rule definitions for the Claude Code workflow
justfile      Unified command runner

Limitations

  • STT providers are regional. Deepgram is global; SaluteSpeech and Yandex SpeechKit work primarily for Russian and require Russian provider accounts.
  • SaluteSpeech needs a non-default root CA. The public Russian Ministry of Digital Development cert is bundled at backend/certs/russian_trusted_root_ca.pem; without it the gRPC TLS handshake fails on most systems. This is not a private credential.
  • LLM cost. Each client utterance can trigger one OpenRouter request. Tune HINT_CONTEXT_UTTERANCES and the cooldown in live-call/hooks/useHintCooldown.ts.
  • Audio capture is browser-bound. Anything outside a Chrome tab (native softphones, OS audio) is out of scope by design.
  • Side panel is a Preact island inside a 2000-line vanilla TS host. Strangler-fig in progress; see specs/FEAT-009-frontend-arch-fixes.md.
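To make the LLM-cost point concrete, a cooldown gate can be sketched as below. The project's actual cooldown lives in the TypeScript hook named above; this Python version only illustrates the general idea, and the class name `HintCooldown`, the injectable clock, and the 5-second value are assumptions.

```python
import time
from typing import Callable, Optional


class HintCooldown:
    """Allows at most one LLM request per `cooldown_s` seconds (illustrative only)."""

    def __init__(self, cooldown_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_s = cooldown_s
        self._clock = clock  # injectable so the gate can be tested without sleeping
        self._last_fired: Optional[float] = None

    def should_fire(self) -> bool:
        now = self._clock()
        if self._last_fired is not None and now - self._last_fired < self.cooldown_s:
            return False  # still cooling down: skip the request for this utterance
        self._last_fired = now
        return True


# Simulate client utterances arriving at these timestamps (seconds).
fake_now = [0.0]
gate = HintCooldown(cooldown_s=5.0, clock=lambda: fake_now[0])
decisions = []
for t in (0.0, 2.0, 5.0, 6.0, 11.0):
    fake_now[0] = t
    decisions.append(gate.should_fire())
print(decisions)  # [True, False, True, False, True]
```

With five utterances in eleven seconds, only three would reach the LLM gateway; a longer cooldown trades hint freshness for lower cost.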

License

This project is source-available for non-commercial use under the PolyForm Noncommercial 1.0.0 license. You may freely use, modify, self-host, and share it for personal projects, study, research, evaluation, demos, hobby work, or any other non-commercial purpose.

Commercial use & paid integrations

For deploying AI Sales Copilot inside a real sales process, embedding it into a commercial CRM, building it into a product, or commissioning paid integration / quality / on-call work — please get in touch. A commercial license is straightforward to arrange.

The non-commercial restriction exists so the author can sustain and improve the project when it is used in revenue-generating contexts.


Contributing

This was built as a hack/demo, not as a community project. Feel free to fork; PR review cadence is best-effort. Issues with reproduction steps welcome.
