RushTalk

A real-time voice & text communication platform for gamers
Built with Rust, Go, Svelte 5, and Tauri 2.0



Why I built this

I wanted to find out, end-to-end, what makes a voice-chat app fast — not just "use WebRTC and call it a day". Discord and TeamSpeak feel snappy because every layer is tuned: the audio thread doesn't allocate, the jitter buffer adapts to the network, the codec is configured for voice not music. Reading about it isn't the same as making it work.

So I built RushTalk as a way to prove I could:

  1. Write a real-time audio engine in Rust without leaning on a black-box library. The capture path is zero-allocation, runs on a dedicated thread with elevated priority, and the resampler (resampler.rs) is hand-written linear interpolation that fits in 100 lines.
  2. Design a Go backend with Clean Architecture that doesn't collapse under its own ceremony. Domain entities are pure Go, repos are interfaces, application services orchestrate, infrastructure is swappable. Took some discipline to keep the layering honest.
  3. Ship a desktop app that doesn't feel like a webview — Tauri 2.0 with proper IPC, deep-link OAuth callbacks, native file dialogs, system tray. ~20MB binary instead of Electron's 200MB.
  4. Cover the boring stuff that actually matters — Prometheus metrics with bounded cardinality, audit logging for sensitive ops, OpenAPI spec, i18n (en + pt-PT), error boundaries, CI/CD, K8s manifests with HPA + probes.

Trade-offs I made consciously:

  • No echo cancellation. Adds 5-10ms of latency for a feature most gamers don't need (they wear headsets). Documented the choice rather than shipping a half-baked AEC.
  • LiveKit instead of raw libwebrtc. I wanted to focus on the audio engine and backend architecture, not on TURN server tuning. LiveKit is just transport here — the encoder, processing pipeline, and mixer are all mine.
  • Hand-written OpenAPI YAML instead of swag/ogen. Spec is the source of truth, no codegen step, no extra dep. The trade-off is manual maintenance — fine for a project this size.
  • Linear resampling instead of sinc/cubic/rubato. Voice energy sits far below the 24 kHz Nyquist limit of a 48 kHz stream, so aliasing is inaudible in practice. One f32 of state, zero deps.

What I learned: real-time audio is humbling, OAuth flows have more edge cases than you'd think (auto-accept on mutual friend request was a fun one to discover), and Tauri's deep-link plugin requires registering a custom URL scheme with the OS — so OAuth flows only work in a packaged build, not tauri dev.


Overview

RushTalk is a full-stack, production-grade voice communication application — think Discord, rebuilt from scratch with a focus on low-latency audio and modern architecture. It features a custom zero-allocation Rust audio engine with sub-60ms latency, a Clean Architecture Go backend, and a Svelte 5 desktop client powered by Tauri 2.0.

Key Highlights

  • Custom real-time audio engine in Rust with adaptive jitter buffer, Opus encoding, and packet loss concealment
  • Full-stack monorepo spanning 4 languages (Rust, Go, TypeScript, SQL) with 12,000+ lines of code
  • Production-ready infrastructure — Docker, Kubernetes, CI/CD pipelines, monitoring
  • All tests passing across 3 test suites (Rust, Go, Vitest)

Architecture

flowchart TB
    subgraph DESKTOP["🖥️ Desktop Client — Tauri 2.0"]
        SVELTE["Svelte 5 Frontend<br/>(18 components, 12 stores)"]
        IPC["Tauri IPC<br/>(40+ commands)"]
        AUDIO["Rust Audio Engine<br/>(capture → pipeline → Opus)<br/>(LiveKit → mixer → playback)"]
        SVELTE -.IPC.-> IPC
        IPC --> AUDIO
        AUDIO --> IPC
    end

    subgraph SFU["☁️ LiveKit SFU (WebRTC)"]
        ROUTER["Audio track routing<br/>per-room, per-participant"]
    end

    subgraph BACKEND["🟢 Go Backend — Clean Architecture"]
        HTTP["Echo HTTP Handlers<br/>(auth, friends, reactions,<br/>channels, voice, files)"]
        WS["WS Hub<br/>(presence, typing,<br/>messages, voice state,<br/>friends, reactions)"]
        APP["Application Services<br/>(auth, oauth, friendship)"]
        DOMAIN["Domain Layer<br/>(entities, repos,<br/>permissions bitmask)"]
        INFRA["Infrastructure<br/>(PostgreSQL pgx v5,<br/>Redis, MinIO S3,<br/>LiveKit Twirp Admin)"]
        HTTP --> APP
        WS --> APP
        APP --> DOMAIN
        DOMAIN --> INFRA
    end

    AUDIO <-.WebRTC audio.-> ROUTER
    SVELTE <-.HTTPS.-> HTTP
    SVELTE <-.WebSocket.-> WS
    HTTP <-.token mint+room CRUD.-> SFU

    classDef desktop fill:#fef3c7,stroke:#f59e0b,color:#92400e
    classDef sfu fill:#ede9fe,stroke:#8b5cf6,color:#5b21b6
    classDef backend fill:#d1fae5,stroke:#10b981,color:#065f46
    class DESKTOP,SVELTE,IPC,AUDIO desktop
    class SFU,ROUTER sfu
    class BACKEND,HTTP,WS,APP,DOMAIN,INFRA backend

Tech Stack

| Layer | Technology | Purpose |
|---|---|---|
| Desktop Runtime | Tauri 2.0 | Native wrapper, IPC bridge, auto-updater |
| Frontend | Svelte 5 + TypeScript | Reactive UI, state management |
| Audio Engine | Rust (cpal + audiopus) | Capture, processing, Opus codec, playback |
| Voice Transport | LiveKit (WebRTC SFU) | Multi-user voice routing, TURN/STUN |
| Backend API | Go (Echo framework) | REST endpoints, business logic |
| Real-time Events | WebSocket + Redis Pub/Sub | Chat, presence, typing, voice state |
| Database | PostgreSQL 16 | Users, servers, channels, messages (16 tables) |
| Cache / Sessions | Redis 7 | JWT blocklist, rate limiting, presence |
| Object Storage | MinIO (S3-compatible) | File uploads, avatars |
| Monitoring | Prometheus + Grafana | Metrics, alerting |
| CI/CD | GitHub Actions | Tests, multi-platform builds, Docker push |
| Deployment | Kubernetes + Docker | HPA, TLS (cert-manager), health checks |

Features

Voice Communication

  • Multi-user voice channels via LiveKit SFU with < 60ms latency
  • Custom audio processing pipeline: high-pass filter, noise suppression, AGC, VAD
  • Opus encoding at 32kbps with adaptive Forward Error Correction
  • Adaptive jitter buffer (EMA-based, 10–80ms range)
  • Packet Loss Concealment — repeats last decoded frame for up to 50ms
  • Speaking indicators with real-time animation
  • Per-user volume control and mute/deafen
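
As a rough illustration of the EMA-based adaptive jitter buffer mentioned above, the sketch below tracks a moving average of inter-arrival jitter and clamps the resulting playout delay to the 10–80 ms range. It is not taken from the RushTalk source: the type name `JitterEstimator`, the smoothing factor, and the "3 × jitter" headroom rule are all assumptions chosen for the example.

```rust
/// Hypothetical jitter estimator (illustrative only, not RushTalk's code).
/// Keeps an exponential moving average (EMA) of how far packet spacing
/// deviates from the nominal 10 ms frame interval.
pub struct JitterEstimator {
    ema_jitter_ms: f32,
    alpha: f32, // EMA smoothing factor (assumed value, e.g. 0.125)
    last_arrival_ms: Option<f64>,
}

pub const FRAME_MS: f64 = 10.0; // nominal packet spacing
pub const MIN_DELAY_MS: f32 = 10.0; // lower bound from the feature list
pub const MAX_DELAY_MS: f32 = 80.0; // upper bound from the feature list

impl JitterEstimator {
    pub fn new(alpha: f32) -> Self {
        Self { ema_jitter_ms: 0.0, alpha, last_arrival_ms: None }
    }

    /// Record a packet arrival time in milliseconds (monotonic clock).
    pub fn on_packet(&mut self, arrival_ms: f64) {
        if let Some(prev) = self.last_arrival_ms {
            // Absolute deviation from the expected 10 ms spacing.
            let deviation = ((arrival_ms - prev) - FRAME_MS).abs() as f32;
            self.ema_jitter_ms += self.alpha * (deviation - self.ema_jitter_ms);
        }
        self.last_arrival_ms = Some(arrival_ms);
    }

    /// Target playout delay: the minimum plus a multiple of observed jitter
    /// (the factor 3 is an assumption), clamped to the 10..=80 ms range.
    pub fn target_delay_ms(&self) -> f32 {
        (MIN_DELAY_MS + 3.0 * self.ema_jitter_ms).clamp(MIN_DELAY_MS, MAX_DELAY_MS)
    }
}
```

On a clean network the target stays at the 10 ms floor; bursty arrivals push it toward the 80 ms ceiling, and it decays back as the average settles.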

Text Chat

  • Per-channel message history with keyset pagination
  • Real-time delivery via WebSocket
  • Message grouping (consecutive messages from same author within 5 minutes)
  • File attachments via S3-compatible storage (100 MB cap, MIME whitelist)
  • Typing indicators with throttled emit and a 5-second TTL, after which stale entries are garbage-collected
  • Message reactions — emoji chips with optimistic toggle, inline picker, real-time updates via WS
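
The message-grouping rule above can be sketched as follows. This is only an illustration: the `Message` struct and its field names are hypothetical, and the sketch assumes the 5-minute window is measured against the immediately preceding message rather than the first message of the group.

```rust
/// Hypothetical message shape; field names are illustrative.
#[derive(Debug, Clone)]
pub struct Message {
    pub author_id: u64,
    pub sent_at_secs: u64, // Unix timestamp in seconds
}

const GROUP_WINDOW_SECS: u64 = 5 * 60;

/// For each message, report whether it starts a new visual group.
/// A message joins the previous group when it has the same author and was
/// sent no more than five minutes after the previous message.
pub fn group_starts(messages: &[Message]) -> Vec<bool> {
    let mut starts = Vec::with_capacity(messages.len());
    for (i, msg) in messages.iter().enumerate() {
        let continues = i > 0 && {
            let prev = &messages[i - 1];
            prev.author_id == msg.author_id
                && msg.sent_at_secs.saturating_sub(prev.sent_at_secs) <= GROUP_WINDOW_SECS
        };
        starts.push(!continues);
    }
    starts
}
```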

Server & Channel Management

  • Create and join servers via invite codes
  • Channel types: text, voice, category
  • Role-based permission system (64-bit bitmask with channel overwrites)
  • Member management with role assignment

Social

  • Friendships — send by username, auto-accept on mutual request, accept / deny / cancel / unfriend / block flow, real-time WS notifications, badge for incoming requests on the user panel
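
The auto-accept behaviour is easiest to see as a small state machine. The sketch below is an in-memory stand-in written for illustration, not the backend's actual service (which is in Go); `FriendGraph` and `Friendship` are hypothetical names, and deny, cancel, and block are left out.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Friendship {
    Pending,
    Accepted,
}

/// In-memory stand-in for a friendships table.
/// Keys are (requester_id, addressee_id).
#[derive(Default)]
pub struct FriendGraph {
    edges: HashMap<(u64, u64), Friendship>,
}

impl FriendGraph {
    /// Send a request from `from` to `to` and return the resulting state.
    pub fn send_request(&mut self, from: u64, to: u64) -> Friendship {
        // If the other user already sent a request, a mutual request
        // is treated as acceptance instead of creating a second row.
        if let Some(state) = self.edges.get_mut(&(to, from)) {
            *state = Friendship::Accepted;
            return Friendship::Accepted;
        }
        // Otherwise record (or keep) the request in its current state.
        *self.edges.entry((from, to)).or_insert(Friendship::Pending)
    }
}
```

Repeating the same request is idempotent here: it returns the existing state rather than creating a duplicate.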

Audio Settings

  • Input/output device selection with live switching
  • Automatic sample rate conversion — linear-interpolation resampler engages when the device's native rate isn't 48kHz (44.1k/96k/16k/etc.) so the rest of the pipeline stays at a fixed frame size
  • Hot-swap watchdog — automatically rebuilds the capture stream when cpal reports a dead device (mic unplugged, driver crash); user is notified via toast and recovery is attempted in-place
  • Noise suppression levels (Off / Low / Standard / Aggressive)
  • Voice activation detection with configurable threshold
  • Push-to-talk mode
  • Real-time diagnostics (underruns, PLC frames, latency)
  • Mic loopback test
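
For readers curious what a streaming linear-interpolation resampler looks like, here is a minimal sketch. It is not the project's resampler.rs: the `LinearResampler` name, the state layout, and the `Vec` output are assumptions made for the example, and a real-time version would write into a pre-allocated buffer to stay allocation-free.

```rust
/// Hypothetical streaming linear-interpolation resampler (mono, f32).
pub struct LinearResampler {
    step: f64, // input samples consumed per output sample (in_rate / out_rate)
    pos: f64,  // fractional read position, relative to `prev`
    prev: f32, // last input sample of the previous chunk
}

impl LinearResampler {
    pub fn new(in_rate: u32, out_rate: u32) -> Self {
        Self { step: in_rate as f64 / out_rate as f64, pos: 0.0, prev: 0.0 }
    }

    /// Resample one chunk of input, appending the result to `out`.
    /// Reserve capacity on `out` up front to avoid allocating per call.
    pub fn process(&mut self, input: &[f32], out: &mut Vec<f32>) {
        // Treat the signal as: index 0 = `prev`, index i + 1 = input[i].
        let n = input.len() as f64;
        while self.pos < n {
            let idx = self.pos.floor() as usize;
            let frac = (self.pos - idx as f64) as f32;
            let a = if idx == 0 { self.prev } else { input[idx - 1] };
            let b = input[idx];
            out.push(a + (b - a) * frac);
            self.pos += self.step;
        }
        // Carry the fractional position and last sample into the next chunk.
        self.pos -= n;
        if let Some(&last) = input.last() {
            self.prev = last;
        }
    }
}
```

Because the first output sample is interpolated from the initial `prev` of 0.0, this sketch introduces a one-sample delay, which is negligible for voice.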

Gaming Overlay

  • Always-on-top transparent window
  • Global hotkey toggle (Shift + `)
  • Shows active voice channel participants
  • Configurable opacity and position

Authentication

  • Email/password registration and login
  • OAuth: Steam (OpenID 2.0) and Google (OAuth 2.0 + PKCE) — browser-mediated with deep-link callback (rushtalk://auth/callback)
  • JWT RS256 with 15-minute access tokens and 7-day sliding refresh
  • Automatic token refresh on 401

Configuring OAuth providers

OAuth is optional — the API runs without it. To enable:

Google (OAuth 2.0 + PKCE, OIDC)

  1. Google Cloud Console → Create Credentials → OAuth client ID → Web application
  2. Authorized redirect URI: http://localhost:8080/auth/oauth/google/callback (prod: your public API domain)
  3. Set env: GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, GOOGLE_REDIRECT_URI

Steam (OpenID 2.0)

  1. Steam Web API key for your domain
  2. Set env: STEAM_API_KEY, STEAM_REALM (e.g. http://localhost:8080), OAUTH_STEAM_CALLBACK_URL (absolute URL Steam will redirect back to)

Desktop deep link

  • OAUTH_DESKTOP_CALLBACK_URL=rushtalk://auth/callback (the desktop app registers this scheme via tauri-plugin-deep-link)

The backend holds client secrets, so the desktop build ships no provider credentials. PKCE is enforced on the Google flow for defence-in-depth.

Internationalisation

  • Built on svelte-i18n with English and European Portuguese locales out of the box
  • Auto-detect: persisted user choice → browser locale → English fallback
  • All user-facing components localised (login, register, OAuth callback, chat, friends panel, settings modal, channel sidebar, error boundary, toasts)

Observability

  • /metrics Prometheus endpoint with per-route HTTP histograms, WS connections gauge, voice rooms gauge, event-publish counter
  • Structured audit log for sensitive ops (kicks, role assignments, channel/message deletes) — audit:true tag for downstream filtering
  • Tauri-side audio diagnostics (underruns, PLC frames) refreshed every 2s in Settings

Developer experience

  • OpenAPI 3.0 spec at /openapi.yaml, Swagger UI at /docs (live exploration of all 21 endpoints; no codegen step required)
  • CONTRIBUTING.md with local setup, conventions, and test commands
  • Tests covering the newer modules (Go service tests, Vitest store tests, Rust resampler tests)

Platform Support

  • Windows (x86_64)
  • macOS (ARM64 + Intel)
  • Linux (x86_64 AppImage)

Project Structure

RushTalk/
├── apps/
│   ├── api/                              # Go backend
│   │   ├── cmd/server/main.go            # Entry point, dependency injection
│   │   ├── config/                       # Environment configuration
│   │   └── internal/
│   │       ├── domain/                   # Business entities, repository interfaces
│   │       │   ├── user/                 # User, OAuth, Presence
│   │       │   ├── server/               # Server, Member, Role, Permissions
│   │       │   ├── channel/              # Channel, Message, Overwrite
│   │       │   └── voice/                # VoiceRoom, VoiceParticipant
│   │       ├── application/              # Use-case services
│   │       │   ├── auth/                 # Login, OAuth, JWT refresh
│   │       │   ├── user/                 # Profile management
│   │       │   ├── server/               # Server CRUD, membership
│   │       │   ├── channel/              # Channel CRUD, messages
│   │       │   └── voice/                # Room join/leave, LiveKit tokens
│   │       ├── infrastructure/           # External adapters
│   │       │   ├── postgres/             # pgx v5 repositories + migrations
│   │       │   ├── redis/                # Session blocklist, rate limit
│   │       │   ├── storage/              # S3/MinIO client
│   │       │   └── livekit/              # Token generation
│   │       └── interface/
│   │           ├── http/                 # Echo handlers + middleware
│   │           └── ws/                   # WebSocket hub + events
│   │
│   └── desktop/                          # Tauri desktop application
│       ├── src/                          # Svelte 5 frontend
│       │   ├── routes/                   # Pages (main, login, register, overlay)
│       │   └── lib/
│       │       ├── components/           # 18 Svelte components
│       │       │   ├── chat/             # ChatPanel
│       │       │   ├── layout/           # AppLayout, ServerRail, ChannelSidebar
│       │       │   ├── overlay/          # OverlayWindow
│       │       │   ├── ui/               # SettingsModal, ContextMenu
│       │       │   └── voice/            # VoicePanel, VoiceChannelUsers
│       │       ├── stores/               # 9 Svelte stores (auth, voice, messages...)
│       │       ├── services/             # API client, WebSocket, Tauri bridge
│       │       └── types/                # TypeScript type definitions
│       └── src-tauri/                    # Rust backend for Tauri
│           └── src/
│               ├── commands/             # IPC command handlers (40+)
│               └── state/                # AppState (AudioEngine + VoiceRoom)
│
├── crates/                               # Rust workspace
│   ├── rushtalk-audio/                   # Audio engine (1,400+ LOC)
│   │   └── src/
│   │       ├── capture.rs                # Mic input, spin-yield loop
│   │       ├── playback.rs               # Speaker output, PLC, mixer tick
│   │       ├── processing/               # Pipeline, gain, VAD, noise suppression
│   │       ├── codec/                    # Opus encoder/decoder
│   │       ├── buffer/                   # Adaptive jitter buffer
│   │       ├── mixer.rs                  # Multi-user audio mixer
│   │       └── diagnostics.rs            # Runtime metrics
│   ├── rushtalk-voice/                   # LiveKit integration (650+ LOC)
│   │   └── src/
│   │       ├── connection.rs             # WebRTC publish/subscribe
│   │       └── room.rs                   # VoiceRoom management
│   └── rushtalk-protocol/                # Shared types (350 LOC)
│       └── src/
│           ├── models.rs                 # Domain models
│           ├── events.rs                 # Event definitions
│           └── errors.rs                 # Error types
│
├── deploy/
│   ├── k8s/                              # Kubernetes manifests
│   ├── prometheus/                       # Monitoring configuration
│   └── livekit/                          # LiveKit server config
│
├── .github/workflows/
│   ├── ci.yml                            # Test pipeline (Go + Rust + Vitest)
│   └── release.yml                       # Multi-platform build + Docker push
│
├── docker-compose.yml                    # Local dev stack (7 services)
├── Makefile                              # Development commands
└── Cargo.toml                            # Rust workspace configuration

Audio Engine Deep Dive

The audio engine is the core differentiator of this project — a custom, low-latency audio pipeline built entirely in Rust.

TX (Transmit) Path

Microphone (cpal, 48kHz mono, 5ms buffers)
    │
    ▼
Raw PCM Ring Buffer (lock-free SPSC, 200ms capacity)
    │
    ▼
CaptureProcessor (dedicated thread, elevated priority)
    ├── High-pass filter (100Hz cutoff)
    ├── Noise suppression (configurable level)
    ├── Automatic Gain Control
    ├── Voice Activity Detection
    ├── Input volume scaling
    │
    ├──► LiveKit NativeAudioSource (PCM i16, WebRTC handles Opus internally)
    │
    └──► Opus Encoder (32kbps, adaptive FEC) → Encoded Ring Buffer
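
The first stage of the TX pipeline, a 100 Hz high-pass filter, could be implemented in several ways. The sketch below uses a simple one-pole RC filter as an assumption; the actual pipeline may use a different filter design. `HighPass` is a hypothetical name.

```rust
/// Hypothetical one-pole (RC-style) high-pass filter.
/// Removes DC offset and low-frequency rumble below the cutoff.
pub struct HighPass {
    alpha: f32,
    prev_in: f32,
    prev_out: f32,
}

impl HighPass {
    pub fn new(cutoff_hz: f32, sample_rate_hz: f32) -> Self {
        let rc = 1.0 / (2.0 * std::f32::consts::PI * cutoff_hz);
        let dt = 1.0 / sample_rate_hz;
        Self { alpha: rc / (rc + dt), prev_in: 0.0, prev_out: 0.0 }
    }

    /// Filter a frame in place; no allocation, so it is safe on the
    /// capture thread's hot path.
    pub fn process_in_place(&mut self, frame: &mut [f32]) {
        for s in frame.iter_mut() {
            // y[n] = alpha * (y[n-1] + x[n] - x[n-1])
            let out = self.alpha * (self.prev_out + *s - self.prev_in);
            self.prev_in = *s;
            self.prev_out = out;
            *s = out;
        }
    }
}
```

Constructing it with `HighPass::new(100.0, 48_000.0)` matches the cutoff and sample rate shown in the diagram above.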

RX (Receive) Path

LiveKit NativeAudioStream (per remote participant)
    │
    ▼
RX Bridge Task (tokio, 10ms polling)
    │
    ▼
Decoded Audio Ring Buffer (lock-free SPSC, 64 packets)
    │
    ▼
PlaybackProcessor (dedicated thread, elevated priority)
    ├── Per-user frame accumulation
    ├── Packet Loss Concealment (repeat last frame, max 50ms)
    ├── Multi-user AudioMixer (time-based, 10ms ticks)
    ├── Stale user garbage collection (2s timeout)
    │
    ▼
Mixed PCM Ring Buffer (lock-free SPSC, 200ms capacity)
    │
    ▼
Speaker (cpal, 48kHz, volume scaling + deafen)
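
The "repeat last frame, max 50ms" concealment step can be sketched as below. This is an illustrative reconstruction under stated assumptions, not the project's playback.rs: `Concealer` is a hypothetical type, frames are 10 ms of mono i16 PCM, and the sketch falls back to silence once the 50 ms budget (five frames) is used up.

```rust
pub const FRAME_SAMPLES: usize = 480; // 10 ms at 48 kHz
pub const MAX_PLC_FRAMES: u32 = 5; // 5 x 10 ms = 50 ms

/// Hypothetical per-user packet loss concealer.
pub struct Concealer {
    last_frame: [i16; FRAME_SAMPLES],
    has_last: bool,
    consecutive_losses: u32,
}

impl Concealer {
    pub fn new() -> Self {
        Self { last_frame: [0; FRAME_SAMPLES], has_last: false, consecutive_losses: 0 }
    }

    /// Produce the frame for this 10 ms tick. `received` is `None` when no
    /// packet arrived in time. Returns true if the output is a repeated frame.
    pub fn next_frame(
        &mut self,
        received: Option<&[i16; FRAME_SAMPLES]>,
        out: &mut [i16; FRAME_SAMPLES],
    ) -> bool {
        match received {
            Some(frame) => {
                out.copy_from_slice(frame);
                self.last_frame.copy_from_slice(frame);
                self.has_last = true;
                self.consecutive_losses = 0;
                false
            }
            // Within the 50 ms budget: replay the last good frame.
            None if self.has_last && self.consecutive_losses < MAX_PLC_FRAMES => {
                out.copy_from_slice(&self.last_frame);
                self.consecutive_losses += 1;
                true
            }
            // Budget exhausted (or nothing received yet): output silence.
            None => {
                out.fill(0);
                false
            }
        }
    }
}
```

All buffers are fixed-size arrays, so the sketch performs no heap allocation per tick.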

Performance Characteristics

| Metric | Value |
|---|---|
| End-to-end latency | < 60ms (same region) |
| Frame size | 480 samples (10ms @ 48kHz) |
| Hardware buffer | 240 samples (5ms) when supported |
| Opus bitrate | 32 kbps |
| Jitter buffer range | 10–80ms (adaptive) |
| Hot-path allocations | Zero (pre-allocated buffers) |
| Thread priority | Max (via thread-priority crate) |
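
The figures above are related by simple arithmetic, which the following sketch spells out. The constant and function names are hypothetical; the 200 ms ring capacity comes from the path diagrams, and the bytes-per-frame value is an average implied by the bitrate, not a measured packet size.

```rust
pub const SAMPLE_RATE_HZ: u32 = 48_000;

/// Samples per channel for a duration given in milliseconds.
pub const fn samples_for_ms(ms: u32) -> u32 {
    SAMPLE_RATE_HZ / 1000 * ms
}

pub const FRAME_SAMPLES: u32 = samples_for_ms(10); // 480 samples
pub const HW_BUFFER_SAMPLES: u32 = samples_for_ms(5); // 240 samples
pub const RING_SAMPLES: u32 = samples_for_ms(200); // 9,600 samples

/// Average encoded payload per frame at a constant bitrate.
pub const fn bytes_per_frame(bitrate_bps: u32, frame_ms: u32) -> u32 {
    bitrate_bps / 8 * frame_ms / 1000
}
```

At 32 kbps a 10 ms frame averages about 40 bytes of Opus payload, and a 200 ms ring buffer holds 20 frames.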

Database Schema

16 tables across 3 migrations (310 lines of SQL):

-- Core
Users, UserOAuth, UserPresence, Sessions

-- Social
Servers, ServerMembers, ServerRoles, Friendships, Invites

-- Communication
Channels, ChannelOverwrites, Messages, Attachments, MessageReactions

-- Voice
VoiceRooms, VoiceParticipants

Permission system uses a 64-bit bitmask with role-based resolution and per-channel overwrites (allow/deny), computed entirely in memory — no DB round-trip for authorization checks.
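
A minimal sketch of how such a bitmask resolution might work is shown below. The backend itself is written in Go; this Rust version is only an illustration. The flag names, bit positions, and the rule that an explicit allow wins over a deny on the same bit are all assumptions.

```rust
pub type Permissions = u64;

// Hypothetical flags; the real bit assignments are not documented here.
pub const VIEW_CHANNEL: Permissions = 1 << 0;
pub const SEND_MESSAGES: Permissions = 1 << 1;
pub const KICK_MEMBERS: Permissions = 1 << 2;

/// A per-channel overwrite: bits to grant and bits to revoke.
#[derive(Clone, Copy)]
pub struct Overwrite {
    pub allow: Permissions,
    pub deny: Permissions,
}

/// Resolve effective permissions in memory:
/// 1. OR together the permissions of every role the member holds.
/// 2. Clear denied bits, then set allowed bits from channel overwrites.
pub fn resolve(role_perms: &[Permissions], overwrites: &[Overwrite]) -> Permissions {
    let base = role_perms.iter().fold(0u64, |acc, p| acc | *p);
    let (allow, deny) = overwrites
        .iter()
        .fold((0u64, 0u64), |(a, d), o| (a | o.allow, d | o.deny));
    (base & !deny) | allow
}

/// True when every bit in `required` is present in `perms`.
pub fn has(perms: Permissions, required: Permissions) -> bool {
    (perms & required) == required
}
```

Because the whole check reduces to a few bitwise operations on values already held in memory, an authorization decision needs no database query.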


Getting Started

Prerequisites

  • Rust 1.75+ (with cargo — required for desktop app + audio crate)
  • Go 1.24+ (backend API)
  • Node.js 22+ with npm (Svelte frontend)
  • Docker + Docker Compose (Postgres, Redis, MinIO, LiveKit)
  • golang-migrate CLI for schema migrations: go install -tags 'postgres' github.com/golang-migrate/migrate/v4/cmd/migrate@latest

Run make doctor to verify your environment before going further — it checks every prerequisite, prints what's missing, and exits non-zero so it's CI-friendly too.

Quick Start (5 minutes)

# 1. Clone + verify environment
git clone https://github.com/Pedrom2002/rushtalk.git
cd rushtalk
make doctor

# 2. Spin up infrastructure (≈30s on first run while images pull)
make dev-up
make migrate-up

# 3. Seed demo data — creates a "demo" user (password: demodemo),
#    a "RushTalk Demo" server with #general + Lobby voice channel,
#    plus a few seed messages so the chat isn't empty on first launch.
make seed

# 4. Start the desktop app (compiles the Rust crates the first time — ~2 min)
cd apps/desktop && npm install && npm run tauri dev

Then log in with demo@rushtalk.dev / demodemo. The API runs in the api container; if you want to iterate on the Go code instead, stop it with docker compose stop api and run cd apps/api && go run ./cmd/server.

Original step-by-step

If you'd rather not use the seed script:

# Start infrastructure (Postgres, Redis, MinIO, LiveKit, Prometheus, Grafana)
make dev-up

# Run database migrations
make migrate-up

# Start the Go API
cd apps/api && go run ./cmd/server

# Start the desktop app (in a new terminal)
cd apps/desktop && npm install && npm run tauri dev

Available Commands

make dev-up          # Start Docker services
make dev-down        # Stop Docker services
make dev-logs        # View service logs
make test            # Run all test suites (Go + Rust + Vitest)
make build-api       # Build Go API binary
make build-desktop   # Build Tauri desktop app
make migrate-up      # Run database migrations
make migrate-down    # Rollback migrations
make k8s-apply       # Deploy to Kubernetes

Testing

# All tests
make test

# Individual suites
cargo test -p rushtalk-audio -p rushtalk-protocol    # 25/25 pass
cd apps/api && go test ./...                          # All packages pass (auth, oauth, friendship, server, livekit, handler)
cd apps/desktop && npm test                           # 28/28 pass
cd apps/desktop && npm run check                      # 0 errors, 584 files

| Suite | Framework | Tests | Status |
|---|---|---|---|
| Rust Audio | cargo test | 25 (incl. 3 resampler) | All passing |
| Go Backend | go test (table-driven) | All packages | All passing |
| Frontend | Vitest | 28 (voice, friends, typing, reactions) | All passing |
| Type Check | svelte-check | 584 files | Zero errors |

Deployment

Docker

# Build API image
docker build -t rushtalk-api -f apps/api/Dockerfile .

# Run full stack
docker compose up -d

The API image uses a multi-stage build with a distroless runtime (~20 MB).

Kubernetes

# Apply manifests
make k8s-apply

# Includes:
# - Deployment with HPA (1-10 replicas, 70% CPU target)
# - Nginx Ingress with cert-manager TLS
# - Health checks (liveness + readiness probes)
# - Prometheus metrics endpoint

CI/CD

The GitHub Actions pipeline runs 3 parallel test suites on every push and PR. On version tags (v*), it builds:

  • Windows .exe installer
  • macOS .dmg (ARM64 + Intel)
  • Linux AppImage
  • Docker image pushed to GHCR

Configuration

Environment Variables (Backend)

| Variable | Description | Default |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | postgres://... |
| REDIS_URL | Redis connection string | redis://localhost:6379 |
| JWT_PRIVATE_KEY_PATH | RS256 private key for signing | keys/private.pem |
| JWT_PUBLIC_KEY_PATH | RS256 public key for verification | keys/public.pem |
| LIVEKIT_API_KEY | LiveKit API key | |
| LIVEKIT_API_SECRET | LiveKit API secret | |
| S3_ENDPOINT | MinIO/S3 endpoint | localhost:9000 |
| CORS_ORIGINS | Allowed CORS origins | http://localhost:1420 |

Logs

Application logs are written to %APPDATA%/rushtalk/rushtalk.log (Windows) on release builds. In development, logs print to stdout.

Set RUST_LOG for granular control:

RUST_LOG=rushtalk=debug,rushtalk_audio=trace cargo tauri dev

Technical Decisions

| Decision | Choice | Rationale |
|---|---|---|
| Audio codec | Opus via audiopus 0.2 | Industry standard for voice, adaptive FEC |
| Audio I/O | cpal 0.15 | Cross-platform (WASAPI/ALSA/CoreAudio) |
| Ring buffers | rtrb (lock-free SPSC) | Zero-allocation, real-time safe |
| Desktop framework | Tauri 2.0 | Smaller than Electron, native Rust backend |
| Frontend | Svelte 5 | Minimal runtime, reactive stores, fast |
| Backend framework | Echo (Go) | High performance, built-in middleware |
| Database driver | pgx v5 | Statement caching, connection pooling |
| Voice transport | LiveKit | Open-source SFU, handles TURN/STUN |
| Auth tokens | JWT RS256 | Stateless verification, secure refresh flow |
| Permissions | 64-bit bitmask | In-memory resolution, no DB round-trip |
| Release builds | LTO + strip + codegen-units=1 | Maximum optimization for production |

License

MIT


Built with Rust, Go, Svelte, and a lot of audio engineering.
