CarryTalk

CarryTalk is a Tauri desktop app for real-time transcription and translation. It combines a Svelte-based desktop UI with a Rust-powered native layer to capture audio, stream speech data, and persist session output locally.

Overview

CarryTalk is designed around a live transcription workflow:

capture audio from available sources
start, pause, resume, and stop transcription sessions
display transcript updates in real time
optionally show translated output alongside the original text
save session data locally for later access and recovery
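
The session controls listed above can be read as a small state machine. The sketch below is only an illustration of that idea: the state names, action names, and the `nextState` helper are assumptions made for this example and are not taken from the CarryTalk codebase.

```typescript
// Hypothetical state and action names; the real app may model sessions differently.
type SessionState = "idle" | "running" | "paused" | "stopped";
type SessionAction = "start" | "pause" | "resume" | "stop";

// Allowed transitions for each state. Any action not listed for a state is rejected.
const transitions: Record<SessionState, Partial<Record<SessionAction, SessionState>>> = {
  idle: { start: "running" },
  running: { pause: "paused", stop: "stopped" },
  paused: { resume: "running", stop: "stopped" },
  stopped: { start: "running" },
};

// Return the state that follows `current` after `action`, or throw if the action is not allowed.
function nextState(current: SessionState, action: SessionAction): SessionState {
  const next = transitions[current][action];
  if (next === undefined) {
    throw new Error(`Cannot ${action} a session that is ${current}`);
  }
  return next;
}

// Walk through one full session: start, pause, resume, stop.
let state: SessionState = "idle";
for (const action of ["start", "pause", "resume", "stop"] as SessionAction[]) {
  state = nextState(state, action);
}
console.log(state); // "stopped"
```

Keeping the transition table in one place makes invalid requests, such as pausing a session that never started, fail loudly instead of leaving the UI and the native layer out of sync.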

The current codebase includes a Soniox-based provider flow, desktop settings management, localized UI resources, and a native session pipeline built with Tauri and Rust.

Features

Real-time session lifecycle with start, pause, resume, and stop controls
Live transcript rendering with timestamps and support for original and translated text
Audio source configuration for microphone, system audio, or mixed capture modes depending on runtime capabilities
Device selection support for available audio inputs and outputs
Provider and API key management through the settings flow
Local session persistence using session folders, manifests, and JSONL transcript parts
Interrupted session recovery on app startup
Desktop-friendly UI with theme and language preferences
Built-in localization resources for English and Vietnamese
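
To make the JSONL transcript parts mentioned above more concrete, here is a minimal sketch of writing and reading such a file. The `TranscriptSegment` shape and its field names are assumptions for illustration only; the actual on-disk schema lives in the native storage layer and may differ.

```typescript
// Hypothetical segment shape; field names are assumptions, not the app's real schema.
interface TranscriptSegment {
  startMs: number;
  endMs: number;
  original: string;
  translated?: string;
}

// Serialize segments as JSONL: one JSON object per line, each line newline-terminated.
function toJsonl(segments: TranscriptSegment[]): string {
  return segments.map((s) => JSON.stringify(s) + "\n").join("");
}

// Parse JSONL back into segments. Blank lines and lines that fail to parse are skipped.
function parseJsonl(text: string): TranscriptSegment[] {
  const segments: TranscriptSegment[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    try {
      segments.push(JSON.parse(line) as TranscriptSegment);
    } catch {
      // A partially written line (for example after a crash) is dropped
      // instead of failing the whole transcript part.
    }
  }
  return segments;
}
```

An append-only, line-oriented format suits interrupted session recovery: if the app exits mid-write, at most the last line is damaged and every earlier segment can still be read.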

Tech Stack

Frontend

Svelte 5
TypeScript
Vite
Tailwind CSS 4
Tauri JavaScript APIs

Native/Desktop

Tauri 2
Rust
Tokio
WebSocket-based streaming with tokio-tungstenite
Audio capture with cpal
Audio resampling with rubato
Local secret handling with aes-gcm and argon2

Installation

Prerequisites

Before running the app, make sure your environment satisfies the system requirements for building Tauri applications on your platform.

Clone the repository

git clone https://github.com/tuannt39/carry-talk.git
cd carry-talk

Install dependencies

npm install

Running the App

Frontend development server

npm run dev

Run the Tauri desktop app in development mode

npm run tauri -- dev

Type and Svelte checks

npm run check

Build the frontend

npm run build

Build the desktop app

npm run tauri -- build

Usage

Launch the application in development mode or from a built desktop bundle.
Open the settings panel and configure the current provider settings.
Add or update the required API key.
Choose the desired audio capture mode and device configuration.
Start a session to begin receiving live transcript updates.
View original and translated transcript text in the main transcript area.
Stop the session when finished. Session data is stored locally for recovery and listing.

Project Structure

carry-talk/
├── src/
│   ├── App.svelte                 # Application shell and startup flow
│   ├── main.ts                    # Frontend entrypoint
│   └── lib/
│       ├── components/            # UI components such as controls, settings, transcript view
│       ├── services/              # Tauri command and event wrappers
│       ├── stores/                # Frontend state stores
│       ├── i18n/                  # Localization resources
│       └── types/                 # Shared frontend types
├── src-tauri/
│   ├── src/
│   │   ├── main.rs                # Native entrypoint
│   │   ├── lib.rs                 # App bootstrap and shared state wiring
│   │   ├── commands.rs            # Tauri command surface
│   │   ├── session_manager.rs     # Session orchestration and streaming pipeline
│   │   ├── storage.rs             # Local session persistence and recovery
│   │   ├── settings.rs            # App settings persistence
│   │   └── secrets.rs             # Encrypted secret storage
│   └── tauri.conf.json            # Tauri app and build configuration
├── package.json                   # Frontend scripts and dependencies
└── README.md

How It Works

At a high level, CarryTalk follows this flow:

The frontend loads app settings, current session state, and audio runtime capabilities.
When a session starts, the backend manages audio capture, streaming, transcript buffering, and local storage.
Transcript and session events are emitted back to the frontend through Tauri events.
The UI renders incoming transcript segments and keeps the visible session state in sync.
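
The last two steps can be sketched as a pure update function that folds each incoming transcript event into the list the UI renders. The `TranscriptEvent` payload and `applyTranscriptEvent` helper are hypothetical; this README does not document the real event names or fields. In a Tauri frontend, a handler like this would typically be registered through `listen` from `@tauri-apps/api/event`, which is left out here so the example stays self-contained.

```typescript
// Hypothetical event payload; the real event fields are not documented in this README.
interface TranscriptEvent {
  segmentId: number;
  original: string;
  translated?: string;
}

// Insert a new segment, or replace an existing one with the same id,
// and keep segments ordered by id so the view renders them in sequence.
function applyTranscriptEvent(
  segments: TranscriptEvent[],
  event: TranscriptEvent,
): TranscriptEvent[] {
  const others = segments.filter((s) => s.segmentId !== event.segmentId);
  return [...others, event].sort((a, b) => a.segmentId - b.segmentId);
}

// Example: segment 1 arrives, then segment 0, then an updated version of segment 1.
let visible: TranscriptEvent[] = [];
visible = applyTranscriptEvent(visible, { segmentId: 1, original: "how are" });
visible = applyTranscriptEvent(visible, { segmentId: 0, original: "hello" });
visible = applyTranscriptEvent(visible, { segmentId: 1, original: "how are you" });
console.log(visible.map((s) => s.original)); // ["hello", "how are you"]
```

Returning a new array rather than mutating the old one fits store-based frontends, where assigning a new value is what triggers a re-render.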

Acknowledgements

This project was adapted from and inspired by the following projects:

Contributing

Contributions are welcome.

If you want to contribute:

Fork the repository.
Create a feature branch.
Make focused, reviewable changes.
Run the relevant local checks.
Open a pull request describing the change clearly.

License

This project is licensed under the MIT License.

Contact

For questions, feedback, or support, please open an issue on GitHub:

https://github.com/tuannt39/carry-talk/issues
