My Voice

Clone any voice and generate natural-sounding speech from text. Runs locally on your machine and requires cloning this repository.

What it does

My Voice lets you clone any voice from a short audio sample and generate speech from text. Upload 10-30 seconds of someone speaking, enter your text, and get natural-sounding audio in their voice.

Features

Voice Cloning - Clone any voice from a short audio sample
Text-to-Speech - Convert text into speech using the cloned voice
Batch Generation - Process multiple URLs at once with bulk import
Multi-language Support - 16+ languages including English, Spanish, French, German, Chinese, Japanese
URL Content Extraction - Fetch article text directly from URLs with paragraph preservation
Browser Recording - Record voice samples directly in the browser
100% Local - All processing happens on your machine, nothing leaves your computer
GPU Acceleration - Automatically uses CUDA (NVIDIA) or MPS (macOS) when available for faster generation

Quick Start

# Install ffmpeg and Python 3.11 (required)
brew install ffmpeg python@3.11  # macOS
# sudo apt install ffmpeg python3.11  # Linux
# choco install ffmpeg python311  # Windows

# Install Python dependencies
pip3 install TTS flask flask-cors pydub beautifulsoup4 requests

# Clone and run
git clone https://github.com/97115104/myvoice.git
cd myvoice
python3 server.py

First run downloads the XTTS model (~1.8GB). Server starts on http://localhost:5123.

Open the UI at http://localhost:5123/ui.
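To confirm the server is reachable before opening the UI, the health endpoint listed in the API Reference can be polled. The following is a minimal sketch using only the standard library. It assumes the endpoint answers with HTTP 200 when the server is ready; the response body format is not documented here, so the sketch ignores it. The function name `server_is_up` and the injectable `opener` parameter are illustrative, not part of the project.

```python
from urllib.error import URLError
from urllib.request import urlopen

BASE_URL = "http://localhost:5123"  # address given in the Quick Start


def server_is_up(base_url=BASE_URL, opener=urlopen, timeout=5):
    """Return True if GET /api/health answers with HTTP 200 (assumed contract)."""
    try:
        with opener(f"{base_url}/api/health", timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or a non-2xx reply all count as "down".
        return False
```

Calling `server_is_up()` right after starting `python3 server.py` may return False while the model is still downloading or loading, so retrying after a short wait is reasonable.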

How to use

Single Generation

Start the server - Run python3 server.py to start the local TTS server
Open the UI - Go to http://localhost:5123/ui in your browser
Provide a voice sample - Upload an audio file (MP3, M4A, WAV) or record directly in the browser. 10-30 seconds of clear speech works best.
Enter your text - Type the text you want to convert, or fetch content from a URL.
Generate - Click generate and wait for the AI to synthesize your audio.
Download - Download the audio file.

Batch Generation

Open the batch page - Go to http://localhost:5123/batch in your browser
Upload a voice sample - Same as single generation
Add URLs - Either add URLs one at a time with "Preview & Add", or use "Bulk Import" to paste multiple URLs at once (one per line)
Set output directory - Choose where to save the generated files
Generate All - Click to process all queued items sequentially
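The "one per line" format for Bulk Import can be illustrated with a small parser. This is a sketch of how pasted text might be turned into a clean queue, not the code the batch page actually runs. Skipping non-URL lines and dropping duplicates are assumptions about sensible behavior, and `parse_bulk_urls` is a hypothetical name.

```python
from urllib.parse import urlparse


def parse_bulk_urls(pasted):
    """Split pasted text into unique http(s) URLs, one per line, order kept."""
    seen = set()
    urls = []
    for line in pasted.splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # ignore blank lines
        parts = urlparse(candidate)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip lines that are not web URLs
        if candidate not in seen:
            seen.add(candidate)
            urls.append(candidate)
    return urls
```

Keeping the original order matters here because queued items are processed sequentially.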

Requirements

Python 3.9-3.11 (TTS package doesn't support Python 3.12+)
ffmpeg - For audio conversion (brew install ffmpeg on macOS)
~4GB disk space - For the XTTS model
GPU (optional) - CUDA GPU speeds up generation significantly
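The first two requirements can be checked from Python before installing anything else. This is a minimal sketch using only the standard library; the function names are illustrative and not part of the project.

```python
import shutil
import sys


def python_is_supported(version=None):
    """True for Python 3.9 through 3.11, the range listed above."""
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and 9 <= minor <= 11


def ffmpeg_is_installed():
    """True if an ffmpeg executable is found on PATH."""
    return shutil.which("ffmpeg") is not None
```

Running `python_is_supported()` with no argument checks the interpreter that executes the script, which is useful when several Python versions are installed side by side.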

Supported Languages

English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, Japanese, Korean, Hindi
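XTTS v2 identifies languages by short codes rather than by name. The mapping below is a sketch based on the codes published for XTTS v2 as best recalled; verify it against the installed TTS package. Whether the server's `language` form field expects these exact codes is an assumption, and `language_code` is a hypothetical helper.

```python
# Language names from the list above mapped to XTTS v2 language codes.
XTTS_LANGUAGE_CODES = {
    "English": "en",
    "Spanish": "es",
    "French": "fr",
    "German": "de",
    "Italian": "it",
    "Portuguese": "pt",
    "Polish": "pl",
    "Turkish": "tr",
    "Russian": "ru",
    "Dutch": "nl",
    "Czech": "cs",
    "Arabic": "ar",
    "Chinese": "zh-cn",
    "Japanese": "ja",
    "Korean": "ko",
    "Hindi": "hi",
}


def language_code(name):
    """Look up the code for a language name, ignoring case and padding."""
    try:
        return XTTS_LANGUAGE_CODES[name.strip().title()]
    except KeyError:
        raise ValueError(f"unsupported language: {name!r}") from None
```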

Tips for best results

Clear samples: Use audio with minimal background noise
Right length: 10-30 seconds of continuous speech
Match languages: Best quality when sample language matches output language
WAV format: Tends to produce best quality
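The length guideline can be checked before uploading. The sketch below reads the duration of an uncompressed PCM WAV file with the standard `wave` module; MP3 and M4A samples would need ffmpeg or pydub instead, which is not shown. The function names and the default 10-30 second bounds simply restate the tip above.

```python
import wave


def wav_duration_seconds(path_or_file):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path_or_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def sample_length_ok(seconds, minimum=10.0, maximum=30.0):
    """True if the duration falls inside the recommended range."""
    return minimum <= seconds <= maximum
```

For example, `sample_length_ok(wav_duration_seconds("sample.wav"))` tells you whether a hypothetical `sample.wav` is inside the recommended range.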

Privacy

Everything runs locally on your machine. Voice samples are processed locally and never uploaded, text processing happens on your computer, and there are no telemetry or API calls to external services.

Technical details

Model: XTTS v2 by Coqui AI (~1.8GB)
Backend: Flask server running on localhost:5123
Frontend: Static HTML/CSS/JS

GPU Acceleration (Windows/Linux with NVIDIA GPU)

For faster generation on NVIDIA GPUs, install CUDA PyTorch:

pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

macOS users: Apple Silicon (M1/M2/M3/M4) doesn't support CUDA - that's NVIDIA-only. XTTS will run on CPU. Generation takes ~15-30 seconds per chunk but works reliably.
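A common way to choose a PyTorch device at startup is sketched below. It is not taken from server.py; the preference order (CUDA, then MPS, then CPU) and both function names are assumptions. A caller who wants to follow the macOS note above can treat an "mps" result as "cpu".

```python
def pick_device(cuda_available, mps_available):
    """Prefer CUDA, then MPS, then fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Ask PyTorch which backends are usable; report CPU if it is missing."""
    try:
        import torch  # third-party; installed with the TTS package
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch
    return pick_device(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )
```

If `detect_device()` returns "cpu" on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only and the CUDA wheel from the command above is needed.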

API Reference

The server exposes these endpoints:

GET  /api/health              # Server status check
POST /api/tts                 # Generate speech (form data: text, voice, language, speed)
POST /api/batch-tts           # Batch generate speech with file saving
POST /api/fetch-url           # Extract text from URL (preserves paragraphs)
GET  /api/tags                # Ollama-compatible model list
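The speech endpoint can also be called from a script. The sketch below uses `requests`, which the Quick Start installs, and the four form fields listed above. Several details are assumptions rather than documented behavior: that `voice` is sent as a file upload, that `language` takes a short code such as "en", and that the response body is the generated audio. Both function names are hypothetical.

```python
BASE_URL = "http://localhost:5123"  # address given in the Quick Start


def build_tts_form(text, language="en", speed=1.0):
    """Return the non-file form fields for POST /api/tts."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"text": text, "language": language, "speed": str(speed)}


def generate_speech(text, voice_path, out_path, language="en", speed=1.0):
    """Send text and a voice sample to the local server and save the reply."""
    import requests  # third-party; installed in the Quick Start

    form = build_tts_form(text, language, speed)
    with open(voice_path, "rb") as voice_file:
        response = requests.post(
            f"{BASE_URL}/api/tts",
            data=form,
            files={"voice": voice_file},  # assumed: sample sent as a file upload
            timeout=600,  # generation can be slow on CPU
        )
    response.raise_for_status()
    with open(out_path, "wb") as out_file:
        out_file.write(response.content)  # assumed: body is the audio itself
    return out_path
```

A call such as `generate_speech("Hello there", "sample.wav", "hello_out.wav")` would then write the result next to the script, provided the assumptions above match the server.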

License

MIT

Created by 97 115 104
