How to use

(Recommended) Python 3.11

Clone this repository

git clone https://github.com/ouor/vits.git

Choose cleaners

Fill in "text_cleaners" in config.json
Edit text/symbols.py
The default text cleaner is Korean
Remove unnecessary imports from text/cleaners.py
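The config edit above can also be scripted. The following is a minimal sketch, not part of this repository: it assumes "text_cleaners" sits under a "data" section of config.json (as in upstream VITS configs), and the helper name set_text_cleaners is hypothetical.

```python
import json
from pathlib import Path


def set_text_cleaners(config_path, cleaners):
    """Write the given cleaner names into the "text_cleaners" entry of a config file."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: "text_cleaners" lives under a "data" section of the config.
    config.setdefault("data", {})["text_cleaners"] = list(cleaners)
    # ensure_ascii=False keeps any non-ASCII values in the config readable.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n", encoding="utf-8"
    )
    return config
```

For example, set_text_cleaners("trains/sample/config.json", ["korean_cleaners"]) would select a cleaner named "korean_cleaners". That name is an assumption; use whichever function is actually defined in text/cleaners.py.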

Install dependencies

The python3.XX-dev package is required to build monotonic alignment search. If you have multiple Python versions installed, install the -dev package that matches the version you will use.

sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.11-dev python3.11-venv

Create conda environment

conda create -n vits python=3.11
conda activate vits
conda install conda-forge::uv

Install requirements

uv pip install -r requirements.txt

Create datasets

Single speaker

"n_speakers" should be 0 in config.json

path/to/XXX.wav|transcript

Example

trains/korean/datasets/train/001.wav|안녕하세요.
...
trains/korean/datasets/val/001.wav|안녕하세요.

Multiple speakers

Speaker IDs should start from 0

path/to/XXX.wav|speaker id|transcript

Example

trains/korean/datasets/train/001.wav|0|안녕하세요.
...
trains/korean/datasets/val/001.wav|0|안녕하세요.
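Before training, it can help to sanity-check a filelist against the two formats described above. The sketch below is illustrative only: parse_filelist_line and check_speaker_ids are hypothetical helpers, not functions from this repository, and the repository's own loader may split lines differently.

```python
def parse_filelist_line(line, multi_speaker=False):
    """Split one filelist line into (wav_path, speaker_id, transcript)."""
    expected = 3 if multi_speaker else 2
    # Limit the number of splits so the transcript stays in one piece.
    parts = line.rstrip("\n").split("|", expected - 1)
    if len(parts) != expected:
        raise ValueError(f"expected {expected} fields, got {len(parts)}: {line!r}")
    if multi_speaker:
        wav_path, speaker, transcript = parts
        return wav_path, int(speaker), transcript
    wav_path, transcript = parts
    return wav_path, None, transcript


def check_speaker_ids(lines):
    """Return the set of speaker IDs, raising if they do not start from 0."""
    ids = {
        parse_filelist_line(line, multi_speaker=True)[1]
        for line in lines
        if line.strip()
    }
    if ids and min(ids) != 0:
        raise ValueError(f"speaker IDs should start from 0, found minimum {min(ids)}")
    return ids
```

Calling check_speaker_ids on the lines of a multi-speaker filelist returns the distinct IDs it found, which is a quick way to compare the data against "n_speakers" in config.json.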

Preprocess

After preprocessing, set "cleaned_text" to true in config.json

# Single speaker
python preprocess.py --text_index 1 --filelists trains/sample/filelist_train.txt trains/sample/filelist_val.txt

# Multiple speakers
python preprocess.py --text_index 2 --filelists trains/sample/filelist_train.txt trains/sample/filelist_val.txt
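The --text_index value matches the position of the transcript in each "|"-separated line: 1 for the single-speaker format and 2 for the multi-speaker format. The sketch below illustrates that idea only; it is not the code in preprocess.py, and demo_cleaner is a hypothetical stand-in for a function from text/cleaners.py.

```python
def clean_filelist_lines(lines, text_index, cleaner):
    """Apply `cleaner` to the field at `text_index` of each "|"-separated line."""
    cleaned = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        # Only the transcript field is rewritten; path and speaker ID are kept.
        fields[text_index] = cleaner(fields[text_index])
        cleaned.append("|".join(fields))
    return cleaned


def demo_cleaner(text):
    # Hypothetical cleaner: collapse runs of whitespace and lowercase the text.
    return " ".join(text.split()).lower()
```

With this sketch, a single-speaker line is cleaned with text_index=1 and a multi-speaker line with text_index=2, mirroring the two commands above.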

Build monotonic alignment search

cd monotonic_align
mkdir monotonic_align
python setup.py build_ext --inplace
cd ..

Train

# Single speaker
python train.py -c trains/sample/config.json -m trains/sample/models

# Multiple speakers
python train_ms.py -c trains/sample/config.json -m trains/sample/models

Place pre-trained models in "trains/sample/models", for example "trains/sample/models/G_0.pth" and "trains/sample/models/D_0.pth".
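To see which checkpoint in the model directory is the most recent, a small helper can compare the numbers in the file names. This is a sketch under an assumption: checkpoints are named G_<step>.pth and D_<step>.pth, as the G_0.pth and D_0.pth examples suggest. The function latest_checkpoint is hypothetical, and the training scripts may choose checkpoints differently.

```python
import re
from pathlib import Path


def latest_checkpoint(model_dir, prefix="G"):
    """Return the checkpoint path with the highest step number, or None."""
    # Assumed naming scheme: <prefix>_<step>.pth, e.g. G_0.pth or D_1000.pth.
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best_step, best_path = -1, None
    for path in Path(model_dir).glob(f"{prefix}_*.pth"):
        match = pattern.match(path.name)
        # Compare steps numerically so that G_1000.pth ranks above G_200.pth.
        if match and int(match.group(1)) > best_step:
            best_step, best_path = int(match.group(1)), path
    return best_path
```

For example, latest_checkpoint("trains/sample/models") returns the generator checkpoint with the largest step, and passing prefix="D" does the same for the discriminator.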

Tensorboard

tensorboard --logdir=trains/sample/models

Inference

See inference.ipynb

Running in Docker

docker run -itd --gpus all --name "Container name" -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all "Image name"
