Coqui.ai TTS

Aya

Underlined "TTS*" and "Judy*" are internal 🐸TTS models that have not been released as open source; they are included to show the potential. Models prefixed with a dot (.Jofish, .Abe, and .Janice) are real human voices.

Features

High-performance deep learning models for text-to-speech (Text2Speech) tasks.
Support for Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech).
A Speaker Encoder for computing speaker embeddings.
Several Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN).
Support for multilingual TTS.

Installation

🐸TTS is tested on Ubuntu 18.04 with Python >= 3.9, < 3.12.
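
Before installing outside Docker, it can help to confirm that the local interpreter falls inside that tested range. The following is a minimal sketch; the helper name `is_supported_python` and the two constants are hypothetical and not part of 🐸TTS.

```python
import sys

# Tested range quoted in the note above: Python >= 3.9 and < 3.12.
MIN_VERSION = (3, 9)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported_python(version=None):
    """Return True if the (major, minor) version is inside the tested range.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    if is_supported_python():
        print("This Python version is inside the tested range.")
    else:
        print("Warning: this Python version is outside the tested range.")
```

A version outside the range may still work; the check only reflects what the project reports testing against.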

Using Docker is relatively simple.

Dockerfile:

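# Use the official Python image as the base image
FROM python:3.9-slim

# Set the working directory
WORKDIR /app

# Uncomment to copy the project files into the image
# COPY . .

# Install the build tools and the libsndfile headers
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential \
    libsndfile1-dev \
    && rm -rf /var/lib/apt/lists/*

RUN pip install --upgrade pip
RUN pip install TTS soundfile

# EXPOSE 5000

# Set the entry point (if you have a script to run)
# ENTRYPOINT ["python", "your_script.py"]

# Or start an interactive shell
CMD ["bash"]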

Run:

docker build -t coqui-tts .
docker run -it --rm -v /path/to/local/dir:/app/data coqui-tts
- /path/to/local/dir -> replace with your own path
- with GPU -> docker run -it --rm --gpus all -v /path/to/local/dir:/app/data coqui-tts

Not supported on Windows (ref).
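
If the container is started from a script, the CPU and GPU variants of the run command can be assembled from one helper. This is a sketch only: `docker_run_command` is a hypothetical name, and mounting the host directory at /app/data in both variants is an assumption made here for consistency.

```python
import shlex


def docker_run_command(local_dir, gpu=False, image="coqui-tts"):
    """Build the argument list for `docker run` (hypothetical helper)."""
    args = ["docker", "run", "-it", "--rm"]
    if gpu:
        # --gpus all needs the NVIDIA Container Toolkit on the host.
        args += ["--gpus", "all"]
    # Assumption: the host directory is mounted at /app/data in the container.
    args += ["-v", f"{local_dir}:/app/data", image]
    return args


if __name__ == "__main__":
    # Print both variants as copy-pasteable shell commands.
    print(shlex.join(docker_run_command("/path/to/local/dir")))
    print(shlex.join(docker_run_command("/path/to/local/dir", gpu=True)))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems when the local path contains spaces.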

Sample code

List available 🐸TTS models

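from TTS.api import TTS

# In recent 🐸TTS releases, TTS().list_models() returns a model manager;
# calling list_models() on that manager returns the model names.
models = TTS().list_models().list_models()
for model in models:
    print(model)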
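
The model names used on this page follow a `type/language/dataset/model` pattern, so the printed list can be narrowed down with plain string handling. The sketch below works on a small hard-coded sample instead of the real list; `parse_model_name`, `filter_models`, and `SAMPLE_NAMES` are hypothetical names, not part of 🐸TTS.

```python
from collections import namedtuple

ModelName = namedtuple("ModelName", "model_type language dataset model")

# Hard-coded sample for illustration; in practice use the list printed above.
SAMPLE_NAMES = [
    "tts_models/multilingual/multi-dataset/xtts_v2",
    "voice_conversion_models/multilingual/vctk/freevc24",
    "tts_models/en/ljspeech/tacotron2-DDC",
]


def parse_model_name(name):
    """Split 'type/language/dataset/model' into its four parts."""
    parts = name.split("/")
    if len(parts) != 4:
        raise ValueError(f"unexpected model name: {name!r}")
    return ModelName(*parts)


def filter_models(names, model_type=None, language=None):
    """Keep the names that match the given type and/or language."""
    selected = []
    for name in names:
        parsed = parse_model_name(name)
        if model_type is not None and parsed.model_type != model_type:
            continue
        if language is not None and parsed.language != language:
            continue
        selected.append(name)
    return selected


if __name__ == "__main__":
    for name in filter_models(SAMPLE_NAMES, model_type="tts_models"):
        print(name)
```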

Running a multi-speaker and multi-lingual model

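import torch
from TTS.api import TTS

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the multilingual, multi-speaker XTTS v2 model.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to(device)

# speaker.wav is a reference clip of the voice to clone.
text = "Hello world!"
wav = tts.tts(text=text, speaker_wav="speaker.wav", language="en")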
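
The call above returns the waveform as a sequence of float samples, not a file. As a rough sketch of how such samples could be saved with only the standard library, the code below writes 16-bit mono PCM with the `wave` module. A generated sine tone stands in for the model output so the sketch runs without downloading a model. The names `write_wav`, `read_wav_info`, and `demo` are hypothetical, and the 24000 Hz rate is an assumption; use the rate your model reports.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav(samples, sample_rate, path):
    """Write float samples in [-1.0, 1.0] as 16-bit mono PCM."""
    frames = bytearray()
    for sample in samples:
        # Clip to the valid range before scaling to 16-bit integers.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


def read_wav_info(path):
    """Return (sample_rate, frame_count) for a WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate(), wav_file.getnframes()


def demo():
    """Write half a second of a 440 Hz tone and read its header back."""
    sample_rate = 24000  # assumed rate, for illustration only
    tone = [
        0.3 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate // 2)
    ]
    path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    write_wav(tone, sample_rate, path)
    return read_wav_info(path)


if __name__ == "__main__":
    print(demo())
```

When only a file is needed, `tts.tts_to_file(...)` with a `file_path` argument writes the audio directly and is usually the simpler choice.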

Example voice conversion

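import torch
from TTS.api import TTS

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Convert the voice in a.wav (source) to sound like the speaker in b.wav (target).
tts = TTS(model_name="voice_conversion_models/multilingual/vctk/freevc24").to(device)
tts.voice_conversion_to_file(source_wav="a.wav", target_wav="b.wav", file_path="output.wav")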
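
To convert several source clips to the same target voice, the output paths can be planned up front and the conversion call repeated once per entry. This is a sketch under assumptions: `plan_conversions` and the `<source>_as_<target>.wav` naming scheme are hypothetical choices, not something 🐸TTS prescribes.

```python
from pathlib import Path


def plan_conversions(source_wavs, target_wav, output_dir):
    """Return (source, target, output) path triples, one per source clip."""
    out = Path(output_dir)
    target_stem = Path(target_wav).stem
    plan = []
    for src in source_wavs:
        # Hypothetical naming scheme: <source>_as_<target>.wav
        output = out / f"{Path(src).stem}_as_{target_stem}.wav"
        plan.append((str(src), str(target_wav), output.as_posix()))
    return plan


if __name__ == "__main__":
    for source, target, output in plan_conversions(["a.wav", "c.wav"], "b.wav", "converted"):
        # Each triple maps onto the arguments of voice_conversion_to_file:
        # source_wav=source, target_wav=target, file_path=output
        print(source, target, output)
```

Create the output directory before running the conversions, since the sketch only builds paths and does not touch the file system.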