This guide explains how to build and run PyThaiTTS using Docker.
To build the Docker image, run the following command from the root directory of the repository:
```shell
docker build -t pythaitts:latest .
```

This will create a Docker image named `pythaitts:latest` with all dependencies installed.
To run the demo script that demonstrates Thai text-to-speech synthesis:
```shell
docker run --rm pythaitts:latest
```

The demo will:
- Initialize the PyThaiTTS model (default: lunarlist_onnx)
- Generate speech from Thai text
- Save the output to a WAV file
- Display the waveform information
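The demo's steps can be sketched as follows. This is an illustrative outline, not the actual demo script: a synthetic sine wave stands in for the model's output, and the `save_wav` helper is a stand-in for PyThaiTTS's own file writing (the real call is shown in the comments).

```python
# Illustrative sketch of the demo's steps (hypothetical helper names).
# Inside the container the real demo would instead call PyThaiTTS:
#   from pythaitts import TTS
#   waveform = TTS().tts("สวัสดีครับ", return_type="waveform")
import math
import struct
import wave

SAMPLE_RATE = 22050  # a common TTS sample rate; the model's actual rate may differ

def save_wav(samples, path, rate=SAMPLE_RATE):
    """Write a list of floats in [-1, 1] as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(rate)
        frames = b"".join(struct.pack("<h", int(s * 32767)) for s in samples)
        wav.writeframes(frames)

# A 0.5-second 440 Hz sine wave stands in for the synthesized speech.
waveform = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE)
            for t in range(SAMPLE_RATE // 2)]
save_wav(waveform, "output.wav")
print(f"Saved {len(waveform)} samples to output.wav")
```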
To start an interactive shell inside the container:
```shell
docker run --rm -it pythaitts:latest /bin/bash
```

To run your own Python script:
```shell
docker run --rm -v $(pwd)/your_script.py:/app/custom.py pythaitts:latest python custom.py
```

To save generated audio files to your host machine:
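As a sketch, a mounted script might look like the following. The text and output path are illustrative, and since `pythaitts` is only installed inside the image, the import is deferred so the file can be read anywhere:

```python
# custom.py — a hypothetical script to mount into the container.
def synthesize(text, out_path):
    """Synthesize `text` to a WAV file at `out_path` and return the saved path."""
    from pythaitts import TTS  # imported lazily; installed inside the Docker image
    tts = TTS()  # default model: lunarlist_onnx
    return tts.tts(text, filename=out_path)

if __name__ == "__main__":
    try:
        print(synthesize("สวัสดีครับ", "hello.wav"))
    except ImportError:
        print("pythaitts is not installed; run this inside the Docker image")
```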
```shell
docker run --rm -v $(pwd)/output:/app/output pythaitts:latest python -c "
from pythaitts import TTS
tts = TTS()
tts.tts('สวัสดีครับ', filename='output/hello.wav')
"
```

This will save the generated `hello.wav` file to the `output` directory on your host machine.
Inside the container, you can use PyThaiTTS as follows:
```python
from pythaitts import TTS

# Initialize TTS with the default model
tts = TTS()

# Generate speech and save it to a file
file_path = tts.tts("ภาษาไทย ง่าย มาก มาก", filename="output.wav")
print(f"Audio saved to: {file_path}")

# Generate speech and get the raw waveform
waveform = tts.tts("ภาษาไทย ง่าย มาก มาก", return_type="waveform")
print(f"Waveform shape: {waveform.shape}")
```

PyThaiTTS supports multiple models:
- lunarlist_onnx (default): ONNX-optimized model, CPU-only
- khanomtan: KhanomTan TTS model
- lunarlist: Original Lunarlist model
To use a different model:
```python
from pythaitts import TTS

# Use the KhanomTan model
tts = TTS(pretrained="khanomtan", version="1.0")
```

Prerequisites:

- Docker installed on your system
- At least 2GB of available disk space
- Internet connection for downloading models on first run
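To check the disk-space prerequisite before the first run, here is a quick sketch using Python's standard library; the 2 GB threshold comes from the list above:

```python
import shutil

def has_free_space(path=".", required_bytes=2 * 1024**3):
    """Return True if the filesystem containing `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes

print("Enough space for the image and models:", has_free_space())
```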
If you encounter issues with model downloads, ensure:
- You have a stable internet connection
- The Hugging Face Hub is accessible from your network
- You have sufficient disk space for model files
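A small sketch to test whether the Hugging Face Hub is reachable from your network; the URL and timeout here are assumptions, and a `False` result points to a network or proxy issue:

```python
import urllib.request
import urllib.error

def hub_reachable(url="https://huggingface.co", timeout=5):
    """Return True if an HTTP request to the Hub succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

print("Hugging Face Hub reachable:", hub_reachable())
```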
Notes:

- The first run downloads model files from the Hugging Face Hub, which may take some time depending on your internet connection
- Generated audio files are in WAV format
- The default model (lunarlist_onnx) runs on CPU and doesn't require GPU support
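Since the output is standard WAV, you can inspect a generated file with Python's built-in `wave` module. The function below is illustrative, and the filename in the comment is just an example:

```python
import wave

def wav_info(path):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds) for a WAV file."""
    with wave.open(path, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return (w.getnchannels(), w.getsampwidth(), rate, frames / rate)

# e.g. wav_info("output.wav") on a file produced by tts.tts(...)
```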