This repository is a fork of melotts.axcl, an implementation of the MeloTTS text-to-speech model running on the LLM8850 accelerator card.
To provide a continuous audio synthesis service, we have added a server implementation in Python that interacts with the melotts C++ binary. The server listens for incoming text requests, processes them using the melotts model, and returns the generated audio files. This way, the program does not have to load the model for each request, which significantly improves performance when handling multiple requests.
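The idea of keeping one long-lived process alive instead of starting it per request can be sketched as follows. This is not the server code from this repository: the `PersistentWorker` class and its one-line-in, one-line-out protocol are hypothetical, chosen only to illustrate why the expensive startup happens once.

```python
import subprocess
import threading


class PersistentWorker:
    """Hypothetical wrapper: start a child process once and reuse it for every request."""

    def __init__(self, command):
        # The child is started a single time, so any costly startup work
        # (such as loading a model) is paid only here.
        self._proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered text pipes
        )
        self._lock = threading.Lock()  # serialize requests to the single child

    def request(self, line: str) -> str:
        """Send one line to the child and return the one line it replies with."""
        with self._lock:
            if self._proc.poll() is not None:
                raise RuntimeError("worker process has exited")
            self._proc.stdin.write(line + "\n")
            self._proc.stdin.flush()
            reply = self._proc.stdout.readline()
            if reply == "":
                raise RuntimeError("worker closed its output")
            return reply.rstrip("\n")

    def close(self):
        """Signal end of input, then wait for the child to exit."""
        self._proc.stdin.close()
        self._proc.wait(timeout=5)
        self._proc.stdout.close()
```

A web handler would then call `worker.request(...)` for each incoming request while the child stays loaded between calls.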
Building this project requires cmake. Make sure to install it first:
sudo apt update
sudo apt install -y cmake
Clone this repository and run the aarch64 build script:
cd
git clone https://github.com/PiSugar/melotts.axcl.git
cd melotts.axcl
sudo chmod +x build_aarch64.sh
./build_aarch64.sh
You can clone the model repositories and link them in arguments.json for easier management.
Run bash serve.sh in the root directory of the repository, which will start the server at http://localhost:8802.
The server uses the arguments.json file to configure the model paths and parameters. Make sure to update the paths in arguments.json to point to the correct model files you downloaded.
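A wrong path in arguments.json is easy to miss, so a quick check before starting the server may help. The sketch below assumes the path-valued keys are the ones used in this README's example (encoder, decoder, lexicon, token, g); the helper names are hypothetical and not part of this repository.

```python
from __future__ import annotations

import json
from pathlib import Path

# Keys whose values are file paths, taken from the example arguments.json in this README.
PATH_KEYS = ("encoder", "decoder", "lexicon", "token", "g")


def find_missing_paths(config: dict) -> list[str]:
    """Return one message per path key that is unset or points to a missing file."""
    problems = []
    for key in PATH_KEYS:
        value = config.get(key)
        if not value:
            problems.append(f"{key}: not set")
        elif not Path(value).is_file():
            problems.append(f"{key}: file not found: {value}")
    return problems


def check_arguments_file(path: str = "arguments.json") -> list[str]:
    """Load the JSON config at `path` and report any missing model files."""
    with open(path, encoding="utf-8") as f:
        return find_missing_paths(json.load(f))
```

Calling `check_arguments_file()` from the repository root returns an empty list when every listed file exists.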
For example, for English models, the arguments.json should look like this:
{
"encoder": "/home/pi/MeloTTS-English-ax650/encoder-en.onnx",
"decoder": "/home/pi/MeloTTS-English-ax650/decoder-en-br.axmodel",
"lexicon": "/home/pi/MeloTTS-English-ax650/lexicon-en.txt",
"token": "/home/pi/MeloTTS-English-ax650/tokens-en.txt",
"g": "/home/pi/MeloTTS-English-ax650/g-en-br.bin",
"volume": "4"
}
The server accepts POST requests with a JSON payload containing the text to be synthesized. The request format is as follows:
curl -X POST http://localhost:8802/synthesize \
-H "Content-Type: application/json" \
-d '{"sentence": "Hello, I am a student from somewhere", "outputPath": "/path/to/output.wav"}'
If outputPath is not provided, the server will create a temporary file and delete it after returning the base64-encoded audio data.
Response:
{
"success": true,
"base64": "wav_file_in_base64_format"
}
The base64 field is always provided when outputPath is not given in the request body.
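The same request can be made from Python using only the standard library. This is a minimal sketch that assumes the endpoint, field names, and port shown in this README; the function names are hypothetical, and a failed response is assumed to carry `"success": false` with an `error` message.

```python
import base64
import json
import urllib.request

SERVER_URL = "http://localhost:8802"  # address used in this README


def build_payload(sentence, output_path=None):
    """Encode the request body; outputPath is included only when given."""
    payload = {"sentence": sentence}
    if output_path is not None:
        payload["outputPath"] = output_path
    return json.dumps(payload).encode("utf-8")


def decode_audio(body: dict) -> bytes:
    """Return the WAV bytes from a parsed response, or raise if it reports failure."""
    if not body.get("success"):
        raise RuntimeError(body.get("error", "unknown error"))
    return base64.b64decode(body["base64"])


def synthesize(sentence: str, timeout: float = 60.0) -> bytes:
    """POST a sentence to /synthesize without outputPath and return the audio bytes."""
    request = urllib.request.Request(
        f"{SERVER_URL}/synthesize",
        data=build_payload(sentence),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on non-2xx status codes; which status
    # the server uses for error responses is not stated in this README.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return decode_audio(json.loads(response.read().decode("utf-8")))
```

With the server running, `open("hello.wav", "wb").write(synthesize("hello"))` would save the result locally.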
Error Response:
{
"success": false,
"error": "Error message here"
}
If the melotts process is not running correctly, use the /restart endpoint to restart it:
curl -X POST http://localhost:8802/restart
A systemd service file, melotts.service, is provided to run the server as a background service.
To enable and start the service, use the following command:
sudo bash startup.sh