# Text-to-Speech Converter (TTS) using Coqui TTS

This project is a text-to-speech (TTS) converter that uses the Coqui TTS library. It reads a text file, splits the text into chunks that do not exceed the character limit, and converts each chunk into an audio file. The project is configured to use the GPU for faster processing.

## Requirements

- Python 3.7 or higher
- CUDA and cuDNN configured correctly to use the GPU
- ffmpeg

## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/igormedeiros/voice-clone-narrator.git
   cd voice-clone-narrator
   ```

2. Create a virtual environment and activate it:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows use: venv\Scripts\activate
   ```

3. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Ensure ffmpeg is installed:

   ```bash
   sudo apt-get install ffmpeg  # For Debian/Ubuntu based systems
   ```

## Usage

Run the script with the necessary parameters:

```bash
python main.py --text_file "path/to/your/textfile.txt" --output_path "output.wav" --speed 1.0 --language "pt" --sample_voice "./voices/sample-voice.wav" --thumb "path/to/your/thumbnail.jpg"
```

### Parameters

- `--text_file`: Path to the text file to be converted to speech.
- `--output_path`: Path to save the output audio file.
- `--speed`: Speech speed (optional, default is 1.0).
- `--language`: Speech language (optional, default is "pt").
- `--sample_voice`: Path to the sample voice file.
- `--thumb`: Path to the thumbnail image for MP4 output (optional).

## Logging

Each run of the program generates a log file named "narrator" followed by the date and time of execution, with the extension `.log`.

## Output

- If `--thumb` is provided, the output will be an MP4 file with the audio and the thumbnail image as a static video.
- If `--thumb` is not provided, the output will be a WAV audio file.
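As a rough illustration of the chunking step described at the top of this README, the sketch below splits text on sentence boundaries and packs sentences into chunks that stay within a character limit. It is not taken from `main.py`: the function name `split_into_chunks`, the `max_chars` parameter, and its default of 200 are assumptions made for this example, and the actual script may split differently.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: sentences are kept whole where possible, and a
    sentence longer than max_chars is hard-wrapped at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue

        # A single sentence over the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]

        # Append the sentence to the current chunk if it still fits.
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence

    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Hello world. This is a test. Each chunk stays under the limit."
    for index, chunk in enumerate(split_into_chunks(sample, max_chars=30)):
        print(index, len(chunk), chunk)
```

Each returned chunk could then be passed to the TTS model in turn and the resulting audio segments joined into the final file. Splitting on sentence boundaries rather than at a fixed offset is a design choice that tends to avoid cutting words or phrases mid-utterance.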
## Project Structure

    voice-clone-narrator/
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── main.py
    └── voices/

## Contribution

Feel free to contribute to the project! Just follow the steps below:

1. Fork the project
2. Create a new branch (`git checkout -b feature/new-feature`)
3. Commit your changes (`git commit -am 'Add new feature'`)
4. Push to the branch (`git push origin feature/new-feature`)
5. Create a new Pull Request