# Text-to-Speech Converter (TTS) using Coqui TTS

This project is a text-to-speech (TTS) converter that uses the Coqui TTS library. It reads a text file, splits the text into chunks that do not exceed the character limit, and converts each chunk into an audio file. The project is configured to use the GPU for faster processing.
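
The chunking step described above can be sketched roughly as follows. This is an illustration, not the code in `main.py`: the function names `split_text` and `_fit` are hypothetical, and the default `max_chars=250` is an arbitrary placeholder, since the actual character limit is not stated here.

```python
import re
from typing import Iterator, List


def _fit(sentence: str, max_chars: int) -> Iterator[str]:
    """Yield pieces of one sentence, each at most max_chars long."""
    current = ""
    for word in sentence.split():
        # A single word longer than the limit is hard-cut.
        while len(word) > max_chars:
            if current:
                yield current
                current = ""
            yield word[:max_chars]
            word = word[max_chars:]
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            yield current
            current = word
    if current:
        yield current


def split_text(text: str, max_chars: int = 250) -> List[str]:
    """Split text into chunks of at most max_chars characters.

    Sentences are kept whole where they fit; longer sentences are
    broken at word boundaries.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")

    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        for piece in _fit(sentence, max_chars):
            candidate = f"{current} {piece}".strip()
            if len(candidate) <= max_chars:
                current = candidate
            else:
                if current:
                    chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed to the TTS engine on its own, producing one audio file per chunk.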

## Requirements

- Python 3.7 or higher
- CUDA and cuDNN configured correctly to use the GPU
- ffmpeg
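
A quick way to check these requirements before running the converter is sketched below. The helper `check_environment` is hypothetical and not part of this repository. It assumes PyTorch is the backend used for GPU access, and it reports `None` for CUDA when PyTorch cannot be imported.

```python
import shutil
import sys


def check_environment() -> dict:
    """Report whether the listed requirements appear to be met."""
    report = {
        "python_ok": sys.version_info >= (3, 7),
        "ffmpeg_path": shutil.which("ffmpeg"),  # None when ffmpeg is not on PATH
        "cuda_available": None,  # stays None if PyTorch is not installed
    }
    try:
        import torch  # assumed to be installed alongside Coqui TTS

        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```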

## Installation

1. Clone this repository:
    ```bash
    git clone https://github.com/igormedeiros/voice-clone-narrator.git
    cd voice-clone-narrator
    ```

2. Create a virtual environment and activate it:
    ```bash
    python -m venv venv
    source venv/bin/activate  # On Windows use: venv\Scripts\activate
    ```

3. Install the dependencies:
    ```bash
    pip install -r requirements.txt
    ```

4. Ensure ffmpeg is installed:
    ```bash
    sudo apt-get install ffmpeg  # For Debian/Ubuntu based systems
    ```

## Usage

Run the script with the necessary parameters:

```bash
python main.py --text_file "path/to/your/textfile.txt" --output_path "output.wav" --speed 1.0 --language "pt" --sample_voice "./voices/sample-voice.wav" --thumb "path/to/your/thumbnail.jpg"
```

## Parameters

- `--text_file`: Path to the text file to be converted to speech.
- `--output_path`: Path to save the output audio file.
- `--speed`: Speech speed (optional, default is 1.0).
- `--language`: Speech language (optional, default is "pt").
- `--sample_voice`: Path to the sample voice file.
- `--thumb`: Path to the thumbnail image for MP4 output (optional).
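
For reference, a command-line interface matching these parameters could be declared with `argparse` along the following lines. This is a sketch, not the parser in `main.py`: the function name `build_parser` is hypothetical, and treating `--text_file`, `--output_path`, and `--sample_voice` as required is an assumption based on their not being marked optional.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented parameters."""
    parser = argparse.ArgumentParser(
        description="Convert a text file to speech using a sample voice."
    )
    # Assumed required: the parameter list does not mark these as optional.
    parser.add_argument("--text_file", required=True,
                        help="Path to the text file to be converted to speech.")
    parser.add_argument("--output_path", required=True,
                        help="Path to save the output audio file.")
    parser.add_argument("--sample_voice", required=True,
                        help="Path to the sample voice file.")
    # Optional parameters with the documented defaults.
    parser.add_argument("--speed", type=float, default=1.0,
                        help="Speech speed.")
    parser.add_argument("--language", default="pt",
                        help="Speech language.")
    parser.add_argument("--thumb", default=None,
                        help="Path to the thumbnail image for MP4 output.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```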

## Logging

Each run of the program generates a log file named `narrator` followed by the date and time of execution, with the extension `.log`.
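
A per-run log file of this kind can be set up with the standard `logging` module, as in the sketch below. The exact timestamp layout used by the project is not specified here, so the `narrator_YYYY-MM-DD_HH-MM-SS.log` pattern and the helper names `log_filename` and `setup_logging` are assumptions for illustration.

```python
import logging
from datetime import datetime
from typing import Optional


def log_filename(now: Optional[datetime] = None) -> str:
    """Return a log file name made of 'narrator' plus a timestamp."""
    now = now or datetime.now()
    # Hyphens instead of colons keep the name valid on Windows.
    return f"narrator_{now:%Y-%m-%d_%H-%M-%S}.log"


def setup_logging() -> str:
    """Send log records to a fresh, timestamped file for this run."""
    path = log_filename()
    logging.basicConfig(
        filename=path,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    return path
```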

## Output

- If `--thumb` is provided, the output will be an MP4 file with the audio and the thumbnail image as a static video.
- If `--thumb` is not provided, the output will be a WAV audio file.
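
One common way to turn a still image and an audio track into an MP4 is to loop the image in ffmpeg for the duration of the audio. The sketch below builds such a command. It is not necessarily the command `main.py` runs: the function names, the H.264/AAC codec choice, and the 192k audio bitrate are assumptions.

```python
import subprocess
from typing import List, Optional


def build_mp4_command(audio_path: str, thumb_path: str, mp4_path: str) -> List[str]:
    """Build an ffmpeg command that shows one image for the whole audio."""
    return [
        "ffmpeg", "-y",                     # overwrite the output if it exists
        "-loop", "1", "-i", thumb_path,     # repeat the still image as video input
        "-i", audio_path,                   # narration audio
        "-c:v", "libx264", "-tune", "stillimage",
        "-c:a", "aac", "-b:a", "192k",
        "-pix_fmt", "yuv420p",              # widely compatible pixel format
        "-shortest",                        # stop when the audio ends
        mp4_path,
    ]


def export(audio_path: str, thumb_path: Optional[str], mp4_path: str) -> str:
    """Return the path of the final output: MP4 with a thumbnail, WAV without."""
    if thumb_path is None:
        return audio_path
    subprocess.run(build_mp4_command(audio_path, thumb_path, mp4_path), check=True)
    return mp4_path
```

With `libx264` and `yuv420p`, ffmpeg expects even image dimensions, so a thumbnail with an odd width or height may need to be resized first.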

## Project Structure

    voice-clone-narrator/
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── main.py
    └── voices/

## Contribution

Feel free to contribute to the project! Just follow the steps below:

1. Fork the project
2. Create a new branch (`git checkout -b feature/new-feature`)
3. Commit your changes (`git commit -am 'Add new feature'`)
4. Push to the branch (`git push origin feature/new-feature`)
5. Create a new Pull Request
