🎙️ ToneSwap: Real-Time Voice Conversion App

ToneSwap is a real-time voice conversion application that transforms a speaker's voice into a target voice while preserving the speech content, tone, and emotion. It supports various audio formats and provides a user-friendly web interface via Gradio.

🚀 Features

🎤 Converts your voice to another person's voice using voice samples
🧠 Zero-shot voice conversion using pretrained models (FreeVC + WavLM + HiFi-GAN)
📁 Supports uploading or recording source and target audio in any format FFmpeg can decode
🔁 Automatically converts input to mono 16 kHz .wav using FFmpeg
🌐 Easy-to-use Gradio web interface
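
As an illustration of the automatic conversion step, the sketch below shows one way to turn an arbitrary input file into a mono 16 kHz .wav by calling FFmpeg from Python. The function names `build_ffmpeg_command` and `to_mono_16k_wav` and the `_16k.wav` output suffix are assumptions made for this example, not necessarily what the app uses. The FFmpeg options `-ac 1` (one channel) and `-ar 16000` (sample rate) are standard.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return the FFmpeg argument list for a mono WAV conversion."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input file (any format FFmpeg can decode)
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample to the requested rate
        str(dst),
    ]


def to_mono_16k_wav(src, dst=None):
    """Convert src to a mono 16 kHz WAV and return the output path.

    The default output name (hypothetical) adds a "_16k" suffix so the
    input is never overwritten, even when it is already a .wav file.
    """
    src = Path(src)
    dst = Path(dst) if dst else src.with_name(src.stem + "_16k.wav")
    # check=True raises CalledProcessError if FFmpeg exits with an error
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
    return dst
```

Keeping the command builder separate from the `subprocess.run` call makes the arguments easy to inspect or log without FFmpeg installed.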

📦 Prerequisites

Before running the app, download and place the following files and models in the correct directories:

✅ 1. Pretrained Checkpoints

Download FreeVC model checkpoints and place them in:

checkpoints/

📥 Download Checkpoints

✅ 2. WavLM-Large Model

Download the WavLM-Large model files from the official Microsoft repository and place them in:

wavlm/

📥 WavLM GitHub

✅ 3. HiFi-GAN Vocoder (Optional, only needed for SR (spectrogram-resize) training or fine-tuning)

Clone the HiFi-GAN repository and download the pretrained generator_v1 model. Place it in:

hifigan/

📥 HiFi-GAN GitHub
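
Because the models are placed by hand, a quick check before launching can save a confusing startup error. The sketch below only verifies that the required directories exist and contain at least one matching file. The helper name `missing_assets` and the catch-all `"*"` patterns are assumptions for this example; this README does not list the exact model filenames, so replace the patterns with the real names once you know them.

```python
from pathlib import Path

# Directory name -> glob pattern of files expected inside it.
# "*" is a placeholder pattern (assumption); tighten it to the actual
# checkpoint filenames for a stricter check.
REQUIRED_ASSETS = {
    "checkpoints": "*",
    "wavlm": "*",
}


def missing_assets(root=".", required=REQUIRED_ASSETS):
    """Return the names of required directories that are absent or have no matching file."""
    root = Path(root)
    missing = []
    for name, pattern in required.items():
        directory = root / name
        has_file = directory.is_dir() and any(
            p.is_file() for p in directory.glob(pattern)
        )
        if not has_file:
            missing.append(name)
    return missing
```

Calling `missing_assets()` from the repository root returns an empty list when both directories are populated. The optional `hifigan/` directory is left out on purpose.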

🧪 Setup Instructions

Clone this repo:

git clone https://github.com/<your-username>/ToneSwap.git
cd ToneSwap

Install Python dependencies:
```
pip install -r requirements.txt
```
Install FFmpeg (used for audio format conversion) and make sure the `ffmpeg` executable is on your PATH. On Windows: FFmpeg Download
Install Gradio:
```
pip install gradio
```
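
To confirm the installation steps worked, a small check along the following lines can be run before starting the app. This is a sketch using only the standard library; the function name `check_environment` and the message wording are assumptions for this example.

```python
import importlib.util
import shutil


def check_environment(modules=("gradio",), tools=("ffmpeg",)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for module in modules:
        # find_spec returns None when a top-level package cannot be imported
        if importlib.util.find_spec(module) is None:
            problems.append(f"Python package not installed: {module}")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        if shutil.which(tool) is None:
            problems.append(f"executable not found on PATH: {tool}")
    return problems
```

For example, `print(check_environment() or "environment looks OK")` lists anything still missing.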

✅ How to Run

python app.py
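
For readers curious how a two-input Gradio front end of this kind can be wired up, here is a minimal sketch. It is not the contents of `app.py`: `convert_voice` is a hypothetical stand-in that only validates its inputs and returns the source path unchanged, where a real app would run the conversion model. `gr.Interface` and `gr.Audio(type="filepath")` are standard Gradio components.

```python
from pathlib import Path


def convert_voice(source_path, target_path):
    """Hypothetical stand-in for the real conversion step.

    Validates both inputs, then returns the source path unchanged.
    A real implementation would return the path of the converted audio.
    """
    for label, path in (("source", source_path), ("target", target_path)):
        if not path or not Path(path).is_file():
            raise ValueError(f"{label} audio is missing: {path!r}")
    return str(source_path)


def build_demo():
    """Build a Gradio interface with source and target audio inputs."""
    import gradio as gr  # imported lazily so convert_voice works without Gradio

    return gr.Interface(
        fn=convert_voice,
        inputs=[
            gr.Audio(type="filepath", label="Source voice"),
            gr.Audio(type="filepath", label="Target voice"),
        ],
        outputs=gr.Audio(type="filepath", label="Converted audio"),
        title="ToneSwap (sketch)",
    )
```

Calling `build_demo().launch()` would start a local web server and print its URL.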

⚠️ Note

Due to GitHub's file size limitations, this repository does not include pretrained models or checkpoints. You must download and place them manually as instructed above.

Repository: rahulprajapati08/ToneSwap
