Emotional Offline Voice Assistant is an AI-driven voice assistant capable of understanding and expressing emotions. It can interact with users in a more human-like manner, providing a more engaging and natural user experience. This project is designed to work offline, ensuring user privacy and data security.
- Offline voice recognition and processing for enhanced privacy
- Emotion recognition and expression capabilities
- Natural language understanding for improved user interactions
- Customizable voice and personality
- Cross-platform compatibility
To install the Emotional Offline Voice Assistant, follow these steps:
- Clone the repository and download the pretrained speech model:
```bash
git clone https://github.com/TPODAvia/Voice-Assistant
cd Voice-Assistant
curl -LJO "https://github.com/TPODAvia/Voice-Assistant/releases/download/v0.0.1-alpha/StyleTTS.zip"
unzip StyleTTS.zip
cd ../..
```

- Get Microsoft Visual Studio.
- Create and activate a virtual environment:

```bash
cd Voice-Assistant
python -m venv venv
```

If creating the venv fails on Windows, execute this instead:

```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser -Force
python -m venv venv
# or call the Python executable directly
C:\Users\vboxuser\AppData\Local\Programs\Python\Python311\python.exe -m venv venv
```

Activate the venv:

```powershell
.\venv\Scripts\activate
# or
./venv/Scripts/Activate.ps1
```

- Install the required dependencies:
Install torch with CUDA support (see https://pytorch.org/get-started/locally/). To test that CUDA is working:

```bash
# activate your venv first
cd Voice-Assistant
python docs/cuda_test.py
```
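The contents of `docs/cuda_test.py` are not shown here; a minimal check in the same spirit might look like the sketch below. It degrades gracefully when torch is not installed, so you can run it before finishing the install.

```python
def cuda_status() -> str:
    """Report whether PyTorch can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Name of the first visible GPU, e.g. "NVIDIA GeForce RTX 3060"
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available (CPU only)"

if __name__ == "__main__":
    print(cuda_status())
```

If this reports "CPU only" even though you have an NVIDIA GPU, you most likely installed the CPU-only torch wheel; reinstall using the CUDA-specific command from the PyTorch site.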
Install espeak-NG.

For Windows: download and install espeak-ng from https://github.com/espeak-ng/espeak-ng, then add these system environment variables:

```
PHONEMIZER_ESPEAK_LIBRARY = C:\Program Files\eSpeak NG\libespeak-ng.dll
PHONEMIZER_ESPEAK_PATH = C:\Program Files\eSpeak NG
```

For Linux:

```bash
sudo apt-get install -y libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 python3-tk python3-dev sox python3-pil python3-pil.imagetk espeak
```
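On Windows it is easy to mistype the two phonemizer variables, so it can help to verify that they point at real files before launching the assistant. A small sketch (the variable names are the ones set above; the paths are the default install locations):

```python
import os

def check_espeak_env() -> list[str]:
    """Return a list of problems with the espeak-NG environment variables."""
    problems = []
    lib = os.environ.get("PHONEMIZER_ESPEAK_LIBRARY")
    path = os.environ.get("PHONEMIZER_ESPEAK_PATH")
    if not lib:
        problems.append("PHONEMIZER_ESPEAK_LIBRARY is not set")
    elif not os.path.isfile(lib):
        problems.append(f"PHONEMIZER_ESPEAK_LIBRARY does not exist: {lib}")
    if not path:
        problems.append("PHONEMIZER_ESPEAK_PATH is not set")
    elif not os.path.isdir(path):
        problems.append(f"PHONEMIZER_ESPEAK_PATH does not exist: {path}")
    return problems

if __name__ == "__main__":
    for line in check_espeak_env() or ["espeak-NG environment looks OK"]:
        print(line)
```

Note that variables added through the System Properties dialog only become visible to terminals opened afterwards, so reopen your shell before re-checking.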
Install pip dependencies:

```bash
cd Voice-Assistant
pip install -r requirements.txt
pip install PyAudio
# pip install transformers==4.39.0  # newer releases raise errors when switching to offline mode
# pip install openwakeword
# pip install SoundFile torchaudio munch torch pydub pyyaml librosa git+https://github.com/resemble-ai/monotonic_align.git
```

- Configure the voice assistant settings as needed:
```bash
cd /home/vboxuser/Voice-Assistant/Irene-Voice-Assistant/options
sudo nano core.json
```

- Optionally, the online voice assistant can be run from the Voice_assistant_online folder:

```bash
cd Voice_assistant_online
python voice_assistant_online.py
```

To start the Emotional Offline Voice Assistant, run the following command:

```bash
cd Voice-Assistant/Irene-Voice-Assistant
python3 runva_neuralnet.py
```

Once the voice assistant is running, you can interact with it using your microphone or by providing text input.
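If you prefer to adjust `core.json` from a script rather than in an editor, a minimal sketch follows. The `set_option` helper and the `log_console` key are hypothetical examples, not documented options; check the real file for its actual schema.

```python
import json
from pathlib import Path

def set_option(path: Path, key: str, value):
    """Load a JSON config, set one top-level key, and write it back."""
    cfg = json.loads(path.read_text(encoding="utf-8"))
    cfg[key] = value
    path.write_text(json.dumps(cfg, indent=2, ensure_ascii=False), encoding="utf-8")

# Example usage (key name is a placeholder):
# set_option(Path("Irene-Voice-Assistant/options/core.json"), "log_console", True)
```

Round-tripping through `json.loads`/`json.dumps` also catches syntax errors early, which `nano` will not.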
To create a profiling image using gprof2dot from the provided script, follow these steps:

- Profile the script with `cProfile`: modify the script to use the `cProfile` module to collect profiling data.
- Run the script to generate profiling data: execute the modified script to produce a `.prof` file containing the profiling data.
- Convert the profiling data to a dot file: use `gprof2dot` to convert the `.prof` file to a `.dot` file.
- Generate an image from the dot file: use Graphviz to convert the `.dot` file to an image format like PNG.

Here is the step-by-step process:
Add the cProfile module to your script to collect profiling data:

```python
import cProfile

if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    main()  # your script's entry point
    profiler.disable()
    profiler.dump_stats('profile_data.prof')
```

Run the modified script to generate the profile_data.prof file:
```bash
python your_script.py
```

If the script runs indefinitely, terminate it with Ctrl+C after a few seconds.
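One caveat with Ctrl+C: a bare KeyboardInterrupt would normally abort the script before `dump_stats` runs, so nothing gets written. Wrapping the entry point in try/finally guarantees the `.prof` file is produced even after an interrupt. A self-contained sketch (`main` here is a stand-in workload, not the assistant's real entry point):

```python
import cProfile

def main():
    # Stand-in for the assistant's long-running loop
    for _ in range(200):
        sum(range(1000))

def run_profiled(out="profile_data.prof"):
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        main()
    except KeyboardInterrupt:
        pass  # Ctrl+C lands here instead of killing the process outright
    finally:
        profiler.disable()
        profiler.dump_stats(out)  # runs even after an interrupt

if __name__ == "__main__":
    run_profiled()
```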
Use gprof2dot to convert the .prof file to a .dot file:

```bash
gprof2dot -f pstats profile_data.prof -o profile_data.dot
```

Use Graphviz to convert the .dot file to an image format like PNG:

```bash
dot -Tpng profile_data.dot -o profile_data.png
```

After these steps, you will have a profile_data.png file that visually represents the profiling data of your script.
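If you only need the numbers rather than a graph, the standard-library `pstats` module can summarize the same `.prof` file without gprof2dot or Graphviz. A quick sketch (`workload` is a stand-in for the profiled script):

```python
import cProfile
import pstats

def workload():
    # Stand-in for the profiled script
    return sum(i * i for i in range(10_000))

# Collect stats the same way the snippet above does
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()
profiler.dump_stats("profile_data.prof")

# Load the .prof file and print the 10 most expensive functions
stats = pstats.Stats("profile_data.prof")
stats.sort_stats("cumulative").print_stats(10)
```

Sorting by `"cumulative"` surfaces the functions that dominate total runtime, including time spent in their callees; `"tottime"` instead ranks by time spent in the function body itself.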
We welcome contributions to the Emotional Offline Voice Assistant project. If you're interested in contributing, please read our contribution guidelines and code of conduct before getting started.
This project is licensed under the MIT License.