Record speech with microphone, translate and synthesize translated speech (en->de, de->en)

curious-broccoli/speech_pipeline

Speech pipeline

A Python project for Linux that will:

  • recognize human speech (German or English), from either a microphone or a video
  • then translate it to English or German
  • then convert it into speech (text-to-speech).
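The three stages above can be pictured as a simple chain of functions. The following is only an illustrative sketch, not the project's actual code: the `SpeechPipeline` class and the stub stages are hypothetical names, and the real recognition, translation and synthesis engines are deliberately replaced by placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechPipeline:
    """Hypothetical container that chains the three stages in order."""

    recognize: Callable[[bytes], str]    # audio -> recognized text
    translate: Callable[[str], str]      # source text -> translated text
    synthesize: Callable[[str], bytes]   # translated text -> audio

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Placeholder stages so the sketch runs without any models installed.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str) -> str:
    return f"[de] {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechPipeline(fake_recognize, fake_translate, fake_synthesize)
result = pipeline.run(b"hello world")
```

Keeping each stage behind a plain callable makes it possible to swap the language direction (en->de or de->en) without touching the rest of the chain.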

Installation

Tested on Ubuntu 22.04.1 LTS with Python 3.10.4 and pip 22.2.2

  • Clone the repository, change into its directory and run bash install.sh
  • Confirm the installation of the programs the script needs
  • Activate the virtual environment: source ~/venv_speech_pipeline/bin/activate
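After activating the environment, it can be useful to confirm that Python is really running inside it. The snippet below is a small sketch using only the standard library; the helper names `in_virtualenv` and `describe_environment` are made up for this example and are not part of the project.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_environment() -> dict:
    """Collect the interpreter version and environment location."""
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "in_venv": in_virtualenv(),
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `in_venv` is reported as False, the activation step was probably skipped in the current shell.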

Models

All machine learning models will automatically be downloaded the first time they are needed:

  • Vosk models in ~/.cache/vosk/ (more than 1 GB each)
  • Marian models in the working (git) directory
  • TTS models in ~/.local/share/tts/
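Because the models are large, it may help to check how much disk space the caches already occupy. This is a rough sketch under the assumption that the directories listed above are used unchanged; `MODEL_DIRS`, `dir_size_bytes` and `report` are hypothetical names, and the Marian models are left out because their exact subdirectory is not stated here.

```python
from pathlib import Path

# Cache locations taken from the list above.
MODEL_DIRS = {
    "vosk": Path.home() / ".cache" / "vosk",
    "tts": Path.home() / ".local" / "share" / "tts",
}


def dir_size_bytes(path: Path) -> int:
    """Sum the sizes of all regular files below path (0 if it is missing)."""
    if not path.is_dir():
        return 0
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def report(dirs: dict = MODEL_DIRS) -> dict:
    """Map each model family to the size of its cache directory in bytes."""
    return {name: dir_size_bytes(path) for name, path in dirs.items()}


if __name__ == "__main__":
    for name, size in report().items():
        print(f"{name}: {size / 1e9:.2f} GB")
```

A size of 0 simply means the corresponding models have not been downloaded yet.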

Usage

Run python3 process_speech.py {mic,video} --help for more information

From a video file

Run python3 process_speech.py video [file]

From a microphone

Run python3 process_speech.py mic
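The two subcommands suggest a command-line layout along the following lines. This is a hedged sketch built on the standard argparse module, not the actual contents of process_speech.py: `build_parser` is a hypothetical helper, the help texts are invented, and treating the file argument as required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with one subcommand per input source."""
    parser = argparse.ArgumentParser(prog="process_speech.py")
    sources = parser.add_subparsers(dest="source", required=True)

    # "mic" takes no positional arguments in this sketch.
    sources.add_parser("mic", help="record speech from a microphone")

    # "video" takes the path of the file to process (assumed required).
    video = sources.add_parser("video", help="read speech from a video file")
    video.add_argument("file", help="path to the video file")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With this layout, each subcommand gets its own --help output, which matches the {mic,video} --help form shown above.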
