Project DeepSpeech

DeepSpeech is an open-source Speech-To-Text engine that uses a model trained with machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.

NOTE: This documentation applies to the MASTER version of DeepSpeech only. Documentation for the latest stable version is published on deepspeech.readthedocs.io.

To install and use deepspeech, all you have to do is:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-venv/
source $HOME/tmp/deepspeech-venv/bin/activate

# Install DeepSpeech
pip3 install deepspeech

# Download pre-trained English model and extract
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/deepspeech-0.6.1-models.tar.gz
tar xvf deepspeech-0.6.1-models.tar.gz

# Download example audio files
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.6.1/audio-0.6.1.tar.gz
tar xvf audio-0.6.1.tar.gz

# Transcribe an audio file
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav
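
To transcribe several files in one go, the command above can be wrapped in a small loop. This is a minimal sketch rather than part of DeepSpeech: the `transcribe_dir` function and the `MODEL_DIR` variable are hypothetical names, and the sketch assumes the model archive was extracted into the current directory as shown above.

```shell
# Hypothetical helper (not shipped with DeepSpeech): run the transcription
# command shown above on every .wav file in a directory.
# MODEL_DIR is an assumed variable name pointing at the extracted model files.
MODEL_DIR="${MODEL_DIR:-deepspeech-0.6.1-models}"

transcribe_dir() {
    local audio_dir="$1"
    local wav
    for wav in "$audio_dir"/*.wav; do
        # Skip the unexpanded glob when the directory contains no .wav files
        [ -e "$wav" ] || continue
        # Prefix each result with the file it came from
        printf '%s: ' "$wav"
        deepspeech --model "$MODEL_DIR/output_graph.pbmm" \
                   --scorer "$MODEL_DIR/kenlm.scorer" \
                   --audio "$wav"
    done
}
```

With the example audio extracted, `transcribe_dir audio` would run the transcription once per file.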

A pre-trained English model is available for use and can be downloaded as shown above. A package with some example audio files is available for download in our release notes.
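
If you re-run the setup often, the download steps can be wrapped so that an archive already on disk is not fetched again. The sketch below is illustrative only: `DS_VERSION`, `ds_asset_url` and `fetch_and_extract` are hypothetical names, and only the 0.6.1 asset names appear in this document, so whether other releases follow the same naming is an assumption to check against the release notes.

```shell
# Assumption: release assets live under the URL pattern used for v0.6.1.
DS_VERSION="${DS_VERSION:-0.6.1}"
DS_RELEASES="https://github.com/mozilla/DeepSpeech/releases/download"

# Build the download URL for a release asset file name.
ds_asset_url() {
    printf '%s/v%s/%s\n' "$DS_RELEASES" "$DS_VERSION" "$1"
}

# Download an asset unless it is already present, then extract it.
fetch_and_extract() {
    local asset="$1"
    [ -f "$asset" ] || curl -LO "$(ds_asset_url "$asset")"
    tar xvf "$asset"
}
```

For the files used in this README, that would be `fetch_and_extract "deepspeech-${DS_VERSION}-models.tar.gz"` followed by `fetch_and_extract "audio-${DS_VERSION}.tar.gz"`.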

Faster inference can be performed using a supported NVIDIA GPU on Linux. See the release notes to find which GPUs are supported. To run deepspeech on a GPU, install the GPU-specific package:

# Create and activate a virtualenv
virtualenv -p python3 $HOME/tmp/deepspeech-gpu-venv/
source $HOME/tmp/deepspeech-gpu-venv/bin/activate

# Install DeepSpeech CUDA enabled package
pip3 install deepspeech-gpu

# Transcribe an audio file (uses the model and audio archives downloaded above)
deepspeech --model deepspeech-0.6.1-models/output_graph.pbmm --scorer deepspeech-0.6.1-models/kenlm.scorer --audio audio/2830-3980-0043.wav

Please ensure you have the required CUDA dependencies.
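
A setup script can pick between the two packages automatically. The sketch below is a rough heuristic, not an official check: it assumes that finding the `nvidia-smi` tool on the `PATH` means an NVIDIA driver is installed, and it does not verify that the GPU or the CUDA dependencies are actually supported. The function name `ds_pip_package` is hypothetical.

```shell
# Heuristic (assumption): nvidia-smi on the PATH suggests an NVIDIA driver is present.
# This does NOT confirm that the GPU or CUDA version is supported by deepspeech-gpu.
ds_pip_package() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "deepspeech-gpu"
    else
        echo "deepspeech"
    fi
}
```

Inside an activated virtualenv, `pip3 install "$(ds_pip_package)"` would then install whichever package the heuristic selects.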

See the output of deepspeech -h for more information on the use of deepspeech. (If you experience problems running deepspeech, please check the required runtime dependencies.)
