This repository contains scripts used to test DeepSpeech (v0.2.0-alpha.8). It is meant for prototyping only. Results on speech that is noisy or very dissimilar to the training data are really bad.
For the language model, a text corpus is used that is also provided by the people behind the "German speech data corpus".
- (~30h)
- (~50h)
- (~150h)
First, the following paths have to be defined:
- tuda_corpus_path: Path where the German Distant Speech Corpus is stored.
- voxforge_corpus_path: Path where the Voxforge German Speech data is stored.
- swc_corpus_path: Path where the Spoken Wikipedia Corpus is stored.
- text_corpus_path: Path where the text corpus is stored.
- exp_path: A directory where all output files are written to.
- kenlm_bin: Path to the directory containing the KenLM binaries (lmplz, build_binary).
- deepspeech: Path to the cloned DeepSpeech repository.
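Before running anything, it can help to check that all of these paths are actually defined. The following is a minimal sketch, assuming the paths are exported as environment variables under exactly the names listed above; the helper `missing_settings` is hypothetical and not part of this repository.

```python
import os

# Names taken from the list above; exporting them as environment
# variables is an assumption of this sketch.
REQUIRED = [
    "tuda_corpus_path",
    "voxforge_corpus_path",
    "swc_corpus_path",
    "text_corpus_path",
    "exp_path",
    "kenlm_bin",
    "deepspeech",
]


def missing_settings(env):
    """Return the required names that are unset or empty in the mapping env."""
    return [name for name in REQUIRED if not env.get(name)]


# Report which paths still need to be defined in the current environment.
missing = missing_settings(dict(os.environ))
print("Undefined paths:", ", ".join(missing) if missing else "none")
```

Running this before the other commands gives an early hint when a path was forgotten, instead of a failure halfway through data preparation.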
The commands are expected to be executed from the path where this repository is cloned. Take a look at
as an example for executing all the commands.
pip install -r requirements.txt
For the requirements of DeepSpeech itself, check out its repository. To get the native client with GPU support, use:
python3 util/ --target native_client --branch "v0.2.0-alpha.8" --arch gpu
- Download the text corpus and store it in text_corpus_path.
- Download the German Distant Speech Corpus (TUDA) and store it in tuda_corpus_path.
- Download the Spoken Wikipedia Corpus (SWC) from and prepare it according to
- Download the Voxforge German Speech data (via the audiomate Python library):
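from audiomate.corpus import io
dl = io.VoxforgeDownloader(lang='de')
# Fetch the corpus into the directory configured as voxforge_corpus_path.
dl.download(voxforge_corpus_path)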
# creates the csv files defining the audio data used for training
./ $tuda_corpus_path $voxforge_corpus_path $exp_path/data
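For orientation, DeepSpeech reads its training data from CSV files with the columns `wav_filename`, `wav_filesize`, and `transcript`. The sketch below only illustrates that layout; the helper `write_manifest`, the file path, and the file size are made up for the example and are not taken from this repository's script.

```python
import csv
import io

# Column names expected by the DeepSpeech CSV importer.
FIELDS = ["wav_filename", "wav_filesize", "transcript"]


def write_manifest(rows, handle):
    """Write (path, size_in_bytes, transcript) tuples as a DeepSpeech-style CSV."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(FIELDS)
    for path, size, transcript in rows:
        writer.writerow([path, size, transcript])


# Hypothetical utterance; path and size are placeholders.
buffer = io.StringIO()
write_manifest([("/data/utt1.wav", 64044, "der bundesrat")], buffer)
print(buffer.getvalue())
```

In a real run the size would come from the audio file on disk (for example via `os.path.getsize`), and one such file would be written per train, dev, and test split.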
# First the text is normalized and cleaned.
./ $text_corpus_path $exp_path/clean_vocab.txt
# KenLM is used to build the LM
$kenlm_bin/lmplz --text $exp_path/clean_vocab.txt --arpa $exp_path/ --o 3
$kenlm_bin/build_binary -T -s $exp_path/ $exp_path/lm.binary
# The deepspeech tools are used to create the trie
$deepspeech/native_client/generate_trie data/alphabet.txt $exp_path/lm.binary $exp_path/clean_vocab.txt $exp_path/trie
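What the normalization step does in detail is not shown here. As a rough illustration only, a cleaning pass for a German text corpus might lowercase the text, drop every character outside the alphabet, and collapse whitespace. The character set below (a to z, the umlauts, ß, and space) is an assumption and may differ from data/alphabet.txt; `normalize_line` is a hypothetical helper.

```python
import re

# Assumed alphabet: lowercase ASCII letters, German umlauts, sharp s, space.
# The real alphabet is defined in data/alphabet.txt and may differ.
_DISALLOWED = re.compile(r"[^a-zäöüß ]")


def normalize_line(line):
    """Lowercase a line, blank out disallowed characters, collapse whitespace."""
    text = line.lower()
    text = _DISALLOWED.sub(" ", text)
    return " ".join(text.split())


print(normalize_line("Der Bundesrat, 2018: Größe & Form!"))
# prints: der bundesrat größe form
```

Whatever the actual rules are, the language model text and the training transcripts need to use the same character set, since the trie is built against data/alphabet.txt.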
./ $deepspeech $(realpath data/alphabet.txt) $exp_path
I Test of Epoch 19 - WER: 0.667205, loss: 69.56213065852289, mean edit distance: 0.287312
I --------------------------------------------------------------------------------
I WER: 0.333333, loss: 0.544307, mean edit distance: 0.200000
I - src: "p c b"
I - res: "p c "
I --------------------------------------------------------------------------------
I WER: 0.500000, loss: 0.533773, mean edit distance: 0.142857
I - src: "oder bundesrat"
I - res: "der bundesrat "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.102555, mean edit distance: 0.125000
I - src: "handlung"
I - res: "handlunge"
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.152456, mean edit distance: 0.500000
I - src: "erde"
I - res: "er "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.152456, mean edit distance: 0.500000
I - src: "erde"
I - res: "er "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.555418, mean edit distance: 0.500000
I - src: "form"
I - res: "vor "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.714687, mean edit distance: 0.250000
I - src: "werk"
I - res: "wer "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.851070, mean edit distance: 0.200000
I - src: "texte"
I - res: "text "
I --------------------------------------------------------------------------------
I WER: 1.000000, loss: 0.912912, mean edit distance: 0.600000
I - src: "misst"
I - res: "mit "
I --------------------------------------------------------------------------------
I WER: 2.000000, loss: 0.898193, mean edit distance: 0.250000
I - src: "beilagen"
I - res: "bei lagen "
I --------------------------------------------------------------------------------
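The per-sample numbers in the log can be reproduced with the usual definitions: WER as the word-level edit distance divided by the number of reference words, and mean edit distance as the character-level edit distance divided by the reference length. The following is a minimal sketch of those definitions, not DeepSpeech's own code; it assumes the trailing space in `res` counts as a character, which matches the logged values.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(src, res):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = src.split()
    return levenshtein(ref_words, res.split()) / len(ref_words)


def char_edit_distance(src, res):
    """Character-level edit distance, normalized by reference length."""
    return levenshtein(src, res) / len(src)


print(wer("oder bundesrat", "der bundesrat "))       # 0.5
print(wer("beilagen", "bei lagen "))                 # 2.0
print(char_edit_distance("beilagen", "bei lagen "))  # 0.25
```

The last sample shows why WER can exceed 1: splitting the single reference word "beilagen" into "bei lagen" costs one substitution plus one insertion against a reference of length one, even though only two characters (the spaces) differ.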