🔧 (server): Update list of models, remove speaker id model
pajowu committed Jan 3, 2022
1 parent 66fa6ad commit 1c3cdf0
Showing 2 changed files with 43 additions and 17 deletions.
57 changes: 40 additions & 17 deletions server/app/models.yml
@@ -3,23 +3,28 @@

 English:
 - name: small
-url: http://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
+url: https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
 description: Lightweight wideband model for Android and RPi
 size: 40M
 wer_speed: 9.85 (librispeech test-clean) 10.38 (tedlium)
 - name: big
-url: http://alphacephei.com/vosk/models/vosk-model-en-us-0.21.zip
-description: Accurate wideband model
-size: 1.6G
-wer_speed: 5.43 (librispeech test-clean) 6.42 (tedlium) 40.63(callcenter)
+url: https://alphacephei.com/vosk/models/vosk-model-en-us-0.22.zip
+description: Accurate generic US English model
+size: 1.8G
+wer_speed: 5.69 (librispeech test-clean) 6.05 (tedlium) 29.78(callcenter)
+- name: lgraph
+url: https://alphacephei.com/vosk/models/vosk-model-en-us-0.22-lgraph.zip
+description: Big US English model with dynamic graph
+size: 128M
+wer_speed: 7.82 (librispeech) 8.20 (tedlium)
 Indian English:
 - name: big
 url: https://alphacephei.com/vosk/models/vosk-model-en-in-0.4.zip
 description: Generic Indian English model for telecom and broadcast
 size: 370M
 wer_speed: TBD
 - name: small
-url: http://alphacephei.com/vosk/models/vosk-model-small-en-in-0.4.zip
+url: https://alphacephei.com/vosk/models/vosk-model-small-en-in-0.4.zip
 description: Lightweight Indian English model for mobile applications
 size: 36M
 wer_speed: TBD
@@ -34,18 +39,42 @@ Chinese:
 description: Lightweight wideband model for Android and RPi
 size: 32M
 wer_speed: TBD
+- name: big
+url: https://alphacephei.com/vosk/models/vosk-model-cn-kaldi-multicn-2.zip
+description: Original Wideband Kaldi multi-cn model from <a href="https://kaldi-asr.org/models/m11">Kaldi</a>
+size: 195M
+wer_speed: TBD
+- name: lgraph
+url: https://alphacephei.com/vosk/models/vosk-model-cn-kaldi-multicn-2-lgraph.zip
+description: Original Wideband Kaldi multi-cn model from <a href="https://kaldi-asr.org/models/m11">Kaldi</a>
+with dynamic graph
+size: 101M
+wer_speed: TBD
+- name: big
+url: https://alphacephei.com/vosk/models/vosk-model-cn-kaldi-cvte-2.zip
+description: CVTE Kaldi model from <a href="https://kaldi-asr.org/models/m2">Kaldi</a>
+size: 3.3G
+wer_speed: TBD
 Russian:
+- name: big
+url: https://alphacephei.com/vosk/models/vosk-model-ru-0.22.zip
+description: Big mixed band Russian model for server processing
+size: 1.5G
+wer_speed: 5.74 (our audiobooks) 13.35 (open_stt audiobooks) 20.73 (open_stt youtube)
+37.38 (openstt calls) 8.65 (golos crowd) 19.71 (sova devices)
+- name: small
+url: https://alphacephei.com/vosk/models/vosk-model-small-ru-0.22.zip
+description: Lightweight wideband model for Android/iOS and RPi
+size: 45M
+wer_speed: 22.71 (openstt audiobooks) 31.97 (openstt youtube) 29.89 (sova devices)
+11.79 (golos crowd)
+Old Russian:
 - name: big
 url: https://alphacephei.com/vosk/models/vosk-model-ru-0.10.zip
 description: Big narrowband Russian model for server processing
 size: 2.5G
 wer_speed: 5.71 (our audiobooks) 16.26 (open_stt audiobooks) 26.20 (public_youtube_700_val
 open_stt) 40.15 (asr_calls_2_val open_stt)
-- name: small
-url: https://alphacephei.com/vosk/models/vosk-model-small-ru-0.15.zip
-description: Lightweight wideband model for Android/iOS and RPi
-size: 43M
-wer_speed: 22.21 (openstt audiobooks) 30.89 (openstt youtube) 28.65 (sova devices)
 French:
 - name: small
 url: https://alphacephei.com/vosk/models/vosk-model-small-fr-pguyot-0.3.zip
@@ -188,9 +217,3 @@ Swedish:
 project</a>
 size: 289M
 wer_speed: TBD
-Speaker identification model:
-- name: big
-url: https://alphacephei.com/vosk/models/vosk-model-spk-0.4.zip
-description: Model for speaker identification, should work for all languages
-size: 13M
-wer_speed: TBD
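
Several hunks in this file only switch a `url` from `http://` to `https://`. A check like the one below could catch any entry that was missed. This is a minimal sketch, not part of the commit: it assumes the file parses into a mapping from language name to a list of entries with `name` and `url` keys (as the hunks suggest), and `find_insecure_urls` and `sample` are hypothetical names used only for illustration.

```python
from urllib.parse import urlparse


def find_insecure_urls(models):
    """Return (language, name, url) for every entry whose URL scheme is not https.

    `models` is assumed to map a language name to a list of dicts
    that each carry at least the keys "name" and "url".
    """
    insecure = []
    for language, entries in models.items():
        for entry in entries:
            if urlparse(entry["url"]).scheme != "https":
                insecure.append((language, entry["name"], entry["url"]))
    return insecure


# Hypothetical sample mirroring the pre-commit state of two English entries.
sample = {
    "English": [
        {
            "name": "small",
            "url": "https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip",
        },
        {
            "name": "big",
            "url": "http://alphacephei.com/vosk/models/vosk-model-en-us-0.21.zip",
        },
    ],
}

for language, name, url in find_insecure_urls(sample):
    print(f"{language}/{name}: {url}")
```

In a real check, the mapping would come from loading `server/app/models.yml` with a YAML parser rather than from an inline dictionary.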
3 changes: 3 additions & 0 deletions server/scripts/generate_models_list.py
@@ -27,6 +27,9 @@
 if current_lang == "English Other" or "not" in raw["Notes"].text.lower():
     continue
 
+if current_lang == "Speaker identification model":
+    continue
+
 name = "big"
 possible_names = ["small", "nano", "zamia", "linto-2.0", "linto-2.2", "lgraph"]
 for possible_name in possible_names:
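
The hunk above is cut off before the loop body, so how a name is actually chosen is not visible here. The sketch below shows one plausible reading under stated assumptions: the name defaults to `"big"` and is replaced by the first entry of `possible_names` found in the model identifier. The helpers `should_skip` and `pick_name`, and the use of the model identifier as the string to match against, are hypothetical and not taken from the script.

```python
# Languages the generator skips entirely (both values appear in the diff).
SKIPPED_LANGS = {"English Other", "Speaker identification model"}

# Name candidates, in the order listed in the diff.
POSSIBLE_NAMES = ["small", "nano", "zamia", "linto-2.0", "linto-2.2", "lgraph"]


def should_skip(current_lang, notes):
    """Mirror the two skip conditions: a skipped language, or "not" in the notes."""
    return current_lang in SKIPPED_LANGS or "not" in notes.lower()


def pick_name(model_id):
    """Assumed loop body: the first candidate contained in the model id wins."""
    for possible_name in POSSIBLE_NAMES:
        if possible_name in model_id:
            return possible_name
    return "big"  # default, as in the original `name = "big"`


print(pick_name("vosk-model-en-us-0.22-lgraph"))
print(pick_name("vosk-model-small-en-us-0.15"))
print(pick_name("vosk-model-en-us-0.22"))
```

Collecting the skipped languages in one set keeps the two `continue` branches from drifting apart if more sections are excluded later. Note that the substring test on `"not"` also matches words that merely contain it, which is the same behavior as the condition shown in the diff.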
