whisper as stt engine #240

Open
royrogermcfreely opened this issue Dec 1, 2023 · 2 comments

@royrogermcfreely

Is your feature request related to a problem? Please describe.
no

Describe the solution you'd like
Use the Whisper STT engine within SEPIA.

Additional context
Home Assistant had its "Year of the Voice", and there you can use Whisper on an RPi 4.
I tried it on a VM and got pretty good results.

There seem to be two versions; I haven't looked much into the differences:

whisper: https://github.com/openai/whisper

faster-whisper: https://github.com/SYSTRAN/faster-whisper (this is the one Home Assistant uses)

I also found a Docker image from Rhasspy: https://hub.docker.com/r/rhasspy/wyoming-whisper

@fquirin
Contributor

fquirin commented Apr 26, 2024

Sorry for the late reply. I'm currently still taking a little break from the project, but I'm determined to resume work later this year.

As for Whisper, I actually have a working beta version. Unfortunately, I did not finish the release before I took a break, but it was already working pretty well. As soon as I resume work, this will be the first task.

@fquirin
Contributor

fquirin commented Apr 26, 2024

Two additional things I should mention.

  1. Whisper is pretty demanding for STT. The smallest model will run fast enough on a Raspberry Pi 5 to give an OK user experience, but it isn't very accurate. The larger models require better hardware and tend to hallucinate quite a bit. Nevertheless, support will come so that everyone can play around with their favorite service ^^.

  2. I've also made a PoC for Nvidia NeMo. My hope is that their models will evolve pretty quickly with better support for custom vocabulary. We'll see.
