Replies: 2 comments
-
Stock Whisper doesn't generate accurate timestamps; more often than not, they're only accurate to the nearest whole second. Derived projects like WhisperX employ additional AI models (e.g. wav2vec 2.0) to improve timestamp accuracy.
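To illustrate why whole-second granularity hurts subtitles, here is a minimal sketch. The cue times are made up for illustration, and `srt_timestamp` is a hypothetical helper, not part of Whisper or WhisperX. It only shows how far a cue boundary can drift when it is snapped to the nearest second:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT-style timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


# Hypothetical "true" cue boundaries in seconds (invented for this example).
true_cues = [(12.34, 14.81), (15.07, 17.62)]

for start, end in true_cues:
    # Snap both boundaries to the nearest whole second.
    coarse_start, coarse_end = round(start), round(end)
    # Worst-case offset between the precise and the snapped boundary.
    drift = max(abs(coarse_start - start), abs(coarse_end - end))
    print(
        f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        f" | whole-second: {srt_timestamp(coarse_start)} --> {srt_timestamp(coarse_end)}"
        f" | drift up to {drift:.2f}s"
    )
```

Each boundary can land up to half a second early or late, which is enough for a subtitle to appear noticeably before or after the line is spoken.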
-
I can get whisper.cpp to run. The Python ecosystem is a dumpster fire. Or something like purgatory. Blessings to ggerganov!
-
I used this software to create subtitles for "The Princes in the Tower" with Subtitle Edit, which uses Whisper as its STT engine.
I was in awe of how accurately it transcribed the text for this movie. It's the only time I've used it so far because, alas, it was still a lot of work to get good subtitles, but that wasn't because of the text it generated.
The text actually needed only a few corrections.
No, the problem was only with the timing.
I really don't understand how a task as difficult as converting speech to written text can be done so well, while the timing is wrong in so many places.
I don't know how Whisper is integrated into Subtitle Edit, but when I asked the programmer, he said timing was Whisper's only weak point.
Is this timing problem a known issue?
BTW, I used the CPP engine and haven't tried any others yet.