Replies: 1 comment
For me it worked with the large one. You'll want to also add the |
Hi,
I'm using --translate (translate from the source language to English), and sometimes the translation works.
But often it does not: instead I see a note about the language, or the text appears in the original language.
Also, the results don't indicate whether translation was actually applied; having this information would be very helpful.
I'm running Whisper CPP on several long recordings (20 min to 2 h) with default settings and 15 threads (-p 1 -t 15).
I've tried the multilingual models: base, small, medium, large.
I'm using WAV files, not streaming.
Examples:
model=small:
[00:32:12.000 --> 00:32:15.000] (speaking in Spanish)
[00:33:18.000 --> 00:33:27.000] (speaking in Spanish)
[00:33:28.000 --> 00:33:31.000] (speaking in Spanish)
[00:33:31.000 --> 00:33:45.000] (speaking in Spanish)
[00:33:45.000 --> 00:33:48.000] (speaking in Spanish)
model=small, --max-len 1:
[00:01:32.000 --> 00:01:32.580] Much
[00:01:32.580 --> 00:01:32.840] os
[00:01:32.840 --> 00:01:33.850] gracias
[00:01:33.850 --> 00:01:34.130] ,
[00:01:34.130 --> 00:01:34.950] amigos
[00:01:34.950 --> 00:01:35.000] .
[00:01:35.000 --> 00:01:35.160]
[00:01:35.160 --> 00:01:35.570] Hola
[00:01:35.570 --> 00:01:35.870] ,
[00:01:35.870 --> 00:01:36.560] amigo
[00:01:36.560 --> 00:01:37.000] .
[00:01:37.000 --> 00:01:37.000] ¿
[00:01:37.000 --> 00:01:37.000] Cómo
[00:01:37.000 --> 00:01:37.000] estás
[00:01:37.000 --> 00:01:37.000] ?
[00:01:37.000 --> 00:01:37.160] Ma
[00:01:37.160 --> 00:01:37.650] ñana
[00:01:37.650 --> 00:01:37.770] ir
[00:01:37.770 --> 00:01:37.940] é
[00:01:37.940 --> 00:01:38.020] a
[00:01:38.020 --> 00:01:38.190] la
[00:01:38.190 --> 00:01:38.710] ciudad
[00:01:38.710 --> 00:01:39.030] .
model=medium:
[00:02:36.480 --> 00:02:37.960] [SPEAKING SPANISH]
[00:02:37.960 --> 00:02:40.960] [SPEAKING SPANISH]
[00:02:40.960 --> 00:02:41.960] [SPEAKING SPANISH]
[00:02:41.960 --> 00:02:42.960] [SPEAKING SPANISH]
model=large, --max-len 1:
[00:01:32.000 --> 00:01:34.010] [
[00:01:34.010 --> 00:01:36.000] S
[00:01:36.000 --> 00:01:42.010] pan
[00:01:42.010 --> 00:01:47.980] ish
[00:01:47.980 --> 00:01:50.000] ]
I see a lot of similar cases.
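To get a rough idea of how often this happens across a long recording, the saved output can be scanned for these placeholder notes. The snippet below is only a minimal sketch, assuming the output was saved as plain text in the timestamped format shown above; the helper names (parse_segments, find_placeholders) and the "speaking ..." pattern are my own assumptions, not part of Whisper CPP. It will not catch segments that were left in the original language, nor notes split into tokens by --max-len 1.

```python
import re

# Matches one line of timestamped output, e.g.
# "[00:32:12.000 --> 00:32:15.000] (speaking in Spanish)"
SEGMENT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$"
)

# Heuristic (an assumption): the whole segment text is a bracketed or
# parenthesised note such as "(speaking in Spanish)" or "[SPEAKING SPANISH]".
PLACEHOLDER_RE = re.compile(r"^[\[(]\s*speaking\b[^\])]*[\])]$", re.IGNORECASE)


def parse_segments(text):
    """Return a list of (start, end, text) tuples for each timestamped line."""
    segments = []
    for line in text.splitlines():
        m = SEGMENT_RE.match(line.strip())
        if m:
            segments.append((m.group(1), m.group(2), m.group(3).strip()))
    return segments


def find_placeholders(segments):
    """Return the segments whose text is only a 'speaking ...' note."""
    return [s for s in segments if PLACEHOLDER_RE.match(s[2])]


sample = """\
[00:32:12.000 --> 00:32:15.000] (speaking in Spanish)
[00:02:36.480 --> 00:02:37.960] [SPEAKING SPANISH]
[00:01:35.160 --> 00:01:35.570] Hola
"""

segments = parse_segments(sample)
flagged = find_placeholders(segments)
print(f"{len(flagged)} of {len(segments)} segments look like placeholders")
# prints "2 of 3 segments look like placeholders"
```

The flagged tuples carry the start and end timestamps, so the affected stretches of audio can be located and re-run with a different model or settings.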
Love Whisper CPP,
Piotr