Some speech zones with low volume or high distortion are not detected #45

Answered by Jeronymous
olevanss asked this question in Q&A

Can you please describe the problem more precisely:

  • What kind of phrases are problematic? Could they be things like speech disfluencies?
  • Do your 3 plots correspond to the 3 cases you describe? (The first and last ones do not show obviously wrong behavior.)

If you are using the transcribe() function in Python, you can:

  • try beam_size=5, best_of=5, temperature=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0) (for cases where some phrases are missing from the transcription)
  • try trust_whisper_timestamps=False (then there is no need to tune refine_whisper_precision; all timestamps are simply recomputed)
  • try a larger model
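
For reference, here is a minimal sketch of how the suggestions above could be passed to transcribe(). It assumes the whisper_timestamped package is installed; the helper names (build_transcribe_options, transcribe_file), the default model name "medium", and the file name "audio.wav" are only placeholders for illustration, not something taken from this thread.

```python
def build_transcribe_options(recompute_timestamps=False):
    """Collect the keyword arguments suggested above (hypothetical helper)."""
    options = {
        # Beam search with temperature fallback, for phrases that go missing.
        "beam_size": 5,
        "best_of": 5,
        "temperature": (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    }
    if recompute_timestamps:
        # Do not rely on Whisper's own timestamps; recompute all of them.
        options["trust_whisper_timestamps"] = False
    return options


def transcribe_file(path, model_name="medium", **options):
    """Transcribe one file; the model name is an assumed placeholder."""
    # Imported lazily so the option helper works without the package installed.
    import whisper_timestamped as whisper

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(path)
    return whisper.transcribe(model, audio, **options)
```

It could then be called as transcribe_file("audio.wav", **build_transcribe_options(recompute_timestamps=True)). Trying a larger model only means changing model_name.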

Answer selected by olevanss