Hi @fury88 The Coqui AI model has a habit of behaving this way. Whilst there is no absolute fix, the quality of the base audio sample you are using matters immensely: the clearer the better, and anywhere from 6 to 30 seconds in length. Finetuning is also a possibility for resolving this, by training the model on the voice you want to re-create. Please bear in mind that XTTS is tuned to recreate human-sounding speech, so a non-human audio sample, e.g. a cartoon voice, will also affect the result, which is another reason to consider finetuning. You can also play with the Temperature and Repetition penalty settings to keep the audio generation closer to the original samples. Finally, you can try multi-sample generation to improve the quality of the voice by giving the XTTS model more samples to work with. I've just added this in commit b7aa3a7, so you would need to update to be able to use it. If you continue to have issues, I would suggest researching the Coqui forum, as it's their model and code that generates the TTS; AllTalk is purely handing the text over. Thanks
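As a rough illustration of the sample-length advice above, here is a minimal sketch that filters a set of reference clips down to those in the 6-30 second range before using them for single- or multi-sample generation. It uses only Python's standard `wave` module and assumes uncompressed PCM WAV files; the function and constant names are hypothetical and are not part of AllTalk or Coqui.

```python
import wave

# Suggested range for reference clips, taken from the advice above.
MIN_SECONDS = 6.0
MAX_SECONDS = 30.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def usable_reference_clips(paths):
    """Keep only the clips whose length falls inside the suggested range.

    Hypothetical helper: it checks length only, not clarity or noise,
    which still need to be judged by ear.
    """
    return [
        path
        for path in paths
        if MIN_SECONDS <= clip_duration_seconds(path) <= MAX_SECONDS
    ]
```

Calling `usable_reference_clips` on a list of WAV paths returns the subset worth trying as voice samples; anything shorter or longer is dropped rather than trimmed, so you may prefer to cut long recordings down by hand.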
-
Hi all!
I have a lot of text that I turn into TTS via AllTalk. It's using xtts_v2, as I like the voices currently in there. I'm in new territory when it comes to training models, so I'll worry about that later. For some reason the current voice model likes to switch between multiple English accents while speaking the text. In one long-text mp3 it switches from standard English, to English with a Southern accent, to UK English, to Australian English. This is odd. Does anyone know why it's doing this, and is there a setting to NOT do this? It's pretty amusing though, I'll admit!
Secondly, in one particular stretch of text, the voice will randomly quirk out with strange sounds in between sentences. I can't figure out why it's doing that either, because there is nothing in the text that would cause it. The random outbursts are also pretty amusing. I'm not technically live with these yet, but I'm in the testing phase and can hopefully release what I'm doing to the general public soon!
Thanks for all the help and looking forward to contributing back where I can.
-MF