Helping French TTS to pronounce numbers #739
Conversation
…TTS to pronounce numbers of all kinds
services/espnet-tts-fr/src/app.py
Outdated
| def isfloat(num): | ||
| try: | ||
| float(num) | ||
| return True | ||
| except ValueError: | ||
| return False |
There was a problem hiding this comment.
I added this because there's (supposedly) no built-in function in Python to check if a number is a float.
services/espnet-tts-fr/src/app.py
Outdated
| logger.debug(segment.split()) | ||
| for s in segment.split(): | ||
| try: | ||
| logger.debug(f'Performing on: "{s}"') |
There was a problem hiding this comment.
This is going to get very chatty very quickly...
There was a problem hiding this comment.
Not merging right now, but do you anticipate this being useful going forward? I'm a little worried about the slim possibility of this leaking some personal info.
There was a problem hiding this comment.
I find this helpful to debug what component the TTS couldn't convert to numbers sometimes, but it can definitely be case specific (log in the except statement) rather than logging out everything.
You might be right... In that case, I would have to change the logs of multilang-support as well because it has been logging the whole segment out everytime, both in English and French 🫤
There was a problem hiding this comment.
Ah, yeah please do that too then!
|
@JRegimbal I just reformat the |
services/espnet-tts-fr/src/app.py
Outdated
| s = num2words(float(tempns), lang='fr') | ||
| elif "-" in s: | ||
| logger.debug("case 3: -") | ||
| logger.debug("Case: has a - as separator") |
There was a problem hiding this comment.
Hmm, this would likely fail for cases where two separators are used. E.g., "39,5-40,0". We could probably do something with regexps with the - separator as well. Get the fields we need excluding text with ([\d,.]+)-([\d,.]+), replace with de $1 à $2 and then apply another find and replace on $1 and $2 to replace occurrences of , with .. Does this make sense?
There was a problem hiding this comment.
Yes, it does. I will try them out.
services/espnet-tts-fr/src/app.py
Outdated
| s = num2words(s, lang='fr') | ||
| elif "," in s: | ||
| logger.debug(f"Case: '{s}' has , as separator") | ||
| logger.debug("Case: has a , as separator") |
There was a problem hiding this comment.
Hmm, does this still get run for text uses of ,?
There was a problem hiding this comment.
Could also catch with a regexp find replace ((\d+),(\d+) to $1.$2).
There was a problem hiding this comment.
It catches text with ,, but it won't perform any changes since on the next line, it couldn't cast the text to float.
I will look into regex and string processing more, thanks for the suggestion! I genuinely forgot about regex and went straight for the simpliest implementation.
|
|
@notkaramel you can make a new issue, but I'd put it in backlog for right now and bring it up tomorrow if you're still looking for something to do. There may be a higher priority issue. |
Changes Made
espnet-tts-fr) struggles with numerical numbers #723num2wordslibrary to help convert numbers in numeric form to text (e.g. "42" to "quarante deux")multilang-supporttranlated to Metropolitan French where they have a different notation system.Testings
Please ensure you've followed the checklist and provide all the required information before requesting a review.
If you do not have everything applicable to your PR, it will not be reviewed!
If you don't know what something is or if it applies to you, ask!
Don't delete below this line.
Required Information
Coding/Commit Requirements
New Component Checklist (mandatory for new microservices)
docker-compose.ymlandbuild.yml..github/workflows.