FLAC in browser #5

carbonrider · 2019-11-06T15:14:43Z

Hi,

Heads up: this question is about which approach gives better results.

We are facing some issues with French speech recognition (we are using Google AI services). The current process is as follows:

Record using MediaRecorder JS API --> Save as webm ---> Extract FLAC using FFMPEG --> Invoke Google Speech API

After looking at your library, it seems that you convert the audio data received from MediaRecorder to FLAC in the browser itself (which implies server-side processing is not required).

Now to the main point: do you think the choice between these two approaches will have an impact on transcription quality?

russaa · 2019-11-06T16:00:47Z

It probably depends on which audio codec is used in the webm file:
if it is a lossy codec (e.g. Opus or Vorbis), there could be some quality loss, depending on the compression/encoding parameters.

carbonrider · 2019-11-06T16:28:11Z

Hi @russaa
Thanks for your reply. Could you please provide a snippet or a reference URL that outlines the appropriate codec and compression configuration to pass to the MediaRecorder API in JS?

Sorry, this isn't relevant to your project, but I am looking for some insight on how to record good-quality speech that can be transcribed accurately.

PS: I have no issues with English speakers; it is non-English speakers/languages that cause problems.

russaa · 2019-11-06T17:41:49Z

I haven't really used MediaRecorder, but I would probably have a look at the mozilla-dev resources, e.g.
Web audio codec guide and The "codecs" parameter in common media types.

But you probably also need to check whether the specific browser supports the container format and codec you want to use, see MediaRecorder.isTypeSupported().
Some browsers may even support encoding directly to FLAC or some other lossless format (I have not tried this myself). You could then use a JavaScript library like libflac.js or flac.js as a fallback for browsers that do not "natively" support audio encoding to FLAC.
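The support check described above could be sketched roughly as follows. This is only an illustration: the candidate list and the helper name `pickRecorderMimeType` are assumptions, not something taken from MediaRecorder or libflac.js, and which types a given browser actually accepts has to be tested at runtime.

```javascript
// Candidate MIME types, ordered from lossless to lossy. The list is an
// assumption for illustration; support varies by browser and version.
const CANDIDATE_TYPES = [
  "audio/webm;codecs=pcm",
  "audio/ogg;codecs=flac",
  "audio/webm;codecs=opus",
  "audio/ogg;codecs=opus",
];

// Returns the first type accepted by `isSupported`, or null if none is.
// `isSupported` is injected so the function can also run outside a browser.
function pickRecorderMimeType(candidates, isSupported) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return null;
}

// In a browser (with `stream` obtained from getUserMedia):
// const mimeType = pickRecorderMimeType(CANDIDATE_TYPES, (t) => MediaRecorder.isTypeSupported(t));
// const recorder = mimeType
//   ? new MediaRecorder(stream, { mimeType })
//   : new MediaRecorder(stream); // let the browser choose its default
```

If no lossless type is accepted, the result is a lossy type or null, which is the case where a JavaScript FLAC encoder would serve as the fallback.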

For a comparison of various lossy codecs, the Opus website provides some information,
although the general recommendation is to avoid compressing to lossy formats in intermediate steps.

If you do not have bandwidth concerns, you could also send the raw WAV data to your server.

Also: maybe the French speech recognition of Google AI just generally delivers poorer results(?)
As a test, you could send the uncompressed WAV data to your server and check whether the recognition results improve.
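For the uncompressed WAV test, a minimal sketch of wrapping raw samples in a WAV container might look like this. It assumes mono audio already available as a Float32Array in the range -1 to 1 (for example captured through the Web Audio API) and writes 16-bit PCM; the function name `encodeWav` is hypothetical.

```javascript
// Wraps mono Float32 samples (range -1..1) in a 16-bit PCM WAV container.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = linear PCM
  view.setUint16(22, 1, true); // channel count (mono)
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample and scale it to the signed 16-bit range.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

The returned ArrayBuffer can be wrapped in a Blob and uploaded, which keeps the audio free of lossy compression on its way to the server.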

wingedrasengan927 · 2020-01-27T08:21:35Z

> Record using MediaRecorder JS API --> Save as webm ---> Extract FLAC using FFMPEG --> Invoke Google Speech API

Hello @carbonrider,
I've tried the same process with the webm mimeType, but was not able to invoke the Google Cloud Speech-to-Text API, as it said the audio is not supported.
So it would be really helpful if you could provide a link to the code. Thanks.

carbonrider · 2020-01-27T08:45:19Z

> Hello @carbonrider,
> I've tried the process as you with the webm mimType but was not able to invoke the google cloud speech-to-text api as it said the audio is not supported.
> So it would be really helpful if you could provide the code link. Thanks.

Hi @wingedrasengan927
You should invoke FFMPEG with the following arguments:
-vn -acodec flac -ac 1

This should give you a FLAC file that can be used with the Google Speech API.
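As a rough sketch, the full ffmpeg argument list around those flags could be assembled like this on the server. The helper name `buildFlacArgs` and the file names are hypothetical; only `-i`, `-vn`, `-acodec flac` and `-ac 1` are actual ffmpeg options.

```javascript
// Builds the ffmpeg arguments for extracting a mono FLAC file from a
// recorded webm file: -vn drops any video stream, -acodec flac selects the
// FLAC encoder, and -ac 1 downmixes to a single channel.
function buildFlacArgs(inputPath, outputPath) {
  return ["-i", inputPath, "-vn", "-acodec", "flac", "-ac", "1", outputPath];
}

// In Node.js the list can be passed to child_process.execFile, e.g.:
// execFile("ffmpeg", buildFlacArgs("recording.webm", "recording.flac"), callback);
```

Passing the arguments as an array rather than a single shell string avoids quoting problems with file names.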

Also, I would recommend recording the voice using a separate stream and using libflac.js to encode the audio stream. This approach really improves quality.

wingedrasengan927 · 2020-01-28T17:04:14Z

Thank you for the advice, @carbonrider. I will definitely try it out.

wingedrasengan927 · 2020-01-28T18:23:29Z

Tested it. Works like a charm. Thank you so much!

carbonrider · 2020-01-29T03:25:27Z

Awesome @wingedrasengan927

Vishav3 · 2021-07-19T19:41:56Z

> thank you for the advice @carbonrider. would definitely try it out.

I am stuck on a similar problem. Could you please share your working snippet?

luan-dang-techlabs · 2023-11-01T17:11:38Z

@carbonrider - can you tell me which ffmpeg build you are using?

luan-dang-techlabs · 2023-11-01T17:32:59Z

@wingedrasengan927 @carbonrider - do you both have sample code I can look at?
