Chore/small changes to wav2vec2 finetuning #54

saattrupdan · 2023-12-05T12:12:47Z

This PR implements the following:

Fixes an error related to decoding, where duplicate characters would be collapsed (e.g., "hotellet" -> "hotelet"). This was due to the removal of pad tokens during decoding, which we shouldn't do, since they act as CTC boundaries.
Small changes (config changes, some logging)

With this setup, we get quite close to reproducing the original setup. More precisely, we arrive at around 16 WER on Common Voice 9.0, which is not too far away from the SOTA 11 WER, and seems to mainly be due to overfitting. Here are the training plots:

Notably, I sample NST-da and CV9.0 equally, so we go through a lot of Common Voice samples in 120k steps! The sampling ratios should probably be changed, and the masking probabilities should probably be increased.

In any case, I think we're probably close enough to a reproduction to trust the framework and start training models on more data, in any case.

Oh, and I tried freezing the parameters, and also to increase the warmup time. Freezing the parameters didn't help at all, and the increased warmup time might have helped a little bit.

…cate characters

…characters_to_keep

sorenmulli · 2023-12-06T09:21:53Z

Awesome with these results! Good job making a much nicer ASR framework give good results :D
lol @ such good news are in a PR titled "chore/small changes ..."
Just request review when ready :)

…-100

…special tokens

…2SeqWithPadding

saattrupdan · 2023-12-13T12:20:23Z

Awesome with these results! Good job making a much nicer ASR framework give good results :D lol @ such good news are in a PR titled "chore/small changes ..." Just request review when ready :)

Alright, ready for review now. I'm currently optimising the training hyperparameters and I'll just leave the config change for another PR.

AJDERS

LGTM!

saattrupdan added 6 commits November 28, 2023 09:51

chore: Update configs

afdc325

style: Logging, kwargs

77bc20f

debug: Breakpoint

8a15b80

fix: Do not remove special tokens when decoding, as it prevents dupli…

26ccd56

…cate characters

chore: Remove breakpoint

309a60d

docs: Add note

bc07ea0

saattrupdan self-assigned this Dec 5, 2023

saattrupdan requested a review from sorenmulli December 5, 2023 12:12

saattrupdan added 3 commits December 5, 2023 16:05

docs: Always print sample predictions when computing metrics

e6f3f43

chore: Deal with word delimiters

5e20bd4

chore: Update configs

6478ee6

saattrupdan removed the request for review from sorenmulli December 5, 2023 16:15

fix: Do not hardcode max_seconds_per_example, and add | and space to …

ed4d97c

…characters_to_keep

saattrupdan added 10 commits December 7, 2023 13:00

fix: Ensure that we pad with pad_token when using a LM decoder

6270399

fix: Ensure that pad_token is chosen when all logits for a token are …

4c0d09a

…-100

fix: Do not add special tokens to vocab, as then they won't count as …

5d15643

…special tokens

fix: Update padding kwargs in Whisper analogous to Wav2Vec2

4818ec8

fix: Padding with a WhisperProcessor

7a9d4bb

fix: Add max_seconds_per_example to Whisper Processor

56d617c

chore: Change config

61d2b04

fix: Add max_seconds_per_example as argument to DataCollatorSpeechSeq…

2f7573b

…2SeqWithPadding

fix: Typo in config max_seconds_per_example

6430fd7

fix: Remove labels kwarg from Whisper tokenizer padding

6b80161

saattrupdan requested review from sorenmulli and AJDERS December 13, 2023 12:19

AJDERS approved these changes Dec 14, 2023

View reviewed changes

saattrupdan merged commit d9d09de into main Dec 14, 2023
5 checks passed

saattrupdan deleted the chore/small-changes-to-wav2vec2-finetuning branch December 14, 2023 10:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chore/small changes to wav2vec2 finetuning #54

Chore/small changes to wav2vec2 finetuning #54

saattrupdan commented Dec 5, 2023 •

edited

Loading

sorenmulli commented Dec 6, 2023 •

edited

Loading

saattrupdan commented Dec 13, 2023

AJDERS left a comment

Chore/small changes to wav2vec2 finetuning #54

Chore/small changes to wav2vec2 finetuning #54

Conversation

saattrupdan commented Dec 5, 2023 • edited Loading

sorenmulli commented Dec 6, 2023 • edited Loading

saattrupdan commented Dec 13, 2023

AJDERS left a comment

Choose a reason for hiding this comment

saattrupdan commented Dec 5, 2023 •

edited

Loading

sorenmulli commented Dec 6, 2023 •

edited

Loading