Chore/small changes to wav2vec2 finetuning #54

Merged
merged 20 commits into from
Dec 14, 2023


saattrupdan
Collaborator

@saattrupdan saattrupdan commented Dec 5, 2023

This PR implements the following:

  1. Fixes an error related to decoding, where duplicate characters would be collapsed (e.g., "hotellet" -> "hotelet"). This was due to the removal of pad tokens during decoding, which we shouldn't do, since they act as CTC boundaries.
  2. Small changes (config changes, some logging)
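To illustrate the decoding fix in item 1, here is a minimal sketch of greedy CTC decoding in plain Python. It is not the code from this PR: the toy vocabulary, the frame sequence, and both function names are assumptions made up for the example. It only shows why the pad (blank) token has to survive until after repeated tokens are collapsed.

```python
from itertools import groupby

# Hypothetical toy vocabulary; id 0 plays the role of the pad/blank token.
ID_TO_CHAR = {0: "<pad>", 1: "h", 2: "o", 3: "t", 4: "e", 5: "l"}
PAD_ID = 0


def ctc_decode(token_ids, id_to_char=ID_TO_CHAR, pad_id=PAD_ID):
    """Collapse consecutive repeats first, then drop the pad token."""
    collapsed = [token_id for token_id, _ in groupby(token_ids)]
    return "".join(id_to_char[t] for t in collapsed if t != pad_id)


def buggy_decode(token_ids, id_to_char=ID_TO_CHAR, pad_id=PAD_ID):
    """Drop the pad token first, which merges genuine double letters."""
    without_pad = [t for t in token_ids if t != pad_id]
    collapsed = [token_id for token_id, _ in groupby(without_pad)]
    return "".join(id_to_char[t] for t in collapsed)


# Frame-level predictions for "hotellet": h h o t e l <pad> l e t t
frames = [1, 1, 2, 3, 4, 5, 0, 5, 4, 3, 3]

print(ctc_decode(frames))    # hotellet
print(buggy_decode(frames))  # hotelet
```

The pad token between the two `l` frames is the only thing that separates a real double letter from one letter stretched over several frames, so removing it before collapsing loses that information.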

With this setup, we get quite close to reproducing the original results. More precisely, we arrive at around 16 WER on Common Voice 9.0, which is not too far from the SOTA of 11 WER; the remaining gap seems to be mainly due to overfitting. Here are the training plots:
[Three W&B training charts, exported 05_12_2023 at 13:11:29, 13:11:43, and 13:12:19]

Notably, I sample NST-da and CV9.0 equally, so we go through a lot of Common Voice samples in 120k steps! The sampling ratios should probably be changed, and the masking probabilities should probably be increased.
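As a rough sketch of what equal sampling implies, the snippet below interleaves two toy datasets with configurable probabilities. The dataset sizes, the key names, and the `interleave` helper are all hypothetical and are not taken from the project's data loading code; the point is only that the smaller dataset gets cycled through many more times than the larger one.

```python
import itertools
import random
from collections import Counter


def interleave(datasets, probabilities, num_samples, seed=0):
    """Yield (dataset_name, sample) pairs, picking the source at random.

    Each dataset is cycled, so a small dataset is simply repeated.
    """
    rng = random.Random(seed)
    names = list(datasets)
    weights = [probabilities[name] for name in names]
    streams = {name: itertools.cycle(datasets[name]) for name in names}
    for _ in range(num_samples):
        name = rng.choices(names, weights=weights, k=1)[0]
        yield name, next(streams[name])


# Toy sizes only: a "large" and a "small" dataset with a 10:1 size ratio.
toy_datasets = {"nst_da": range(1000), "cv9": range(100)}
equal = {"nst_da": 0.5, "cv9": 0.5}

draws = Counter(name for name, _ in interleave(toy_datasets, equal, 2000))
for name, count in draws.items():
    epochs = count / len(toy_datasets[name])
    print(f"{name}: {count} draws, about {epochs:.1f} passes over the data")
```

Lowering the probability of the smaller dataset reduces how often each of its samples is repeated, which is one way to act on the suggestion to change the sampling ratios.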

In any case, I think we're probably close enough to a reproduction to trust the framework and start training models on more data.

Oh, and I also tried freezing the parameters and increasing the warmup time. Freezing the parameters didn't help at all, and the increased warmup time might have helped a little bit.
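To make "warmup time" concrete, here is a small sketch that assumes a linear warmup to a peak learning rate followed by a constant rate. The PR does not say which schedule is actually used, so the schedule shape, the function name, and all numbers below are assumptions for illustration only.

```python
def lr_at_step(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Learning rate under an assumed linear warmup, then constant."""
    if warmup_steps <= 0:
        return peak_lr
    return peak_lr * min(1.0, step / warmup_steps)


# Hypothetical values: compare a short and a long warmup at the same step.
peak_lr = 1e-4
for warmup_steps in (1_000, 10_000):
    lr = lr_at_step(step=500, peak_lr=peak_lr, warmup_steps=warmup_steps)
    print(f"warmup_steps={warmup_steps}: lr at step 500 is {lr:.2e}")
```

A longer warmup keeps the learning rate lower for more of the early steps, which is the knob being varied in the experiment described above.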

@saattrupdan saattrupdan self-assigned this Dec 5, 2023
@saattrupdan saattrupdan removed the request for review from sorenmulli December 5, 2023 16:15
@sorenmulli
Collaborator

sorenmulli commented Dec 6, 2023

Awesome with these results! Good job making a much nicer ASR framework give good results :D
lol @ such good news are in a PR titled "chore/small changes ..."
Just request review when ready :)

@saattrupdan
Collaborator Author

> Awesome with these results! Good job making a much nicer ASR framework give good results :D
> lol @ such good news are in a PR titled "chore/small changes ..."
> Just request review when ready :)

Alright, ready for review now. I'm currently optimising the training hyperparameters and I'll just leave the config change for another PR.

Contributor

@AJDERS AJDERS left a comment

LGTM!

@saattrupdan saattrupdan merged commit d9d09de into main Dec 14, 2023
5 checks passed
@saattrupdan saattrupdan deleted the chore/small-changes-to-wav2vec2-finetuning branch December 14, 2023 10:22