This repository is an example of a fairseq extension for the NLP2 course.
Please follow the fairseq installation instructions.
- Get access to project drive (ask your TA)
- Read project description pdf NLP2_ET_2023.pdf and suggested papers
- Follow the GroupC-fairseq notebook to get familiar with running model training and evaluation
- The training data is in the `iwslt14` folder
- A pretrained checkpoint is available as `checkpoint_best.pt`
- If you are not familiar with NMT, you can read https://evgeniia.tokarch.uk/blog/neural-machine-translation/
- Some notes on extending fairseq: https://evgeniia.tokarch.uk/blog/extending-fairseq-incomplete-guide/
- Objective implementation:
  - Check `rl_criterion.py` in the `criterion` folder; it gives you a hint for how to start working on your objective
  - A nice explanation of RL for NMT: https://www.cl.uni-heidelberg.de/statnlpgroup/blog/rl4nmt/
  - You can pick any metric and import any library of your choice
- Run the training with your new objective function:
  - You can start from fine-tuning to get better/faster results: find `checkpoint_best.pt` in the drive
  - Set `criterion._name` to the name of your implemented criterion
  - It's enough to fine-tune for <1k steps
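A fine-tuning run along the lines above might look like the following. This is a sketch, not a tested command: the data path, architecture, learning rate, and the criterion name `my_rl_criterion` are placeholders you should replace with the values used in the course notebook.

```shell
# Fine-tune the pretrained checkpoint with a custom criterion.
# All paths and the criterion name are placeholders for your setup.
fairseq-train data-bin/iwslt14.tokenized.de-en \
    --arch transformer_iwslt_de_en \
    --restore-file checkpoint_best.pt \
    --reset-optimizer --reset-dataloader --reset-meters \
    --criterion my_rl_criterion \
    --optimizer adam --lr 5e-5 \
    --max-tokens 4096 \
    --max-update 1000 \
    --user-dir .
```

`--reset-optimizer` is needed because the checkpoint was trained with a different criterion, and `--max-update 1000` matches the "<1k steps" guidance above; `--user-dir` points fairseq at the folder containing your registered criterion.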