Telegram Bot for English-Arabic Neural Machine Translation
The bot can be found here -> https://t.me/english_arabic_translator_bot
It is deployed on Oracle Cloud.
The English-Arabic portion of the OpenSubtitles dataset is used to train the Seq2Seq model (link to download).
To download the dataset and preprocess it (removing extra characters and cleaning up the data), run:

`python data/get_dataset.py --sample_size 5000000 --max_text_len 150`
Tokenization is performed with the YouTokenToMe BPE tokenizer.
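A minimal sketch of training and applying the tokenizer (the file paths and vocabulary size are assumptions, not the repo's actual settings):

```python
import youtokentome as yttm

# Train a BPE model on the preprocessed corpus
# (path and vocab size are illustrative).
yttm.BPE.train(data="data/train.txt", vocab_size=32000, model="bpe.model")

# Load the trained model and encode text into subword ids.
bpe = yttm.BPE(model="bpe.model")
ids = bpe.encode(["how are you?"], output_type=yttm.OutputType.ID)
print(bpe.decode(ids))  # round-trip back to the original text
```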
The Transformer is implemented in PyTorch with a 6-layer encoder and decoder, 8 attention heads per layer, and Glorot-initialized parameters.
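A minimal sketch of this configuration using PyTorch's built-in `nn.Transformer` (the vocabulary size and model dimension are assumptions, and positional encodings and masking are omitted for brevity; the repo's own implementation may differ):

```python
import torch
import torch.nn as nn

class TranslationTransformer(nn.Module):
    """Sketch of the architecture described above; positional
    encodings and attention masks are omitted for brevity."""

    def __init__(self, vocab_size: int, d_model: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=8,               # 8 attention heads per layer
            num_encoder_layers=6,  # 6-layer encoder
            num_decoder_layers=6,  # 6-layer decoder
        )
        self.generator = nn.Linear(d_model, vocab_size)
        # Glorot (Xavier) uniform initialization of all weight matrices.
        for p in self.parameters():
            if p.dim() > 1:
                nn.init.xavier_uniform_(p)

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # src, tgt: (seq_len, batch) tensors of token ids.
        return self.generator(self.transformer(self.embed(src), self.embed(tgt)))
```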
References
- Vaswani et al., Attention Is All You Need (paper)
- Glorot and Bengio, Understanding the difficulty of training deep feedforward neural networks (paper)
For training, a learning rate of 0.00005 is used with warm-up over the first 30,000 iterations.
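A minimal sketch of such a schedule with a linear warm-up via `LambdaLR` (the dummy parameter is a placeholder for the Transformer's parameters, and whether the rate decays after warm-up is not specified, so it is held constant here):

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

WARMUP_STEPS = 30_000

# Stand-in parameter; in the repo this would be the model's parameters.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = Adam(params, lr=5e-5)

# Ramp the learning rate linearly from 0 to 5e-5 over the first
# 30,000 iterations, then hold it constant.
scheduler = LambdaLR(optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / WARMUP_STEPS))

for step in range(5):
    optimizer.step()                # one training step would go here
    scheduler.step()
    print(scheduler.get_last_lr())  # learning rate after warm-up scaling
```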
The head-pruning method of Voita et al. (paper) is implemented in PyTorch.
Two attention-head pruning experiments were carried out: λ = 0.05 (experiment_1) and λ = 0.01 (experiment_2). With λ = 0.05, 91 heads were retained; with λ = 0.01, 89 heads were retained.
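In this method, each attention head's output is multiplied by a stochastic gate drawn from the hard concrete distribution (Louizos et al.), and λ scales the L0 penalty that pushes gates toward zero. A minimal sketch of such a gate (the parameter values follow the hard concrete paper's conventions; the class layout is an assumption, not the repo's code):

```python
import math
import torch
import torch.nn as nn

class HardConcreteGate(nn.Module):
    """One relaxed-L0 gate per attention head, in the spirit of
    Voita et al. / Louizos et al. Each head's output is multiplied
    by its gate value during training."""

    def __init__(self, n_heads: int, beta: float = 2 / 3,
                 gamma: float = -0.1, zeta: float = 1.1):
        super().__init__()
        self.log_alpha = nn.Parameter(torch.zeros(n_heads))  # gate logits
        self.beta, self.gamma, self.zeta = beta, gamma, zeta

    def forward(self) -> torch.Tensor:
        if self.training:
            # Sample from the concrete distribution with noise u ~ U(0, 1).
            u = torch.rand_like(self.log_alpha).clamp(1e-6, 1 - 1e-6)
            s = torch.sigmoid(
                (torch.log(u) - torch.log(1 - u) + self.log_alpha) / self.beta
            )
        else:
            s = torch.sigmoid(self.log_alpha)
        # Stretch to (gamma, zeta) and clamp to [0, 1]: the "hard" concrete.
        return torch.clamp(s * (self.zeta - self.gamma) + self.gamma, 0.0, 1.0)

    def l0_penalty(self) -> torch.Tensor:
        # Expected number of non-zero gates; added to the training loss
        # as lambda * l0_penalty().
        return torch.sigmoid(
            self.log_alpha - self.beta * math.log(-self.gamma / self.zeta)
        ).sum()
```

Adding λ · `l0_penalty()` to the translation loss drives unneeded gates to zero, and heads whose gates reach zero can be removed entirely at inference time.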
References
- https://github.com/lena-voita/the-story-of-heads
- Michel et al., Are Sixteen Heads Really Better than One? (paper)
- Voita et al., Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned (paper)
- Louizos et al., Learning Sparse Neural Networks through L0 Regularization (paper)