English to Japanese Translator with PyTorch (Transformer from scratch)
- English to Japanese translator built with PyTorch.
- The neural network architecture is the Transformer.
- The Transformer layers are implemented from scratch in PyTorch (you can find them under layers/transformer/).
- The parallel corpus (dataset) is KFTT.
The Transformer is a neural network model proposed in the paper "Attention Is All You Need".

As the paper's title suggests, the Transformer is built on the attention mechanism; unlike RNNs and LSTMs, it does not rely on recurrent computation during training.

Many of the models that have achieved high accuracy on various NLP tasks in recent years, such as BERT, GPT-3, and XLNet, are built on the Transformer architecture.
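The attention mechanism at the heart of the Transformer is scaled dot-product attention, softmax(QKᵀ/√d_k)V. The repo implements it under layers/transformer/ScaledDotProductAttention.py; the minimal pure-Python sketch below (function names are illustrative, not the repo's actual API) shows the formula itself:

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: lists of row vectors. Computes softmax(Q K^T / sqrt(d_k)) V.
    d_k = len(K[0])
    out = []
    for q in Q:
        # similarity of this query against every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # weighted sum of the value vectors
        row = [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        out.append(row)
    return out
```

In the real layers this is done as batched matrix multiplications on tensors (with masking for the decoder), but the arithmetic is the same.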
Install dependencies & create a virtual environment in the project by running:
$ poetry install
Set PYTHONPATH:
export PYTHONPATH="$(pwd)"
Download & unzip the parallel corpus (KFTT) by running:
$ poetry run python ./utils/download.py
The directory structure is as follows.
.
├── const
│   └── path.py
├── corpus
│   └── kftt-data-1.0
├── figure
├── layers
│   └── transformer
│       ├── Embedding.py
│       ├── FFN.py
│       ├── MultiHeadAttention.py
│       ├── PositionalEncoding.py
│       ├── ScaledDotProductAttention.py
│       ├── TransformerDecoder.py
│       └── TransformerEncoder.py
├── models
│   ├── Transformer.py
│   └── __init__.py
├── mypy.ini
├── pickles
│   └── nn/
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── tests
│   ├── conftest.py
│   ├── layers/
│   ├── models/
│   └── utils/
├── train.py
└── utils
    ├── dataset/
    ├── download.py
    ├── evaluation/
    └── text/
You can train the model by running:
$ poetry run python train.py
epoch: 1
--------------------Train--------------------
train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403
train loss: 9.551202774047852, bleu score: 0.0,iter: 2/4403
train loss: 8.950608253479004, bleu score: 0.0,iter: 3/4403
train loss: 8.688143730163574, bleu score: 0.0,iter: 4/4403
train loss: 8.4220552444458, bleu score: 0.0,iter: 5/4403
train loss: 8.243291854858398, bleu score: 0.0,iter: 6/4403
train loss: 8.187620162963867, bleu score: 0.0,iter: 7/4403
train loss: 7.6360859870910645, bleu score: 0.0,iter: 8/4403
....
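The "bleu score" column in the log measures n-gram overlap between the model's output and the reference translation. The repo's own implementation presumably lives under utils/evaluation/; the sketch below is a deliberately simplified sentence-level BLEU (clipped n-gram precision up to bigrams, geometric mean, brevity penalty, no smoothing) just to illustrate what the metric computes:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # all contiguous n-grams of a token list
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def simple_bleu(candidate, reference, max_n=2):
    # clipped n-gram precisions for n = 1..max_n
    precisions = []
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        overlap = sum(min(c, ref[g]) for g, c in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # any zero precision collapses the geometric mean
    geo = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty: punish candidates shorter than the reference
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * geo
```

Early in training the model emits near-random tokens, so no n-grams match the reference and the score stays at 0.0, exactly as the log above shows.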
- After each epoch, the model at that point is saved under pickles/nn/.
- When training is finished, loss.png is saved under figure/.
