
English to Japanese Translator by PyTorch 🙊 (Transformer from scratch)

Overview

  • An English to Japanese translator built with PyTorch.
  • The neural network architecture is the Transformer.
  • The Transformer layers are implemented from scratch in PyTorch (you can find them under layers/transformer/).
  • The parallel corpus (dataset) is KFTT.

Transformer


  • The Transformer is a neural network model proposed in the paper "Attention Is All You Need".

  • As the paper's title suggests, the Transformer is built on the attention mechanism. Unlike RNNs and LSTMs, it does not rely on recurrent computation during training, so it parallelizes well across sequence positions.

  • Many of the models that have achieved high accuracy on various NLP tasks in recent years, such as BERT, GPT-3, and XLNet, are built on a Transformer-based architecture.
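The attention mechanism at the core of the model can be sketched in a few lines of PyTorch. This is a minimal illustration of scaled dot-product attention, not the repository's actual `ScaledDotProductAttention.py` (the function name and signature here are hypothetical):

```python
import math

import torch
import torch.nn.functional as F


def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (batch, seq_len, d_k).
    mask: optional tensor broadcastable to (batch, seq_len, seq_len);
          positions where mask == 0 are excluded from attention.
    """
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = torch.matmul(q, k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Each row of weights sums to 1 over the key positions.
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, v), weights
```

Multi-head attention runs several of these in parallel over learned projections of Q, K, and V and concatenates the results.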

Requirements

  • Poetry (dependencies are managed with Poetry; see pyproject.toml)

Setup

Install dependencies and create a virtual environment in the project by running:

$ poetry install

Set PYTHONPATH:

export PYTHONPATH="$(pwd)"

Download and unzip the parallel corpus (KFTT) by running:

$ poetry run python ./utils/download.py

Directories

The directory structure is as follows.

.
├── const
│   └── path.py
├── corpus
│   └── kftt-data-1.0
├── figure
├── layers
│   └── transformer
│       ├── Embedding.py
│       ├── FFN.py
│       ├── MultiHeadAttention.py
│       ├── PositionalEncoding.py
│       ├── ScaledDotProductAttention.py
│       ├── TransformerDecoder.py
│       └── TransformerEncoder.py
├── models
│   ├── Transformer.py
│   └── __init__.py
├── mypy.ini
├── pickles
│   └── nn/
├── poetry.lock
├── poetry.toml
├── pyproject.toml
├── tests
│   ├── conftest.py
│   ├── layers/
│   ├── models/
│   └── utils/
├── train.py
└── utils
    ├── dataset/
    ├── download.py
    ├── evaluation/
    └── text/
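Among the layers listed above, `PositionalEncoding.py` is the one that injects word-order information, since attention itself is order-agnostic. A minimal sketch of the sinusoidal encoding from "Attention Is All You Need" (a standalone function, not the repository's actual module) might look like:

```python
import math

import torch


def positional_encoding(max_len: int, d_model: int) -> torch.Tensor:
    """Sinusoidal positional encodings of shape (max_len, d_model).

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    position = torch.arange(max_len).unsqueeze(1).float()  # (max_len, 1)
    # Frequencies for each pair of dimensions, computed in log space.
    div_term = torch.exp(
        torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe
```

The resulting tensor is simply added to the token embeddings before the first encoder or decoder layer.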

How to run

You can train the model by running:

$ poetry run python train.py

epoch: 1
--------------------Train--------------------
train loss: 10.104473114013672, bleu score: 0.0,iter: 1/4403
train loss: 9.551202774047852, bleu score: 0.0,iter: 2/4403
train loss: 8.950608253479004, bleu score: 0.0,iter: 3/4403
train loss: 8.688143730163574, bleu score: 0.0,iter: 4/4403
train loss: 8.4220552444458, bleu score: 0.0,iter: 5/4403
train loss: 8.243291854858398, bleu score: 0.0,iter: 6/4403
train loss: 8.187620162963867, bleu score: 0.0,iter: 7/4403
train loss: 7.6360859870910645, bleu score: 0.0,iter: 8/4403
....
  • After each epoch, the model at that point is saved under pickles/nn/.
  • When training finishes, loss.png is saved under figure/.

Reference

License

MIT
