# Attention-is-all-you-need

A PyTorch implementation of the Transformer architecture presented in the paper "Attention Is All You Need" by Vaswani et al.
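The core building block of the Transformer is scaled dot-product attention. As a rough sketch of what this implementation centers on (the function below is an illustrative PyTorch version, not necessarily the exact code in this repo):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (batch, heads, seq_len, d_k).
    mask: optional boolean/int tensor; positions where mask == 0 are blocked.
    """
    d_k = q.size(-1)
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Rows of attn sum to 1; they weight the value vectors
    attn = scores.softmax(dim=-1)
    return attn @ v, attn

# Example: batch of 2, 8 heads, sequence length 10, head dimension 64
q = k = v = torch.randn(2, 8, 10, 64)
out, attn = scaled_dot_product_attention(q, k, v)
```

The output has the same shape as `q`, and each row of `attn` is a probability distribution over the key positions.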

Implemented by following Umar Jamil's video tutorial on Transformers (YouTube link).

## Dataset

The dataset used in this implementation is the English–Hindi parallel corpus cfilt/iitb-english-hindi, available on the Hugging Face Hub.
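Datasets on the Hub in the standard translation format expose each example as a dict with a `"translation"` field mapping language codes to sentences; assuming cfilt/iitb-english-hindi follows this convention, a source/target pair would be accessed like so (the sentences below are illustrative placeholders, not real corpus entries):

```python
# Loading the corpus (requires network access) would look like:
#   from datasets import load_dataset
#   raw = load_dataset("cfilt/iitb-english-hindi", split="train")
#   sample = raw[0]
# A single example then has this shape:
sample = {"translation": {"en": "Hello, world.", "hi": "नमस्ते दुनिया।"}}

src_text = sample["translation"]["en"]  # English source sentence
tgt_text = sample["translation"]["hi"]  # Hindi target sentence
```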

## References

- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. arXiv preprint arXiv:1706.03762.

- Kunchukuttan, A., Mehta, P., & Bhattacharyya, P. (2018). The IIT Bombay English-Hindi Parallel Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan: European Language Resources Association (ELRA). Paper link.