# Attention-is-all-you-need

A PyTorch implementation of the Transformer architecture presented in the paper "Attention Is All You Need" by Vaswani et al.
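The core building block of the Transformer is scaled dot-product attention. As a rough sketch of what this implementation centers on (the function below is an illustrative PyTorch version, not necessarily the exact code in this repo):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    q, k, v: tensors of shape (batch, heads, seq_len, d_k).
    mask: optional boolean/int tensor; positions where mask == 0 are blocked.
    """
    d_k = q.size(-1)
    # Similarity scores between queries and keys, scaled by sqrt(d_k)
    scores = (q @ k.transpose(-2, -1)) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Rows of attn sum to 1; they weight the value vectors
    attn = scores.softmax(dim=-1)
    return attn @ v, attn

# Example: batch of 2, 8 heads, sequence length 10, head dimension 64
q = k = v = torch.randn(2, 8, 10, 64)
out, attn = scaled_dot_product_attention(q, k, v)
```

The output has the same shape as `q`, and each row of `attn` is a probability distribution over the key positions.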

Implemented by following Umar Jamil's video tutorial on Transformers (YouTube link).

## Dataset

The dataset used in this implementation is the English–Hindi parallel corpus cfilt/iitb-english-hindi, available on the Hugging Face Hub.
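Datasets on the Hub in the standard translation format expose each example as a dict with a `"translation"` field mapping language codes to sentences; assuming cfilt/iitb-english-hindi follows this convention, a source/target pair would be accessed like so (the sentences below are illustrative placeholders, not real corpus entries):

```python
# Loading the corpus (requires network access) would look like:
#   from datasets import load_dataset
#   raw = load_dataset("cfilt/iitb-english-hindi", split="train")
#   sample = raw[0]
# A single example then has this shape:
sample = {"translation": {"en": "Hello, world.", "hi": "नमस्ते दुनिया।"}}

src_text = sample["translation"]["en"]  # English source sentence
tgt_text = sample["translation"]["hi"]  # Hindi target sentence
```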

## References

- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. arXiv preprint arXiv:1706.03762.

- Kunchukuttan, A., Mehta, P., & Bhattacharyya, P. (2018). The IIT Bombay English-Hindi Parallel Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan: European Language Resources Association (ELRA). Paper link.