
Implement Full Transformer Model in Rust #7

Merged: 5 commits merged into main from Transformer/Model on Dec 19, 2024

Conversation

@JakubSchwenkbeck (Owner) commented on Dec 19, 2024:


Resolves #4 (Integrate Layers into an Encoder/Decoder).

Description

This pull request introduces a complete implementation of a Transformer model based on the "Attention Is All You Need" paper. The model includes the following components (a minimal Rust sketch of the forward pass follows the list):

  1. Separate Encoder and Decoder Inputs: handles two distinct input sequences, one for the encoder and one for the decoder.
  2. Embedding Layer: converts tokenized input sequences into dense vector embeddings.
  3. Transformer Encoder: processes the encoder input to produce encoded representations.
  4. Transformer Decoder: uses the encoded representations together with the decoder input to generate context-aware outputs.
  5. Output Projection and Softmax: projects the decoder output into vocabulary space and applies softmax to produce token probabilities.
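For illustration, here is a minimal, self-contained sketch of this forward pass using the `ndarray` crate. The struct and function names (`Embedding`, `encode`, `decode`, `attention`) are hypothetical and not taken from this repository's API; residual connections, layer normalization, feed-forward sublayers, positional encodings, masking, and multi-head splitting are all omitted for brevity.

```rust
// Hypothetical sketch of the end-to-end forward pass described above.
// Names and shapes are illustrative, not this repository's actual API.
use ndarray::{stack, Array2, Axis};

/// Step 2: token IDs -> dense vectors, shape (seq_len, d_model).
struct Embedding {
    weights: Array2<f32>, // (vocab_size, d_model)
}

impl Embedding {
    fn forward(&self, tokens: &[usize]) -> Array2<f32> {
        // Gather one embedding row per token.
        let rows: Vec<_> = tokens.iter().map(|&t| self.weights.row(t)).collect();
        stack(Axis(0), &rows).expect("embedding rows have equal width")
    }
}

/// Row-wise softmax, used both inside attention and for step 5.
fn softmax(logits: &Array2<f32>) -> Array2<f32> {
    let mut out = logits.clone();
    for mut row in out.rows_mut() {
        // Subtract the row max for numerical stability before exponentiating.
        let max = row.fold(f32::NEG_INFINITY, |a, &b| a.max(b));
        row.mapv_inplace(|x| (x - max).exp());
        let sum = row.sum();
        row.mapv_inplace(|x| x / sum);
    }
    out
}

/// Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
fn attention(q: &Array2<f32>, k: &Array2<f32>, v: &Array2<f32>) -> Array2<f32> {
    let d_k = k.ncols() as f32;
    softmax(&(q.dot(&k.t()) / d_k.sqrt())).dot(v)
}

/// Step 3: self-attention over the encoder input.
fn encode(x: &Array2<f32>) -> Array2<f32> {
    attention(x, x, x)
}

/// Step 4: decoder self-attention, then cross-attention over the encoder
/// output (queries from the decoder, keys/values from the encoder).
fn decode(y: &Array2<f32>, enc_out: &Array2<f32>) -> Array2<f32> {
    let self_attended = attention(y, y, y);
    attention(&self_attended, enc_out, enc_out)
}

fn main() {
    let (vocab, d_model) = (16, 8);
    // Constant toy weights; a real model uses learned parameters.
    let embed = Embedding {
        weights: Array2::from_elem((vocab, d_model), 0.01),
    };

    // Step 1: two distinct input sequences.
    let enc_tokens = [1usize, 4, 2];
    let dec_tokens = [3usize, 5];

    let enc_out = encode(&embed.forward(&enc_tokens)); // step 3
    let dec_out = decode(&embed.forward(&dec_tokens), &enc_out); // step 4

    // Step 5: project into vocabulary space and normalize.
    let w_out = Array2::from_elem((d_model, vocab), 0.02);
    let probs = softmax(&dec_out.dot(&w_out));
    println!("next-token distribution: {:?}", probs.row(probs.nrows() - 1));
}
```

With the constant toy weights the printed distribution is uniform; the sketch is only meant to show how the five components compose into one encoder-decoder pipeline.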

@JakubSchwenkbeck (Owner, Author) commented:

MERGE ON WORKING TESTS

@JakubSchwenkbeck merged commit 23f9866 into main on Dec 19, 2024.
1 check passed.
@JakubSchwenkbeck deleted the Transformer/Model branch on December 19, 2024 at 21:10.