
Transformer Embedding Enhancements #6

Merged
merged 8 commits into main from embeddings on Dec 19, 2024

Conversation

JakubSchwenkbeck
Owner

Transformer Embedding Enhancements

This PR introduces the following changes related to embeddings in the transformer model:

Key Changes:

1. Retrieving Tokens from Decoded Embeddings

  • Implemented the retrieve_tokens method, which computes the cosine similarity between each decoded embedding and the rows of the embedding weight matrix and returns the closest matching token. This lets us convert the final embeddings back into their corresponding tokens.
pub fn retrieve_tokens(
    &self,
    decoded_embeddings: Array2<f32>,
    vocab: &HashMap<String, usize>,
) -> Vec<String>
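
A minimal sketch of how the cosine-similarity lookup could work, assuming a hypothetical Embedding struct with a weights matrix of shape (vocab_size, embedding_dim); the struct name, the weights field, and the <unk> fallback are placeholders, not necessarily what this PR uses:

```rust
use ndarray::{Array2, ArrayView1};
use std::collections::HashMap;

/// Hypothetical embedding type; the PR's own struct holds the weight matrix
/// (vocab_size x embedding_dim) under some field, assumed here to be `weights`.
pub struct Embedding {
    pub weights: Array2<f32>,
}

/// Cosine similarity between two vectors (0.0 if either has zero norm).
fn cosine_similarity(a: ArrayView1<f32>, b: ArrayView1<f32>) -> f32 {
    let norm = a.dot(&a).sqrt() * b.dot(&b).sqrt();
    if norm == 0.0 { 0.0 } else { a.dot(&b) / norm }
}

impl Embedding {
    pub fn retrieve_tokens(
        &self,
        decoded_embeddings: Array2<f32>,
        vocab: &HashMap<String, usize>,
    ) -> Vec<String> {
        // Invert the vocabulary so a weight-row index maps back to its token.
        let index_to_token: HashMap<usize, &String> =
            vocab.iter().map(|(token, &idx)| (idx, token)).collect();

        decoded_embeddings
            .rows()
            .into_iter()
            .map(|embedding| {
                // Pick the vocabulary row with the highest cosine similarity.
                let (best_idx, _) = self
                    .weights
                    .rows()
                    .into_iter()
                    .enumerate()
                    .map(|(i, row)| (i, cosine_similarity(embedding, row)))
                    .fold((0, f32::MIN), |best, cand| {
                        if cand.1 > best.1 { cand } else { best }
                    });
                index_to_token
                    .get(&best_idx)
                    .map(|t| (*t).clone())
                    .unwrap_or_else(|| "<unk>".to_string())
            })
            .collect()
    }
}
```

Scanning all weight rows makes the decode step O(sequence_length × vocab_size × embedding_dim), which is acceptable for small vocabularies.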

2. Handling Probabilities for Token Prediction

  • Added functionality to handle the output probabilities from the model: for each position, the token with the highest probability is selected and returned, enabling token prediction from the model’s output.
pub fn predict_tokens(probabilities: ArrayView2<f32>, vocab: &HashMap<String, usize>) -> Vec<String> 
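
A minimal sketch of the argmax selection, again with the vocabulary inverted on the fly and an assumed <unk> fallback (both illustrative, not necessarily the PR’s behaviour):

```rust
use ndarray::ArrayView2;
use std::collections::HashMap;

pub fn predict_tokens(
    probabilities: ArrayView2<f32>,
    vocab: &HashMap<String, usize>,
) -> Vec<String> {
    // Invert the (token -> index) vocabulary so an argmax index maps to a token.
    let index_to_token: HashMap<usize, &String> =
        vocab.iter().map(|(token, &idx)| (idx, token)).collect();

    probabilities
        .rows()
        .into_iter()
        .map(|row| {
            // Argmax over the vocabulary dimension of this row.
            let (best_idx, _) = row
                .iter()
                .enumerate()
                .fold((0, f32::MIN), |best, (i, &p)| {
                    if p > best.1 { (i, p) } else { best }
                });
            index_to_token
                .get(&best_idx)
                .map(|t| (*t).clone())
                .unwrap_or_else(|| "<unk>".to_string())
        })
        .collect()
}
```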

3. Adding Positional Encoding

  • Introduced a sinusoidal positional encoding mechanism to represent the position of each token in a sequence. The positional encoding alternates between sine and cosine functions and is added to the token embeddings. This allows the model to capture the relative positions of tokens in the input sequence, an essential feature for transformer models.
pub fn sinusoidal_pos_encoding(pos: usize, index: usize, embedding_size: usize) -> f32 
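
The standard sinusoidal formulation from “Attention Is All You Need” fits this signature; a sketch of one way to implement it (the base constant 10000 and the even/odd convention are assumptions):

```rust
/// Sinusoidal positional encoding for a single (position, dimension) pair:
/// even dimensions use sine, odd dimensions use cosine, and each pair of
/// dimensions (2i, 2i + 1) shares the frequency 1 / 10000^(2i / embedding_size).
pub fn sinusoidal_pos_encoding(pos: usize, index: usize, embedding_size: usize) -> f32 {
    let exponent = (2 * (index / 2)) as f32 / embedding_size as f32;
    let angle = pos as f32 / 10000f32.powf(exponent);
    if index % 2 == 0 {
        angle.sin()
    } else {
        angle.cos()
    }
}
```

The returned value is added element-wise to the token embedding at position `pos`, dimension `index`.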

4. Embedding Initialization

  • The embedding initialization was adjusted so that each token in the vocabulary starts from a clearly distinguishable vector, which helps the model differentiate between tokens and can improve learning efficiency. A possible scheme is sketched below.
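
The PR description does not spell out the new initialization scheme, so the following is only an illustrative sketch of one deterministic, per-token scheme that yields pairwise-distinct rows; the function name and formula are placeholders:

```rust
use ndarray::Array2;

/// Illustrative only: give every vocabulary entry a distinct starting vector by
/// keying a sinusoid on the (token index, dimension) pair. The actual scheme
/// used in this PR may differ.
pub fn init_embeddings(vocab_size: usize, embedding_dim: usize) -> Array2<f32> {
    Array2::from_shape_fn((vocab_size, embedding_dim), |(token_idx, dim)| {
        // Offsetting the phase by the token index keeps rows pairwise distinct.
        ((token_idx as f32 + 1.0) * (dim as f32 + 1.0) * 0.01).sin()
    })
}
```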

Together, these changes improve the transformer model by enhancing its ability to predict tokens, encode token positions, and start from well-separated embeddings.

@JakubSchwenkbeck
Owner Author

MERGE ON SUCCESSFUL TEST

JakubSchwenkbeck merged commit cfcc03c into main on Dec 19, 2024
1 check passed
JakubSchwenkbeck deleted the embeddings branch on December 19, 2024, 21:10