
Transformer Embedding Enhancements #6

Merged
merged 8 commits into main from embeddings on Dec 19, 2024

Conversation

JakubSchwenkbeck
Owner

Transformer Embedding Enhancements

This PR introduces the following changes related to embeddings in the transformer model:

Key Changes:

1. Retrieving Tokens from Decoded Embeddings

  • Implemented the retrieve_tokens method, which computes the cosine similarity between each decoded embedding and the rows of the embedding weight matrix and returns the closest matching token. This lets us convert the final embeddings back into their corresponding tokens.
pub fn retrieve_tokens(
    &self,
    decoded_embeddings: Array2<f32>,
    vocab: &HashMap<String, usize>,
) -> Vec<String>
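
A minimal sketch of how the cosine-similarity lookup could work, assuming a hypothetical Embedding struct with a weights matrix of shape (vocab_size, embedding_dim); the struct name, the weights field, and the <unk> fallback are placeholders, not necessarily what this PR uses:

```rust
use ndarray::{Array2, ArrayView1};
use std::collections::HashMap;

/// Hypothetical embedding type; the PR's own struct holds the weight matrix
/// (vocab_size x embedding_dim) under some field, assumed here to be `weights`.
pub struct Embedding {
    pub weights: Array2<f32>,
}

/// Cosine similarity between two vectors (0.0 if either has zero norm).
fn cosine_similarity(a: ArrayView1<f32>, b: ArrayView1<f32>) -> f32 {
    let norm = a.dot(&a).sqrt() * b.dot(&b).sqrt();
    if norm == 0.0 { 0.0 } else { a.dot(&b) / norm }
}

impl Embedding {
    pub fn retrieve_tokens(
        &self,
        decoded_embeddings: Array2<f32>,
        vocab: &HashMap<String, usize>,
    ) -> Vec<String> {
        // Invert the vocabulary so a weight-row index maps back to its token.
        let index_to_token: HashMap<usize, &String> =
            vocab.iter().map(|(token, &idx)| (idx, token)).collect();

        decoded_embeddings
            .rows()
            .into_iter()
            .map(|embedding| {
                // Pick the vocabulary row with the highest cosine similarity.
                let (best_idx, _) = self
                    .weights
                    .rows()
                    .into_iter()
                    .enumerate()
                    .map(|(i, row)| (i, cosine_similarity(embedding, row)))
                    .fold((0, f32::MIN), |best, cand| {
                        if cand.1 > best.1 { cand } else { best }
                    });
                index_to_token
                    .get(&best_idx)
                    .map(|t| (*t).clone())
                    .unwrap_or_else(|| "<unk>".to_string())
            })
            .collect()
    }
}
```

Scanning all weight rows makes the decode step O(sequence_length × vocab_size × embedding_dim), which is acceptable for small vocabularies.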

2. Handling Probabilities for Token Prediction

  • Added functionality to handle the output probabilities from the model: for each position, the token with the highest probability is selected and returned, enabling token prediction from the model’s output.
pub fn predict_tokens(probabilities: ArrayView2<f32>, vocab: &HashMap<String, usize>) -> Vec<String> 
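
A minimal sketch of the argmax selection, again with the vocabulary inverted on the fly and an assumed <unk> fallback (both illustrative, not necessarily the PR’s behaviour):

```rust
use ndarray::ArrayView2;
use std::collections::HashMap;

pub fn predict_tokens(
    probabilities: ArrayView2<f32>,
    vocab: &HashMap<String, usize>,
) -> Vec<String> {
    // Invert the (token -> index) vocabulary so an argmax index maps to a token.
    let index_to_token: HashMap<usize, &String> =
        vocab.iter().map(|(token, &idx)| (idx, token)).collect();

    probabilities
        .rows()
        .into_iter()
        .map(|row| {
            // Argmax over the vocabulary dimension of this row.
            let (best_idx, _) = row
                .iter()
                .enumerate()
                .fold((0, f32::MIN), |best, (i, &p)| {
                    if p > best.1 { (i, p) } else { best }
                });
            index_to_token
                .get(&best_idx)
                .map(|t| (*t).clone())
                .unwrap_or_else(|| "<unk>".to_string())
        })
        .collect()
}
```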

3. Adding Positional Encoding

  • Introduced a sinusoidal positional encoding mechanism to represent the position of each token in a sequence. The positional encoding alternates between sine and cosine functions and is added to the token embeddings. This allows the model to capture the relative positions of tokens in the input sequence, an essential feature for transformer models.
pub fn sinusoidal_pos_encoding(pos: usize, index: usize, embedding_size: usize) -> f32 
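
The standard sinusoidal formulation from “Attention Is All You Need” fits this signature; a sketch of one way to implement it (the base constant 10000 and the even/odd convention are assumptions):

```rust
/// Sinusoidal positional encoding for a single (position, dimension) pair:
/// even dimensions use sine, odd dimensions use cosine, and each pair of
/// dimensions (2i, 2i + 1) shares the frequency 1 / 10000^(2i / embedding_size).
pub fn sinusoidal_pos_encoding(pos: usize, index: usize, embedding_size: usize) -> f32 {
    let exponent = (2 * (index / 2)) as f32 / embedding_size as f32;
    let angle = pos as f32 / 10000f32.powf(exponent);
    if index % 2 == 0 {
        angle.sin()
    } else {
        angle.cos()
    }
}
```

The returned value is added element-wise to the token embedding at position `pos`, dimension `index`.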

4. Embedding Initialization

  • The embedding initialization was adjusted so that each token in the vocabulary starts from a clearly distinguishable vector, which helps the model differentiate between tokens and can improve learning efficiency. A possible scheme is sketched below.
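
The PR description does not spell out the new initialization scheme, so the following is only an illustrative sketch of one deterministic, per-token scheme that yields pairwise-distinct rows; the function name and formula are placeholders:

```rust
use ndarray::Array2;

/// Illustrative only: give every vocabulary entry a distinct starting vector by
/// keying a sinusoid on the (token index, dimension) pair. The actual scheme
/// used in this PR may differ.
pub fn init_embeddings(vocab_size: usize, embedding_dim: usize) -> Array2<f32> {
    Array2::from_shape_fn((vocab_size, embedding_dim), |(token_idx, dim)| {
        // Offsetting the phase by the token index keeps rows pairwise distinct.
        ((token_idx as f32 + 1.0) * (dim as f32 + 1.0) * 0.01).sin()
    })
}
```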

Together, these changes improve the transformer model by enhancing its ability to predict tokens, encode token positions, and start from well-separated embeddings.

@JakubSchwenkbeck
Owner Author

MERGE ON SUCCESSFUL TEST

JakubSchwenkbeck merged commit cfcc03c into main on Dec 19, 2024
1 check passed
JakubSchwenkbeck deleted the embeddings branch on December 19, 2024, 21:10