
French to English translation task notebook #12

Open

Gaurav7888 opened this issue Mar 24, 2023 · 3 comments

@Gaurav7888
https://colab.research.google.com/drive/14KegLD0ymq4vTRzCjUvP77w9l-IGCsnj?usp=sharing

@mlevans @tejasvicsr1

@ShubhamBhut

Hello! It appears that when the model is used to make predictions on external input, the tokenization differs from the one applied to the original input data, so the model cannot predict the correct output for input that does not come from the dataset. Even when the external text appears verbatim in the dataset, the prediction is still wrong.

Below is the code I used for an external prediction; the preprocessing is the same as for the input dataset tmp_x. I still got a wrong prediction, even though "bonjour" clearly appears multiple times in the training data.

text = ["Bonjour", "mon cheri"]
text[0]
preprocess_x, tk_x = tokenize(text)
preprocess_x[0]
tmp_text = pad(preprocess_x, preproc_french_sentences.shape[1])
tmp_text = tmp_text.reshape((-1, preproc_french_sentences.shape[-2])) 

logits_to_text(loaded_model.predict(tmp_text[[1]])[0], english_tokenizer)
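For comparison, here is a minimal sketch of what I would expect to work: reuse the tokenizer that was fitted on the training corpus instead of fitting a new one on the external text. `french_tokenizer` is a placeholder name for whatever the notebook calls the tokenizer behind preproc_french_sentences:

from keras.preprocessing.sequence import pad_sequences

text = ["Bonjour", "mon cheri"]

# Reuse the training-time word index instead of fitting a new Tokenizer,
# so "bonjour" maps to the same id it had during training
# (words the training tokenizer has never seen are simply dropped)
seqs = french_tokenizer.texts_to_sequences(text)

tmp_text = pad_sequences(seqs, maxlen=preproc_french_sentences.shape[1],
                         padding='post')
tmp_text = tmp_text.reshape((-1, preproc_french_sentences.shape[-2]))

# "Bonjour" is index 0 in this batch
logits_to_text(loaded_model.predict(tmp_text[[0]])[0], english_tokenizer)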

@Gaurav7888 (Author)

Yes, for that we could use a tokenizer from a pretrained LLM on Hugging Face, which should give better results.
I tried to build my own using the dataset.
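A rough sketch of that idea (the notebook's model would need to be trained on the same vocabulary for this to plug in directly, so treat it as an assumption rather than a drop-in fix):

from transformers import AutoTokenizer

# Tokenizer shipped with a pretrained French-to-English model on the Hub
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-fr-en")

# Consistent ids for any input text, no refitting needed
batch = tokenizer(["Bonjour", "mon cheri"], padding=True, return_tensors="np")
print(batch["input_ids"])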

@Tirth678 commented Jul 21, 2024

Hello, I hope your day is going great.
This is my first contribution, so apologies for any mistakes.

  1. You can try redefining 'tokenize' and 'pad' as:

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

def tokenize(sentences):
    # Fit a new Tokenizer on the given sentences and return the sequences
    tokenizer = Tokenizer()
    tokenizer.fit_on_texts(sentences)
    return tokenizer.texts_to_sequences(sentences), tokenizer

def pad(sequences, length):
    # Pad (or truncate) every sequence to the same length, padding at the end
    return pad_sequences(sequences, maxlen=length, padding='post')

  2. If this goes well, you can also try predicting on the whole reshaped input:

predictions = loaded_model.predict(tmp_text)
print("Predictions:", predictions)

I hope it serves you well.
