Hi @tomaarsen,
I am using the SBERT Trainer and specifically triplet pairs: [query, positive, negative].
Now I need to add some text to the query ([query + some_long_text, positive, negative]), which would make it longer than the max_seq_len, and I don't want it truncated.
I read somewhere that I can create an embedding for the some_long_text and pass that into model training. That seems odd to me, since I would be concatenating an embedding with plain text. I have also read a thread here saying that creating embeddings before feeding them into the model prevents the model from adjusting the pretrained weights. Is there a better way to do this?
Note that I am using MultipleNegativesRankingLoss.
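For context, a minimal sketch of the setup described above, assuming sentence-transformers v3; the checkpoint name and example texts are placeholders:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Placeholder checkpoint; substitute the actual base model being fine-tuned.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Triplet dataset: each row is (query + some_long_text, positive, negative).
# With a long appended text, the first column can exceed model.max_seq_length.
train_dataset = Dataset.from_dict({
    "anchor": ["query text, plus some long appended context ..."],
    "positive": ["a relevant passage"],
    "negative": ["an irrelevant passage"],
})

loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```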
I read somewhere that I can create an embedding for the some_long_text and pass that into model training.
I haven't heard about this yet.
I have also read a thread here saying that creating embeddings before feeding them into the model prevents the model from adjusting the pretrained weights. Is there a better way to do this?
I think this was likely referring to the case where you create all the embeddings before training and then reuse them: you're no longer iteratively updating the model weights, which is required to actually train a better model. There are a few reasons why that doesn't work, but in short, gradient descent no longer works.
I don't think there's a convenient way to avoid the truncation, I'm afraid.
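To make the truncation concrete: anything past model.max_seq_length is dropped at tokenization time, before the text ever reaches the transformer. A small sketch, again with a placeholder checkpoint:

```python
from sentence_transformers import SentenceTransformer

# Placeholder checkpoint for illustration.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
print(model.max_seq_length)  # e.g. 256 for this model

# A long "query + some_long_text" anchor is cut off here,
# so the extra tokens never contribute to the embedding or the loss.
features = model.tokenize(["query " + "some long appended context " * 200])
print(features["input_ids"].shape)  # sequence length is capped at max_seq_length
```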