Conversation

@sw241395 sw241395 commented Aug 4, 2025

In some of the eval methods, you can't pass args to the `encode` method, so I made some edits to the eval code to enable it.

This will be useful for models like jinaai/jina-embeddings-v3, where you pass in the LoRA adapter you want to use.

Let me know if you want me to change anything.
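For context, here is a minimal sketch of the pattern this PR is after: an evaluator forwarding extra keyword arguments through to `model.encode`. The evaluator class and the `encode_kwargs` plumbing are illustrative assumptions, not the library's actual API; the `task` kwarg mirrors how jina-embeddings-v3 selects a LoRA adapter, and the stub model just records what it receives.

```python
class StubModel:
    """Stand-in for a SentenceTransformer-like model; records encode() kwargs."""

    def encode(self, sentences, **kwargs):
        self.last_kwargs = kwargs
        # A real model would return embeddings; return dummy vectors here.
        return [[0.0] * 4 for _ in sentences]


class SimpleEvaluator:
    """Illustrative evaluator that forwards extra kwargs straight to encode()."""

    def __call__(self, model, sentences, **encode_kwargs):
        return model.encode(sentences, **encode_kwargs)


model = StubModel()
SimpleEvaluator()(model, ["hello world"], task="retrieval.query")
print(model.last_kwargs)  # {'task': 'retrieval.query'}
```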

@sw241395 (Author)

It seems like the tests are failing due to:

 WARNING  huggingface_hub.utils._http:_http.py:315 HTTP Error 429 thrown while requesting HEAD https://huggingface.co/sentence-transformers/average_word_embeddings_levy_dependency/resolve/main/0_WordEmbeddings/model.safetensors

Too many requests, I suspect, from the GitHub runners' IPs. I have run the tests locally and they seem to be OK.

A temporary workaround could be to include the model in the tests folder structure, then copy it into the appropriate .cache dir during test setup. But I understand this is quite hacky and not great practice.

@tomaarsen (Member)

Hello!

Thank you for opening this. Indeed, the test failures are due to rate limits from the various GitHub actions runners. It's unrelated to this PR.
I'm still considering how to best approach your proposal. I think there are two other considerations:

  1. Should additional kwargs be passed to __call__, or perhaps via the evaluator initialization instead?
  2. For the InformationRetrievalEvaluator, you might want different parameters for the queries than for the documents. This might mean adding two parameters instead?

Or perhaps we recognize that if a model is custom, then we don't necessarily have 100% compatibility with the rest of the library, and users of those custom models are expected to make the required changes on their side?
It's a bit tricky, I think.

  • Tom Aarsen
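The second consideration above could be sketched roughly like this: an InformationRetrievalEvaluator-style class that takes separate `encode()` kwargs for queries and for documents at initialization. All names here are illustrative assumptions, not the library's actual API, and the stub model only records what it is called with.

```python
class StubModel:
    """Stand-in for a SentenceTransformer-like model; records encode() kwargs."""

    def __init__(self):
        self.calls = []

    def encode(self, sentences, **kwargs):
        self.calls.append(kwargs)
        return [[0.0] * 4 for _ in sentences]


class IRStyleEvaluator:
    """Illustrative evaluator with separate kwargs for queries and documents."""

    def __init__(self, query_encode_kwargs=None, corpus_encode_kwargs=None):
        # Two separate dicts, since queries and documents may need different
        # parameters (e.g. different LoRA adapters / tasks).
        self.query_encode_kwargs = dict(query_encode_kwargs or {})
        self.corpus_encode_kwargs = dict(corpus_encode_kwargs or {})

    def __call__(self, model, queries, corpus):
        query_embs = model.encode(queries, **self.query_encode_kwargs)
        corpus_embs = model.encode(corpus, **self.corpus_encode_kwargs)
        return query_embs, corpus_embs


model = StubModel()
evaluator = IRStyleEvaluator(
    query_encode_kwargs={"task": "retrieval.query"},
    corpus_encode_kwargs={"task": "retrieval.passage"},
)
evaluator(model, ["a query"], ["a document"])
print(model.calls)  # [{'task': 'retrieval.query'}, {'task': 'retrieval.passage'}]
```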


sw241395 commented Sep 7, 2025

> Hello!
>
> Thank you for opening this. Indeed, the test failures are due to rate limits from the various GitHub actions runners. It's unrelated to this PR. I'm still considering how to best approach your proposal. I think there's two other considerations:
>
> 1. should additional kwargs be passed to `__call__`, or perhaps via the evaluator initialization instead?
>
> 2. For the [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator), you might want different parameters for the queries than for the documents. This might mean adding two parameters instead?
>
> Or perhaps we recognize that if a model is custom, then we don't necessarily have 100% compatibility with the rest of the library, and users of those custom models are expected to make the required changes on their side? It's a bit tricky, I think.
>
> * Tom Aarsen

Hey Tom

Yeah, I agree that custom models won't have 100% compatibility with the library.

I am happy to move the args to the init if you feel this is more appropriate. I've seen that some of the inits already have args that are used in the call, so it kind of makes sense to be consistent in that manner.

E.g. https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/evaluation/MSEEvaluator.py#L77

Unless you prefer to leave it in the call method, in which case I can go back and fix the InformationRetrievalEvaluator.

Thanks
SW
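The init-based alternative could look roughly like this: a hedged sketch (illustrative names, not the library's actual API) in which `encode()` arguments are stored on the evaluator at construction time and reused in `__call__`, mirroring how existing evaluators such as MSEEvaluator already keep options like a batch size in `__init__`.

```python
class StubModel:
    """Stand-in for a SentenceTransformer-like model; records encode() kwargs."""

    def encode(self, sentences, **kwargs):
        self.last_kwargs = kwargs
        return [[0.0] * 4 for _ in sentences]


class InitKwargsEvaluator:
    """Illustrative evaluator: encode() kwargs configured once, at init."""

    def __init__(self, encode_kwargs=None):
        self.encode_kwargs = dict(encode_kwargs or {})

    def __call__(self, model, sentences):
        # __call__ stays argument-free beyond its inputs; the extra
        # parameters come from the values stored at construction time.
        return model.encode(sentences, **self.encode_kwargs)


model = StubModel()
InitKwargsEvaluator({"task": "retrieval.passage"})(model, ["a document"])
print(model.last_kwargs)  # {'task': 'retrieval.passage'}
```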
