feat: add similarity search with distance score#125
feat: add similarity search with distance score#125juanfe88 wants to merge 6 commits intogoogleapis:mainfrom
Conversation
937a423 to
a1eb133
Compare
| query: List[float], | ||
| k: int = DEFAULT_TOP_K, | ||
| filters: Optional[BaseFilter] = None, | ||
| with_scores: Optional[bool] = False, |
There was a problem hiding this comment.
Thanks for adding the support!
Could we pass distance_result_field as parameter so that we could calculate the score based on any field (not limit to distance)?
There was a problem hiding this comment.
I left it as optional with a default value, but it can be overriden. I also added a test for these cases WDYT ? 😄
There was a problem hiding this comment.
Thanks! In this case, do we still need with_scores? Since distance_result_field is optional, with_scores field can be implicitly done by distance_result_field.
(ex: if distance_result_field is specified, return scores)
There was a problem hiding this comment.
You're right. I just refactored it so that we can get the scores just by passing the name of field
This change introduces the `distance_result_field` parameter to `similarity_search_with_score` in `FirestoreVectorStore`. Users can now specify a custom field name for the distance score returned in the results, defaulting to "distance". This provides flexibility and helps avoid potential field name conflicts when results are processed. The internal `_similarity_search` method was updated to accept and use this custom field name. A new test case verifies this functionality.
| query: List[float], | ||
| k: int = DEFAULT_TOP_K, | ||
| filters: Optional[BaseFilter] = None, | ||
| with_scores: Optional[bool] = False, |
There was a problem hiding this comment.
Thanks! In this case, do we still need with_scores? Since distance_result_field is optional, with_scores field can be implicitly done by distance_result_field.
(ex: if distance_result_field is specified, return scores)
Remove the `with_scores` parameter from the `_similarity_search` method. The `distance_result_field` parameter now solely determines if distance scores should be calculated and returned by the underlying `find_nearest` call. This change simplifies the method's API and internal logic by making the provision of `distance_result_field` the explicit way to request scores. The `similarity_search_with_score` method has been updated accordingly.
|
Hi @pcostell ! 👋 Hope you're well. I’m circling back to this PR after a while. I've verified that the changes are still working correctly. Please let me know if you need any other updates from me to get this merged. I also have some Async support features I'd love to contribute, but since the repo has been a little inactive recently, I wanted to make sure you are still reviewing new contributions before I open a new PR. Thanks! |
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
Fixes #<issue_number_goes_here> 🦕
I was not the one submitting the issue however I ran into the same problem and figured out I could try and fix it
#87