retriever.retrieve(corpus, queries) with len(queries)==1 errors #170

Open
manestay opened this issue Apr 17, 2024 · 1 comment


manestay commented Apr 17, 2024

I have a somewhat silly use case. I'm running retriever.retrieve with a queries dict that has only one entry. This raises an IndexError in PyTorch, because the code indexes cos_scores[1] (the second row of the 2D score tensor), and that row does not exist when the first dimension has size 1:

  File "<dir>/script.py", line 83, in <module>
    results = retriever.retrieve(para_d, one_query_d)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<env>/beir/retrieval/evaluation.py", line 20, in retrieve
    return self.retriever.search(corpus, queries, self.top_k, self.score_function, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<env>/beir/retrieval/search/dense/exact_search.py", line 73, in search
    scores_dim = cos_scores[1]
                 ~~~~~~~~~~^^^
IndexError: index 1 is out of bounds for dimension 0 with size 1

I was able to work around this by changing the line below:

cos_scores_top_k_values, cos_scores_top_k_idx = torch.topk(cos_scores, min(top_k+1, len(cos_scores[1])), dim=1, largest=True, sorted=return_sorted)

In that line, I replaced cos_scores[1] with scores_dim, which is defined as:

scores_dim = cos_scores[1] if cos_scores.shape[0] != 1 else cos_scores[0]

This monkey patch works for me, but I'm curious whether there's a more appropriate way to retrieve with only one query. Thanks!
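
One possible simplification, offered as a sketch rather than the library's own fix: the number of documents is the size of the second dimension of cos_scores, so cos_scores.shape[1] gives the same count as len(cos_scores[1]) without indexing a row that may not exist. The helper name top_k_per_query and the sample scores are made up for illustration, and the sketch assumes cos_scores is a 2D tensor of shape (num_queries, num_docs).

```python
import torch


def top_k_per_query(cos_scores: torch.Tensor, top_k: int, return_sorted: bool = False):
    """Hypothetical helper: top-k over documents for each query row."""
    # Assumption: cos_scores has shape (num_queries, num_docs).
    # shape[1] is the number of documents regardless of how many queries there are,
    # so no special case is needed when num_queries == 1.
    num_docs = cos_scores.shape[1]
    k = min(top_k + 1, num_docs)
    return torch.topk(cos_scores, k, dim=1, largest=True, sorted=return_sorted)


# Illustrative scores for a single query against three documents.
one_query_scores = torch.tensor([[0.1, 0.9, 0.4]])
values, idx = top_k_per_query(one_query_scores, top_k=1, return_sorted=True)
```

With more than one query the helper computes the same k as the original line, since every row of a 2D tensor has the same length.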

@eyloncaplan

As another dirty fix, I just added the same query twice (with the same relevant documents) so it does not complain about only one query.
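
The duplicate-query workaround could be wrapped roughly as follows. This is only a sketch: retrieve_single and the "__pad__" ID are hypothetical names, and it assumes that the retrieve callable (for example retriever.retrieve) takes (corpus, queries) and returns a dict keyed by query ID. It pads the queries dict with a copy of the query under a placeholder ID and drops that entry from the results afterwards. It does not touch any relevance judgments used for evaluation.

```python
from typing import Any, Callable, Dict


def retrieve_single(
    retrieve_fn: Callable[[Dict[str, Any], Dict[str, str]], Dict[str, Dict[str, float]]],
    corpus: Dict[str, Any],
    query_id: str,
    query_text: str,
    pad_id: str = "__pad__",  # hypothetical placeholder ID
) -> Dict[str, Dict[str, float]]:
    """Hypothetical wrapper: run one query by padding it with a duplicate."""
    if pad_id == query_id:
        raise ValueError("pad_id must differ from query_id")
    # Two entries, so the retriever never sees a single-query batch.
    queries = {query_id: query_text, pad_id: query_text}
    results = retrieve_fn(corpus, queries)
    # Drop the placeholder so callers only see the real query.
    results.pop(pad_id, None)
    return results
```

A call would look like retrieve_single(retriever.retrieve, para_d, "q1", "some query text"), where "q1" and the query text are placeholders.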
