FAISS indexes can now be created and used for evaluation with the BEIR repository. We have added support for Flat-IP, HNSW, PQ, PCAMatrix, and BinaryFlat indexes.
FAISS indexes use various compression algorithms that reduce index memory size or improve retrieval speed.
You can also save your corpus embeddings as a FAISS index, which was not possible with the original exact search.
Check out how to evaluate dense retrieval using a FAISS index [here] and dimensionality reduction using PCA [here].
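As a rough sketch of what these index types look like when built directly with the faiss library (the random embeddings, dimensions, and file path below are placeholders, not the exact code linked above):

```python
import faiss
import numpy as np

# Placeholder corpus embeddings: 10k vectors of dimension 768.
corpus_emb = np.random.rand(10000, 768).astype("float32")
dim = corpus_emb.shape[1]

# Flat inner-product index: exact search, no compression.
flat_index = faiss.IndexFlatIP(dim)
flat_index.add(corpus_emb)

# HNSW index: graph-based approximate search for faster retrieval.
hnsw_index = faiss.IndexHNSWFlat(dim, 32)  # 32 neighbours per node
hnsw_index.add(corpus_emb)

# Product Quantization (PQ): compresses vectors to reduce memory.
pq_index = faiss.IndexPQ(dim, 96, 8)  # 96 sub-vectors, 8 bits each
pq_index.train(corpus_emb)
pq_index.add(corpus_emb)

# Save an index to disk and reload it later for retrieval.
faiss.write_index(flat_index, "corpus_flat_ip.faiss")
reloaded = faiss.read_index("corpus_flat_ip.faiss")
scores, ids = reloaded.search(corpus_emb[:5], 10)  # top-10 per query
```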
Multilingual Datasets and Evaluation
Thanks to @julian-risch, we have added our first multilingual dataset to the BEIR repository: GermanQuAD (a German SQuAD-style dataset).
We have updated the Elasticsearch integration to allow evaluation on languages other than English, check it out [here].
We have also added a DPR model class that lets you load DPR models from the Hugging Face model hub; you can now use this class for evaluation, for example with the GermanDPR model [link].
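For a rough idea of what such a DPR model wraps, the question and context encoders can be loaded directly with Hugging Face transformers. The English checkpoints and the toy scoring step below are purely illustrative; BEIR's DPR class handles this for you:

```python
import torch
from transformers import (
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
    DPRContextEncoder, DPRContextEncoderTokenizer,
)

# English DPR checkpoints used here for illustration; swap in the
# GermanDPR question/context encoders for German evaluation.
q_name = "facebook/dpr-question_encoder-single-nq-base"
ctx_name = "facebook/dpr-ctx_encoder-single-nq-base"

q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(q_name)
q_encoder = DPRQuestionEncoder.from_pretrained(q_name)
ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained(ctx_name)
ctx_encoder = DPRContextEncoder.from_pretrained(ctx_name)

# Encode a query and a passage, then score them with a dot product.
q_inputs = q_tokenizer("who wrote faust?", return_tensors="pt")
ctx_inputs = ctx_tokenizer("Faust is a play by Johann Wolfgang von Goethe.",
                           return_tensors="pt")
with torch.no_grad():
    q_emb = q_encoder(**q_inputs).pooler_output        # shape (1, 768)
    ctx_emb = ctx_encoder(**ctx_inputs).pooler_output  # shape (1, 768)
score = (q_emb @ ctx_emb.T).item()
print(f"dot-product score: {score:.4f}")
```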
DeepCT evaluation
We have ported the original DeepCT code to run on TensorFlow (tf) >= 2.0 and now host the updated repository [here].
Using the hosted code, we can now evaluate DeepCT in BEIR with Anserini retrieval, check [here].
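Conceptually, once DeepCT has rewritten the corpus with weighted terms, retrieval is plain BM25 over the resulting Anserini index. As a very rough illustration with Pyserini (the index path is a placeholder, and BEIR's actual integration goes through its Anserini-based evaluation rather than this snippet):

```python
from pyserini.search.lucene import LuceneSearcher

# Placeholder path to an Anserini/Lucene index built over DeepCT-weighted documents.
searcher = LuceneSearcher("indexes/deepct-weighted-index")
hits = searcher.search("what causes seasons to change", k=10)
for hit in hits:
    print(hit.docid, hit.score)
```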
Training Latest MSMARCO v3 Models
We have integrated the latest training code from the SentenceTransformers repository for MSMARCO with custom, manually provided hard negatives. This yields state-of-the-art SBERT models trained on MSMARCO, check [here].
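A minimal sketch of this training style with sentence-transformers, assuming (query, positive, hard negative) triples are already available; the base model, toy data, and hyperparameters below are placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Start from a pretrained transformer; real training would use MSMARCO
# triples with manually mined hard negatives instead of this toy data.
model = SentenceTransformer("distilbert-base-uncased")

train_samples = [
    InputExample(texts=[query, positive, hard_negative])
    for query, positive, hard_negative in [
        ("what is a cat", "A cat is a small domesticated feline.",
         "Dogs are domesticated descendants of wolves."),
        # ... more (query, positive, hard negative) triples
    ]
]

train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=32)

# MultipleNegativesRankingLoss uses the provided hard negative plus all
# other in-batch passages as negatives for each query.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=1000,
    output_path="output/msmarco-sbert-hard-negatives",
)
```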
Using Multiple-GPU for question-generation
A big challenge was to speed up question generation by using multiple GPUs. We have added process pools so that questions are now generated much faster, in parallel across multiple GPUs, check [here].
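The idea can be sketched roughly with torch.multiprocessing, where each GPU runs a worker over its own shard of the corpus. The generate_for_shard helper, checkpoint name, and generation settings below are illustrative and not BEIR's actual API:

```python
import torch
import torch.multiprocessing as mp
from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL_NAME = "BeIR/query-gen-msmarco-t5-base-v1"  # illustrative checkpoint

def generate_for_shard(rank, shards, ques_per_passage, return_dict):
    """Worker: generate questions for one corpus shard on one GPU."""
    device = f"cuda:{rank}"
    tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME).to(device)
    questions = []
    for passage in shards[rank]:
        inputs = tokenizer(passage, truncation=True, max_length=512,
                           return_tensors="pt").to(device)
        outputs = model.generate(**inputs, do_sample=True, top_p=0.95,
                                 num_return_sequences=ques_per_passage,
                                 max_length=64)
        questions.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return_dict[rank] = questions

if __name__ == "__main__":
    corpus = ["passage one ...", "passage two ...", "passage three ..."]
    num_gpus = torch.cuda.device_count()
    shards = [corpus[i::num_gpus] for i in range(num_gpus)]  # round-robin split

    manager = mp.Manager()
    return_dict = manager.dict()
    # Spawn one process per GPU; each writes its questions into return_dict.
    mp.spawn(generate_for_shard, args=(shards, 3, return_dict), nprocs=num_gpus)
    all_questions = [q for rank in range(num_gpus) for q in return_dict[rank]]
```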
Integration of Binary Passage Retrievers (BPR)
BPR (ACL'21, link) is now integrated into the BEIR benchmark. You can now easily train a state-of-the-art BPR model on MSMARCO using the loss function described in the original paper, check [here].
You can also easily evaluate BPR in a zero-shot fashion, check [here].
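For intuition, here is a heavily simplified sketch of the two-part BPR objective: a ranking loss on binary codes for candidate generation plus a cross-entropy re-ranking loss using the continuous query embeddings. It uses a straight-through sign estimator as a simpler stand-in for the paper's scaled-tanh annealing and omits in-batch negatives:

```python
import torch
import torch.nn.functional as F

class StraightThroughHash(torch.autograd.Function):
    """sign() in the forward pass, identity gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

def binary_hash(x):
    return StraightThroughHash.apply(x)

def bpr_loss(q_emb, pos_emb, neg_emb, margin=0.1):
    """Simplified two-part BPR objective.

    q_emb:   (B, D) continuous query embeddings
    pos_emb: (B, D) continuous embeddings of positive passages
    neg_emb: (B, D) continuous embeddings of hard-negative passages
    """
    # Hash embeddings to {-1, +1} codes.
    q_hash, pos_hash, neg_hash = map(binary_hash, (q_emb, pos_emb, neg_emb))

    # Candidate-generation loss: hinge ranking loss on binary-code scores.
    pos_bin_score = (q_hash * pos_hash).sum(-1)
    neg_bin_score = (q_hash * neg_hash).sum(-1)
    cand_loss = F.relu(margin - pos_bin_score + neg_bin_score).mean()

    # Re-ranking loss: cross-entropy on scores between the continuous
    # query embedding and the binary passage codes.
    pos_score = (q_emb * pos_hash).sum(-1, keepdim=True)
    neg_score = (q_emb * neg_hash).sum(-1, keepdim=True)
    logits = torch.cat([pos_score, neg_score], dim=-1)
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    rerank_loss = F.cross_entropy(logits, labels)

    return cand_loss + rerank_loss
```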
We will soon open-source public BPR models trained on MSMARCO.