This repository contains experiments with Milvus Full Text Search (FTS) and Hybrid Search. It compares dense search, sparse search, and hybrid search for retrieving relevant data.
Deploy Milvus 2.5
Install PyMilvus with the model library.
pip install "pymilvus[model]" -U
Get a Voyage AI API Key.
Before running any experiments, ensure that the required environment variables and collections are properly set up.
Set VOYAGE_API as an environment variable for authentication.
export VOYAGE_API="your_api_key_here"
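If a script fails with an unclear authentication error, a quick check that the variable is actually visible to Python can help. The following is a minimal sketch, not part of the repository: the helper name require_env is hypothetical, and it only assumes that the scripts read the key from VOYAGE_API as described above.

```python
import os


def require_env(name: str = "VOYAGE_API") -> str:
    """Return the value of an environment variable, or fail with a clear message.

    `require_env` is a hypothetical helper for illustration; the repository's
    scripts may read the key differently.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            f'run export {name}="your_api_key_here" first.'
        )
    return value
```

Calling require_env() at the top of a script would surface a missing key before any request is sent.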
The experiments require data to be inserted into Milvus. The following script inserts data into the collection milvus_standard:
python milvus_hybrid_index_client.py
To compare Hybrid Search and Dense Search, run the following script:
python hybrid_compare_search_dense_client.py
This will output results comparing the performance and retrieval quality of hybrid search against dense vector search.
Similarly, to compare Hybrid Search and Sparse Search, run the following script:
python hybrid_compare_search_sparse_client.py
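To make the comparison easier to reason about, here is a minimal, library-free sketch of one common way a hybrid search can merge a dense ranking and a sparse ranking: Reciprocal Rank Fusion (RRF). This is an illustration under assumptions, not the method confirmed to be used by the scripts above. Milvus provides an RRF-based reranker among its options, but which reranker these scripts use is not stated here. The function name rrf_fuse and the constant k=60 are choices made for this example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. Hypothetical helper for illustration only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))
```

For example, rrf_fuse([["a", "b", "c"], ["b", "c", "d"]]) places "b" and "c" first, because both rankings contain them, ahead of documents that only one ranking returned.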
To insert data into Milvus with modified stopwords, use the script below. This will insert data into the collection milvus_rectify:
python milvus_hybrid_index_client_rectify.py
After inserting data with modified stopwords, you need to modify the collection name to milvus_rectify in the following scripts:
hybrid_compare_search_dense_client.py
hybrid_compare_search_sparse_client.py
In each script, manually change the collection name from milvus_standard to milvus_rectify.
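If editing both files by hand is inconvenient, the rename can be scripted. The sketch below is an assumption-laden convenience, not part of the repository: it assumes the collection name appears in the scripts as the literal text milvus_standard, and the helper name switch_collection is hypothetical. It rewrites the files in place, so review the change (for example with git diff) afterwards.

```python
from pathlib import Path


def switch_collection(paths, old="milvus_standard", new="milvus_rectify"):
    """Replace every occurrence of `old` with `new` in each file.

    Returns a dict mapping each given path to the number of replacements.
    Hypothetical helper; assumes the name occurs as a plain string literal.
    """
    counts = {}
    for raw in paths:
        path = Path(raw)
        text = path.read_text(encoding="utf-8")
        counts[str(raw)] = text.count(old)
        if counts[str(raw)]:
            path.write_text(text.replace(old, new), encoding="utf-8")
    return counts
```

A call such as switch_collection(["hybrid_compare_search_dense_client.py", "hybrid_compare_search_sparse_client.py"]) would update both comparison scripts. A count of 0 for a file means the assumed literal was not found and the file was left unchanged.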
After updating the collection name, execute the following scripts again to observe the effect of modified stopwords:
Hybrid vs Dense Comparison:
python hybrid_compare_search_dense_client.py
Hybrid vs Sparse Comparison:
python hybrid_compare_search_sparse_client.py
After running the comparisons, analyze the results to see how the different search types (Hybrid, Dense, Sparse) perform, especially when modified stopwords are introduced.
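One simple way to quantify how much two search types agree is the overlap of their top-k results. The sketch below is a hypothetical helper, not something the scripts produce: it assumes you have collected, for the same query, an ordered list of result IDs from each search type. The output format of the scripts is not specified here, so extracting those IDs is left to you.

```python
def overlap_at_k(results_a, results_b, k=10):
    """Jaccard overlap of the top-k IDs from two ranked result lists.

    Returns a value in [0, 1]: 1.0 means the same top-k set (order ignored),
    0.0 means no shared results. Hypothetical helper for illustration.
    """
    top_a, top_b = set(results_a[:k]), set(results_b[:k])
    union = top_a | top_b
    if not union:
        return 0.0  # both lists empty: define the overlap as 0
    return len(top_a & top_b) / len(union)
```

Comparing this value for milvus_standard and milvus_rectify on the same queries gives a rough indication of how much the modified stopwords change what each search type retrieves. It does not measure relevance by itself.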