[Backport 2.x] Introduce a loading layer in NMSLIB. (#2185) #2211
Closed
0ctopus13prime wants to merge 431 commits into opensearch-project:2.x from 0ctopus13prime:backport-2185-to-2.x
This updates the CI system to use jdk-21, which is the latest supported LTS version. Coming from opensearch-project/OpenSearch#10334 Signed-off-by: John Mazanec <jmazane@amazon.com>
* Update bwc workflow to include 2.12.0-SNAPSHOT Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Update rolling upgrade version Signed-off-by: Ryan Bogan <rbogan@amazon.com> --------- Signed-off-by: Ryan Bogan <rbogan@amazon.com>
…t#1305) Signed-off-by: John Mazanec <jmazane@amazon.com>
…guide (opensearch-project#1302) * Add more description about running OpenSearch on MAC M1 to developer guide Signed-off-by: gaobinlong <gbinlong@amazon.com> * Change some wording Signed-off-by: gaobinlong <gbinlong@amazon.com> --------- Signed-off-by: gaobinlong <gbinlong@amazon.com>
Signed-off-by: Heemin Kim <heemin@amazon.com>
* Update CVE-affected dependency versions Signed-off-by: Daniel Widdis <widdis@gmail.com> * Change log Signed-off-by: Daniel Widdis <widdis@gmail.com> --------- Signed-off-by: Daniel Widdis <widdis@gmail.com>
…earch-project#1323) Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
…s crash or leave cluster (opensearch-project#1317) * Initial implementation Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix compile errors for tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Temporary tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Ensure backwards compatibility and add zombie to model state enum Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Update current tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix current integration tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix unit tests with new changes Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add unit tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix spotless Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add changelog entry Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Delete temporary test file Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove temporary changes to build.gradle Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add more backwards compatibility Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Attempt to fix bwc tests Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix spotless Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove star imports Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add another unit test Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Modify unit test to increase coverage Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Change unit test to increase coverage Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add method description for clusterChanged Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Refactor into TrainingJobClusterStateListener Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Make node assignment final and added in the constructor of TrainingJob Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove clusterService from TrainingJobRunner Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan 
<rbogan@amazon.com> * Add flag when node rejoins and check when serializing model Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Fix spotless Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Test new version check for StreamInput Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove check to test new method Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add version check for stream input/output logic Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Address PR Feedback Signed-off-by: Ryan Bogan <rbogan@amazon.com> --------- Signed-off-by: Ryan Bogan <rbogan@amazon.com>
* Increase Lucene max dimension limit to 16,000 Signed-off-by: Junqiu Lei <junqiu@amazon.com>
…exing and search performance (opensearch-project#1353) Signed-off-by: Navneet Verma <navneev@amazon.com>
…#1307) Changes how security tests are executed. Instead of setting up docker container with security enabled, we now can directly spin up a gradle local cluster with security which we can use to run tests against. To enable this option, we just have to pass `-Dsecurity.enabled=true` as a flag. Along with this, some refactoring was done for the ODFERestTestCase for configuring the client and cleaning up. Signed-off-by: John Mazanec <jmazane@amazon.com>
* Fix flaky bwc tests Signed-off-by: Ryan Bogan <rbogan@amazon.com>
Fix script score queries not getting cached (opensearch-project#1367) Signed-off-by: Junqiu Lei <junqiu@amazon.com>
Recently, we have seen that TrainingJobRouteDecisionInfoTransportActionTests has been failing on Windows. The failures are related to an uninitialized cluster state. This does not have anything to do with the test itself. Most likely, it is the result of state dependence that happens with KNNSingleNodeTestCase. This change refactors the class to use mocks and a lighter weight base class, KNNTestCase. Signed-off-by: John Mazanec <jmazane@amazon.com>
…nsearch-project#1372) Signed-off-by: Navneet Verma <navneev@amazon.com>
* Throw proper exception to invalid k-NN query Signed-off-by: Junqiu Lei <junqiu@amazon.com> * Move PR to enhancement in CHANGELOG.md Signed-off-by: Junqiu Lei <junqiu@amazon.com> * Resolve PR feedback Signed-off-by: Junqiu Lei <junqiu@amazon.com> * Resolve PR feedback Signed-off-by: Junqiu Lei <junqiu@amazon.com> * Revert IT tests Signed-off-by: Junqiu Lei <junqiu@amazon.com> --------- Signed-off-by: Junqiu Lei <junqiu@amazon.com>
* Add Lucene Codec 9.9 Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Fix import statements for Lucene95 Codec Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Fix SegmentInfo Constructor in Test Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Temporarily Ignore Old Codec Tests Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add CHANGELOG Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Delete Old Codec Tests Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
* Add patch to support multi vector in faiss (opensearch-project#1358) Signed-off-by: Heemin Kim <heemin@amazon.com> * Initialize id_map as null (opensearch-project#1363) Signed-off-by: Heemin Kim <heemin@amazon.com> * Add support of multi vector in jni (opensearch-project#1364) Signed-off-by: Heemin Kim <heemin@amazon.com> * Multi vector support for Faiss HNSW (opensearch-project#1371) Apply the parentId filter to the Faiss HNSW search method. This ensures that documents are deduplicated based on their parentId, and the method returns k results for documents with nested fields. Signed-off-by: Heemin Kim <heemin@amazon.com> * Add data generation script for nested field (opensearch-project#1388) Signed-off-by: Heemin Kim <heemin@amazon.com> * Add perf test for nested field (opensearch-project#1394) Signed-off-by: Heemin Kim <heemin@amazon.com> --------- Signed-off-by: Heemin Kim <heemin@amazon.com>
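The parentId deduplication described in the multi-vector commits above can be sketched as follows. This is a minimal Python illustration, not the plugin's code: the function name `top_k_parents`, the `(doc_id, score)` hit format, and the `parent_of` mapping are all hypothetical. The actual change applies the filter inside the Faiss HNSW search so that k distinct parents come back; a post-hoc pass like this one can return fewer than k if the candidate list is too short.

```python
def top_k_parents(hits, parent_of, k):
    """Keep the best-scoring nested vector per parent, then return the top k parents.

    hits      -- iterable of (doc_id, score) pairs for nested (child) vectors
    parent_of -- hypothetical mapping from child doc_id to its parent id
    k         -- number of distinct parents to return
    """
    best = {}  # parent id -> (doc_id, score) of its best child so far
    for doc_id, score in hits:
        parent = parent_of[doc_id]
        if parent not in best or score > best[parent][1]:
            best[parent] = (doc_id, score)
    # Rank parents by the score of their best child, highest first.
    ranked = sorted(best.items(), key=lambda item: item[1][1], reverse=True)
    return [(parent, doc_id, score) for parent, (doc_id, score) in ranked[:k]]
```

Without the deduplication step, two child vectors of the same parent could occupy two of the k slots, so a query for k documents would return fewer than k distinct documents.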
Signed-off-by: Heemin Kim <heemin@amazon.com>
Signed-off-by: Heemin Kim <heemin@amazon.com>
* apply boost Signed-off-by: panguixin <panguixin@bytedance.com> * add change log Signed-off-by: panguixin <panguixin@bytedance.com> --------- Signed-off-by: panguixin <panguixin@bytedance.com>
…nsearch-project#1397) Signed-off-by: panguixin <panguixin@bytedance.com>
…arch-project#1415) * Remove default admin credentials Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Update developer guide Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Debug Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Revert build.gradle changes Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Update developer guide Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove default password in favor of <admin-password> Signed-off-by: Ryan Bogan <rbogan@amazon.com> --------- Signed-off-by: Ryan Bogan <rbogan@amazon.com>
Signed-off-by: Ryan Bogan <rbogan@amazon.com>
Refactors integration tests that directly access the model system index. End users should not be directly accessing the model system index. It is supposed to be an implementation detail. We have written RESTful integration tests that directly access the model system index in order to initialize the cluster state. However, we should not do this because users should not be able to interact with it through RESTful APIs. That being said, some of this implementation detail leaks out into the interface. For instance, in k-NN stats we have a stat that is the model system index status. So, in order to test this, we do need direct access to the system index. Similarly, for search, we execute the search against the system index and directly return the results. This is probably a bug, but we still need to test it. Signed-off-by: John Mazanec <jmazane@amazon.com>
* Fix flaky model tests in k-NN Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Remove * imports Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Minor change Signed-off-by: Ryan Bogan <rbogan@amazon.com> * Add changelog entry Signed-off-by: Ryan Bogan <rbogan@amazon.com> --------- Signed-off-by: Ryan Bogan <rbogan@amazon.com>
…ct#1431) Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.8.6 to 3.9.2. - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](aio-libs/aiohttp@v3.8.6...v3.9.2) --- updated-dependencies: - dependency-name: aiohttp dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…nsearch-project#2167) Refactored if/else to reduce nesting. Added a unit test for the case where one of the fields doesn't have live docs. Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
…pensearch-project#2165) * Fix Faiss efficient filter exact search using byte vector datatype Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Address Review Comments Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
…sampling Factor (opensearch-project#2172) Signed-off-by: VIKASH TIWARI <viktari@amazon.com>
…ct#2139) * Introducing a loading layer in FAISS native engine. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Update change log. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Added unit tests for Faiss stream support. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Fix a bug that passed a KB-size integer value as a byte-size integer parameter. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Fix a casting bug when it tries to load an index file larger than 4G. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Added unit tests for new methods in JNIService. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Fix formatting and removed nmslib_stream_support. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Removing redundant exception message in JNIService.loadIndex. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Fix a flaky test - testIndexAllocation_closeBlocking Signed-off-by: Dooyong Kim <kdooyong@amazon.com> --------- Signed-off-by: Dooyong Kim <kdooyong@amazon.com> Signed-off-by: Doo Yong Kim <0ctopus13prime@gmail.com> Co-authored-by: Dooyong Kim <kdooyong@amazon.com>
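The two size-related fixes listed in the FAISS loading-layer commits above (a KB value passed where bytes were expected, and a cast that broke on index files larger than 4G) belong to a familiar class of bug. The sketch below only illustrates that class in Python; `to_int32` and `kb_to_bytes` are hypothetical helpers, and nothing here is taken from the actual JNI code.

```python
def to_int32(value):
    """Simulate narrowing an integer to a signed 32-bit value (as a C-style cast would)."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    return value - (1 << 32) if value >= (1 << 31) else value


def kb_to_bytes(size_kb):
    """Convert a size in kilobytes to bytes before handing it to a byte-size parameter."""
    return size_kb * 1024


GIB = 1024 ** 3
# A 5 GiB file size silently shrinks to 1 GiB once narrowed to 32 bits,
# and a 3 GiB size turns negative.
truncated_5g = to_int32(5 * GIB)
truncated_3g = to_int32(3 * GIB)
```

Keeping sizes in a 64-bit type end to end, and converting units explicitly at the boundary, avoids both problems.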
…'. (opensearch-project#2181) Signed-off-by: Dooyong Kim <kdooyong@amazon.com> Co-authored-by: Dooyong Kim <kdooyong@amazon.com>
…se of shard level rescoring is disabled for oversampling factor (opensearch-project#2183) Signed-off-by: VIKASH TIWARI <viktari@amazon.com>
…ject#2160) Signed-off-by: Naveen Tatikonda <navtat@amazon.com> (cherry picked from commit fbec0aa) Co-authored-by: Naveen Tatikonda <navtat@amazon.com>
…ect#2195) Signed-off-by: Navneet Verma <navneev@amazon.com>
Signed-off-by: John Mazanec <jmazane@amazon.com>
…t search when there are no engine files (opensearch-project#2188) * Introduce new setting to configure when to build graph during segment creation (opensearch-project#2007) Added a new updatable index setting, "build_vector_data_structure_threshold", which is considered when deciding whether to build the graph for native engines. This is a no-op for Lucene. It depends on the Lucene format being used as a prerequisite. We don't need to add a flag since it is only enabled if the Lucene format is already enabled. Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com> * Add integration test for binary vector values (opensearch-project#2142) Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com> * Allow build graph greedily for quantization scenarios (opensearch-project#2175) Previously, we only added support to build greedily for the non-quantization scenario. In this commit, we remove that constraint; however, we cannot skip writing the quantization state, since it is required irrespective of the type of search executed later. Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com> * Add exact search if no native engine files are available (opensearch-project#2136) * Add exact search if no engine files are in segments When the graph is not available, the plugin returns empty results. With this change, exact search is performed when no engine file is available in the segment. We also don't need a version check or feature flag because the option to not build the vector data structure is only available post 2.17. If an index is created using a pre-2.17 version, the segment will always have engine files and this feature will never be called during search. --------- Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com> * Add support for radial search in exact search (opensearch-project#2174) * Add support for radial search in exact search When a threshold value is set, the k-NN plugin will not create the graph. 
Hence, when a search request is triggered during that time, exact search will return valid results. However, radial search was never included as part of exact search. This breaks radial search when a threshold is added and radial search is requested. In this commit, a new method is introduced to accept a min score and return documents that are greater than the min score, similar to how radial search is performed by native engines. This search is independent of the engine, but radial search is supported only for the FAISS engine out of all native engines. Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com> --------- Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
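The exact-search fallback and its radial variant described above can be sketched as a brute-force scan. This is a minimal Python illustration under stated assumptions: the function names, the dict-of-vectors input, and the `1 / (1 + squared L2 distance)` scoring are chosen for the example and are not taken from the plugin's code. The strict `>` comparison follows the commit's wording ("greater than min score"); whether the real boundary is inclusive is not stated in the source.

```python
def l2_score(query, vector):
    """Score a vector against the query; higher is closer (assumed scoring for this sketch)."""
    dist_sq = sum((q - v) ** 2 for q, v in zip(query, vector))
    return 1.0 / (1.0 + dist_sq)


def exact_top_k(query, vectors, k):
    """Brute-force k-NN: score every vector and keep the k best."""
    scored = [(doc_id, l2_score(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def exact_radial(query, vectors, min_score):
    """Brute-force radial search: keep every vector scoring above min_score."""
    scored = [(doc_id, l2_score(query, vec)) for doc_id, vec in vectors.items()]
    kept = [pair for pair in scored if pair[1] > min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept
```

Because neither function needs a graph, both work on segments where no engine file was written, which is the situation the threshold setting creates.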
* Bump Faiss commit from 33c0ba5 to 4eecd91 Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update Faiss patches after commit bump Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Navneet Verma <navneev@amazon.com>
* Passing correct score mode in NativeEngineKNNVectorQuery * Ensuring visitor is called in KnnQuery Signed-off-by: Navneet Verma <navneev@amazon.com>
* Introduce a loading layer in NMSLIB. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Added NMSLIB istream implementation. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Fix integer overflow issue when passing read size for loading NMSLIB vector index. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Added unit test for NMSLIB loading layer. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Made a patch in NMSLIB to avoid frequently calling JNI for better loading index performance. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> * Compliance constexpr function in C++11 having nullstatement. Signed-off-by: Dooyong Kim <kdooyong@amazon.com> --------- Signed-off-by: Dooyong Kim <kdooyong@amazon.com> Co-authored-by: Dooyong Kim <kdooyong@amazon.com>
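The idea behind the patch that avoids calling JNI too often while loading can be sketched as a small read buffer in front of an expensive reader. This Python version is only an illustration: `BufferedSource`, `raw_read`, and the 64 KiB default are hypothetical names and values, and the real change lives in the NMSLIB C++ istream implementation and its JNI bridge.

```python
import io


class BufferedSource:
    """Serve small reads from a local buffer so the costly underlying read runs rarely."""

    def __init__(self, raw_read, buffer_size=64 * 1024):
        self._raw_read = raw_read        # expensive call, e.g. one that crosses a JNI boundary
        self._buffer_size = buffer_size  # bytes requested per underlying call
        self._buf = b""
        self._pos = 0
        self.raw_calls = 0               # counts how often the expensive call was made

    def read(self, n):
        """Return up to n bytes, refilling the buffer only when it is exhausted."""
        out = bytearray()
        while len(out) < n:
            if self._pos == len(self._buf):
                self._buf = self._raw_read(self._buffer_size)
                self._pos = 0
                self.raw_calls += 1
                if not self._buf:        # underlying source is at end of data
                    break
            take = min(n - len(out), len(self._buf) - self._pos)
            out += self._buf[self._pos:self._pos + take]
            self._pos += take
        return bytes(out)


# Usage: many 4-byte reads over 256 KiB trigger only four underlying reads.
data = bytes(range(256)) * 1024
source = BufferedSource(io.BytesIO(data).read, buffer_size=64 * 1024)
chunks = [source.read(4) for _ in range(len(data) // 4)]
```

Each underlying request is bounded by the buffer size, so the size passed across the boundary always fits in a 32-bit integer regardless of how large the index file is.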
0ctopus13prime requested review from heemin32, navneet1v, VijayanB, vamshin, jmazanec15, naveentatikonda, junqiu-lei, martin-gaievski, ryanbogan and luyuncheng as code owners October 15, 2024 23:41
0ctopus13prime changed the title from "Introduce a loading layer in NMSLIB. (#2185)" to "[Backport 2.x] Introduce a loading layer in NMSLIB. (#2185)" Oct 15, 2024
…NMSLIB. Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
0ctopus13prime force-pushed the backport-2185-to-2.x branch from 1c2615e to ba74244 October 18, 2024 03:08
Backport 7cf45c8 from #2185