Implement off-heap scoring for OSQ 4, 7, and 8 bit representations #15257
Draft: mccullocht wants to merge 38 commits into apache:main from mccullocht:osq-offheap-vector
+17,081 −15,954
Conversation
…2BinaryQuantizedVectorsFormat (apache#15222)
Slowly moving the legacy formats to the backwards codecs. I have most of the logic moved, but there are additional things to figure out. I don't think we can easily move the Lucene99ScalarQuantizedVectorScorer just yet, but we should be able to prevent users from using the old quantized formats.
…s for non-accountable query size (apache#15124)
* Use RamUsageEstimator to calculate query size instead of the default 1024 bytes for non-accountable queries
* Use try-with-resources for directory and indexWriter
* Cache query size and queries per clause to reduce the impact of repeated visit() calls during RamUsageEstimator.sizeOf()
* Add changelog entry
* Make the queries-per-clause list immutable
* Add unit test to verify query size is cached correctly
* Rename QueryMetadata to Record
* Change query metadata to a record type and remove the boolean query changes
Fix some issues found by `actionlint`, `shellcheck`, and `zizmor -p`. More issues remain; this is just incremental progress.
… cherry picked from the patch by @kaivalnp apache#15240 (apache#15241)
Co-authored-by: Kaival Parikh <kaivalp2000@gmail.com>
…e array filter applied and never cache dictionaries from custom locations (apache#15237)
* Fix SmartChinese to only serialize data from classpath with a native array filter applied and never cache dictionaries from custom locations
* fix exception handling
* add CHANGES.txt
* fix exception handling
* fix typo in CHANGES.txt
* Restore the code to regenerate the serialized file
* Disallow any serialization in test-framework as early as possible (we can't do that via sysprops due to Gradle)
* Install a serialization filter that only allows Gradle's test runner config deserialization
* fix errorprone
* use same logic like for security manager to install the filter
* add new filter to CHANGES.txt
* Improve performance by only installing a filter if we're not called from Gradle
* simplify
* Add a test for the deserialization filter
* fix typo
* improve test
…rce leak in existence check (apache#15248)
…pache#15242) Add .github/actionlint.yaml and .github/workflows/actions.yml to enable workflow validation with actionlint and security scanning with zizmor. High-severity issues are addressed but other issues remain; this is just an incremental step.
* ci: tune codeql for java to use default query pack. Set security-extended queries only for actions and python; security-extended for java contains too many noisy checks (e.g. every single place an implicit narrowing cast happens from a compound assignment).
Some of the security-extended checks were actually useful; it only has one extremely noisy rule, just as the default queries have one extremely noisy rule. Disable both of the noisy rules via a configuration file instead.
… non-specific method (apache#15187)
@mccullocht maybe we only do the byte part of the comparisons off-heap, then apply the corrections all on heap? I would assume applying corrections is pretty cheap, but even if it isn't, maybe applying them in bulk on heap would still be fast?
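A minimal sketch of that split, assuming hypothetical names and a placeholder correction formula (neither is Lucene's actual OSQ scorer API): the int8 dot products are computed directly against the off-heap `MemorySegment`, and the float corrections are then applied in a tight on-heap loop over the batch.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Sketch only: class/method names and the correction formula are illustrative,
// not the actual Lucene OSQ implementation.
final class SplitScorerSketch {

  /** Raw int8 dot products computed directly against the off-heap segment. */
  static void bulkByteDotProducts(byte[] query, MemorySegment docs, int dims, int[] docIds, int[] out) {
    for (int i = 0; i < docIds.length; i++) {
      long base = (long) docIds[i] * dims;
      int acc = 0;
      for (int d = 0; d < dims; d++) {
        acc += query[d] * docs.get(ValueLayout.JAVA_BYTE, base + d);
      }
      out[i] = acc;
    }
  }

  /** Corrections applied on heap in bulk; the linear form here is a placeholder. */
  static void applyCorrections(int[] rawDots, float[] docScale, float[] docBias,
                               float queryScale, float queryBias, float[] scores) {
    for (int i = 0; i < rawDots.length; i++) {
      scores[i] = rawDots[i] * docScale[i] * queryScale + docBias[i] + queryBias;
    }
  }
}
```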
Partial implementation of #15155
So far this is not any faster than the alternative on an AMD RYZEN AI MAX+ 395.
DO NOT MERGE
Performance observations: on an AVX-512 host the profiles are quite different. The original path spends most of its time in dotProductBody512, followed by Int512Vector.reduceLanes(). The new path spends much more time in reduceLanes(), but also spends more time loading the input vectors from a memory segment -- a 128-bit load from a MemorySegment instead of from a heap array. This could be memory latency, but in that case why doesn't the load into the heap array show up in the profile?
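For context, the two 128-bit load paths being compared look roughly like this (a minimal sketch using the incubator Vector API; the species and byte order are assumptions, not taken from this patch):

```java
import java.lang.foreign.MemorySegment;
import java.nio.ByteOrder;
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.VectorSpecies;

// Sketch of the two load paths discussed above; not the actual patch code.
// Requires --add-modules jdk.incubator.vector.
final class LoadPathSketch {
  static final VectorSpecies<Byte> SPECIES = ByteVector.SPECIES_128;

  // Original path: 128-bit load of the quantized vector from an on-heap byte[].
  static ByteVector loadFromHeap(byte[] vector, int offset) {
    return ByteVector.fromArray(SPECIES, vector, offset);
  }

  // New path: the same 128-bit load taken directly from the off-heap MemorySegment.
  static ByteVector loadOffHeap(MemorySegment segment, long offset) {
    return ByteVector.fromMemorySegment(SPECIES, segment, offset, ByteOrder.LITTLE_ENDIAN);
  }
}
```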