Conversation

mccullocht
Contributor

@mccullocht mccullocht commented Sep 29, 2025

Partial implementation of #15155

So far this is no faster than the baseline. On an AMD RYZEN AI MAX+ 395:

baseline:
Results:
recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.913        1.635   1.630        0.997  1000000   100     100       32        250     8 bits     6824      0.00      Infinity            0.04             1         3759.67      3677.368      747.681       HNSW

candidate:
Results:
recall  latency(ms)  netCPU  avgCpuCount     nDoc  topK  fanout  maxConn  beamWidth  quantized  visited  index(s)  index_docs/s  force_merge(s)  num_segments  index_size(MB)  vec_disk(MB)  vec_RAM(MB)  indexType
 0.913        1.671   1.661        0.994  1000000   100     100       32        250     8 bits     6824      0.00      Infinity            0.04             1         3759.67      3677.368      747.681       HNSW

DO NOT MERGE
Performance observations: on an AVX-512 host the profiles are quite different. The original path spends most of its time in dotProductBody512, followed by Int512Vector.reduceLanes(). The new path spends much more time in reduceLanes(), and also spends more time loading the input vectors -- a 128-bit load from a MemorySegment instead of from a heap array. This could be memory latency, but in that case why doesn't the load into the heap array show up in the profile?
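
For context, a minimal Panama Vector API sketch of the two load paths being discussed (this is not the PR's code; the species choices and the byte-to-int widening are simplified assumptions):

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import jdk.incubator.vector.ByteVector;
import jdk.incubator.vector.IntVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

// Simplified sketch: the arithmetic is identical, only the source of the second vector differs.
final class DotProductLoadSketch {
  private static final VectorSpecies<Byte> BYTE_128 = ByteVector.SPECIES_128;
  private static final VectorSpecies<Integer> INT_512 = IntVector.SPECIES_512;

  // Baseline path: both operands are 128-bit loads from heap byte[] arrays.
  static int dotOnHeap(byte[] a, byte[] b) {
    IntVector acc = IntVector.zero(INT_512);
    int i = 0;
    for (; i <= a.length - BYTE_128.length(); i += BYTE_128.length()) {
      ByteVector va = ByteVector.fromArray(BYTE_128, a, i);
      ByteVector vb = ByteVector.fromArray(BYTE_128, b, i);
      // widen 16 bytes to 16 ints, then multiply-accumulate
      IntVector ia = (IntVector) va.convertShape(VectorOperators.B2I, INT_512, 0);
      IntVector ib = (IntVector) vb.convertShape(VectorOperators.B2I, INT_512, 0);
      acc = acc.add(ia.mul(ib));
    }
    int sum = acc.reduceLanes(VectorOperators.ADD); // hot in both profiles
    for (; i < a.length; i++) sum += a[i] * b[i];
    return sum;
  }

  // Candidate path: the document vector is read directly from a MemorySegment (mmapped data)
  // instead of being copied into a heap array first.
  static int dotOffHeap(byte[] a, MemorySegment b, long bOffset) {
    IntVector acc = IntVector.zero(INT_512);
    int i = 0;
    for (; i <= a.length - BYTE_128.length(); i += BYTE_128.length()) {
      ByteVector va = ByteVector.fromArray(BYTE_128, a, i);
      ByteVector vb = ByteVector.fromMemorySegment(BYTE_128, b, bOffset + i, ByteOrder.LITTLE_ENDIAN);
      IntVector ia = (IntVector) va.convertShape(VectorOperators.B2I, INT_512, 0);
      IntVector ib = (IntVector) vb.convertShape(VectorOperators.B2I, INT_512, 0);
      acc = acc.add(ia.mul(ib));
    }
    int sum = acc.reduceLanes(VectorOperators.ADD);
    for (; i < a.length; i++) sum += a[i] * b.get(ValueLayout.JAVA_BYTE, bOffset + i);
    return sum;
  }
}
```

The final reduceLanes(VectorOperators.ADD) is the same in both variants; the only structural difference is whether the 128-bit loads come from a heap byte[] or from a MemorySegment.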

thecoop and others added 28 commits September 24, 2025 09:32
Slowly moving the legacy formats to the backwards codecs.

I have most of the logic moved, but there are additional things to figure out.

I don't think we can easily move the Lucene99ScalarQuantizedVectorScorer just yet, but we should be able to prevent users from using the old quantized formats.
…s for non-accountable query size (apache#15124)

* Use RamUsageEstimator to calculate query size instead of the default 1024 bytes for non-accountable queries

* Use try-with-resources for directory and indexWriter

* Adding changes to cache query size and queries per clause to reduce impact of repeated visit() calls during RamUsageEstimator.sizeOf() (see the sketch after this commit list)

* Adding changelog entry

* Making queries per clause list immutable

* Adding unit test to verify query size is cached correctly

* Renaming QueryMetadata to Record

* Changing query metadata to record type and removing boolean query changes
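
A hedged sketch of the caching change described in the commits above, assuming RamUsageEstimator.sizeOfObject is the estimator in play; the wrapper class and field names are illustrative, not the PR's actual classes:

```java
import org.apache.lucene.search.Query;
import org.apache.lucene.util.RamUsageEstimator;

/** Illustrative only: compute a non-accountable query's size once and reuse it. */
final class CachedQuerySize {
  private final Query query;
  private volatile long cachedBytes = -1; // -1 means "not computed yet"

  CachedQuerySize(Query query) {
    this.query = query;
  }

  long ramBytesUsed() {
    long bytes = cachedBytes;
    if (bytes == -1) {
      // estimate the real footprint instead of assuming a flat 1024 bytes
      bytes = RamUsageEstimator.sizeOfObject(query);
      cachedBytes = bytes;
    }
    return bytes;
  }
}
```
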
Fix some issues found by `actionlint`, `shellcheck`, and `zizmor -p`
More issues remain, this is just incremental progress.
Co-authored-by: Kaival Parikh <kaivalp2000@gmail.com>
…e array filter applied and never cache dictionaries from custom locations (apache#15237)

* Fix SmartChinese to only serialize data from classpath with a native array filter applied and never cache dictionaries from custom locations

* fix exception handling

* add CHANGES.txt

* fix exception handling

* fix typo in CHANGES.txt

* Restore the code to regenerate the serialized file

* Disallow any serialization in test-framework as early as possible (we can't do that via sysprops due to Gradle)

* Install a serialization filter that only allows Gradle's test runner config deserialization (see the sketch after this commit list)

* fix errorprone

* use the same logic as for the security manager to install the filter

* add new filter to CHANGES.txt

* Improve performance by only installing a filter if we're not called from Gradle

* simplify

* Add a test for the deserialization filter

* fix typo

* improve test
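
For reference on the filter commits above: a JVM-wide deserialization filter can be installed with the standard java.io.ObjectInputFilter API. The pattern below only illustrates the allow-list shape; it is not the exact filter the commit installs:

```java
import java.io.ObjectInputFilter;

public final class InstallFilterSketch {
  public static void main(String[] args) {
    // Illustrative allow-list: permit Gradle's classes and core JDK packages, reject the rest.
    ObjectInputFilter filter =
        ObjectInputFilter.Config.createFilter("org.gradle.**;java.**;!*");
    // A process-wide filter can only be set once; this throws if one is already installed.
    ObjectInputFilter.Config.setSerialFilter(filter);
  }
}
```
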
…pache#15242)

Add .github/actionlint.yaml and .github/workflows/actions.yml to enable
workflow validation with actionlint and security scanning with zizmor.

High-severity issues are addressed, but other issues remain. This is just an incremental step.
* ci: tune codeql for java to use default query pack

Set security-extended queries only for actions and python.

Security-extended for java contains too many noisy checks (e.g. every
single place an implicit narrowing cast happens from a compound
assignment).
Some of the security-extended checks were actually useful; the pack has only one extremely noisy rule, just like the default queries have one extremely noisy rule.

Disable both of the noisy rules via a configuration file instead.
@benwtrent
Member

@mccullocht maybe we only do the byte part of the comparisons off-heap, then apply the corrections all on heap? I would assume applying corrections is pretty cheap, and even if it isn't, applying them in bulk on heap might be fast enough.
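
That split might look something like the rough sketch below; the class, field, and method names are hypothetical, not existing Lucene APIs. The off-heap step does only the int8 comparisons, and the scalar-quantization corrections are applied on heap in a bulk pass:

```java
/** Hypothetical sketch of splitting byte comparisons (off-heap) from corrections (on-heap). */
abstract class SplitCorrectionScorer {
  final float scale;            // quantization scale factor (assumed)
  final float[] docCorrections; // per-document correction terms, kept on heap (assumed)

  SplitCorrectionScorer(float scale, float[] docCorrections) {
    this.scale = scale;
    this.docCorrections = docCorrections;
  }

  /** Off-heap part: only the raw int8 dot product against the memory-mapped vector data. */
  abstract int rawDotProductOffHeap(byte[] query, int docId);

  /** On-heap part: apply the corrections for a whole batch of candidates at once. */
  float[] scoreBulk(byte[] query, float queryCorrection, int[] docIds) {
    int[] raw = new int[docIds.length];
    for (int i = 0; i < docIds.length; i++) {
      raw[i] = rawDotProductOffHeap(query, docIds[i]);
    }
    float[] scores = new float[docIds.length];
    for (int i = 0; i < docIds.length; i++) {
      // simple scalar loop; cheap relative to the byte comparisons and easy for C2 to unroll
      scores[i] = raw[i] * scale + queryCorrection + docCorrections[docIds[i]];
    }
    return scores;
  }
}
```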
